Optimization of Complex Systems: Theory, Models, Algorithms and Applications

Advances in Intelligent Systems and Computing 991

Hoai An Le Thi
Hoai Minh Le
Tao Pham Dinh Editors

Optimization of
Complex Systems:
Theory, Models,
Algorithms and
Applications
Advances in Intelligent Systems and Computing

Volume 991

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad
Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science & Electronic Engineering, University of
Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor,
Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El
Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung
University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of
Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of
Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de
Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław
University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese
University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
** Indexing: The books of this series are submitted to ISI Proceedings,
EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156


Hoai An Le Thi • Hoai Minh Le • Tao Pham Dinh

Editors

Optimization of Complex
Systems: Theory, Models,
Algorithms and Applications

Editors
Hoai An Le Thi
Computer Science and Applications Department
LGIPM, University of Lorraine
Metz Cedex 03, France

Hoai Minh Le
Computer Science and Applications Department
LGIPM, University of Lorraine
Metz Cedex 03, France

Tao Pham Dinh
Laboratory of Mathematics
National Institute for Applied Sciences (INSA) - Rouen Normandie
Saint-Étienne-du-Rouvray Cedex, France

ISSN 2194-5357 ISSN 2194-5365 (electronic)


Advances in Intelligent Systems and Computing
ISBN 978-3-030-21802-7 ISBN 978-3-030-21803-4 (eBook)
https://doi.org/10.1007/978-3-030-21803-4
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

WCGO 2019 was the sixth event in the series of World Congress on Global
Optimization conferences, and it took place on July 8–10, 2019 in Metz, France.
The conference aims to bring together most leading specialists in both theoretical
and algorithmic aspects as well as a variety of application domains of nonconvex
programming and global optimization to highlight recent advances, trends, chal-
lenges and discuss how to expand the role of these fields in several potential
high-impact application areas.
The WCGO conference series is a biennial conference of the International
Society of Global Optimization (iSoGO). The first event WCGO 2009 took place in
Hunan, China. The second event, WCGO 2011, was held in Chania, Greece, fol-
lowed by the third event, WCGO 2013, in Huangshan, China. The fourth event,
WCGO 2015, took place in Florida, USA, while the fifth event was held in Texas,
USA. One of the highlights of this biennial meeting is the announcement of the
Constantin Carathéodory Prize of iSoGO, awarded in recognition of lifetime
contributions to the field of global optimization.
WCGO 2019 was attended by about 180 scientists and practitioners from 40
countries. The scientific program included the oral presentation of 112 selected full
papers as well as several selected abstracts covering all main topic areas. In addi-
tion, the conference program was enriched by six plenary lectures that were given
by Prof. Aharon Ben-Tal (Israel Institute of Technology, Israel), Prof. Immanuel M.
Bomze (University of Vienna, Austria), Prof. Masao Fukushima (Nanzan
University, Japan), Prof. Anna Nagurney (University of Massachusetts Amherst,
USA), Prof. Panos M. Pardalos (University of Florida, USA), and Prof. Anatoly
Zhigljavsky (Cardiff University, UK).
This book contains 112 papers selected from about 250 submissions to WCGO
2019. Each paper was peer-reviewed by at least two members of the International
Program Committee and the International Reviewer Board. The book covers both
theoretical and algorithmic aspects of nonconvex programming and global opti-
mization, as well as its applications to modeling and solving decision problems in
various domains. The book is composed of ten parts, and each of them deals with
either the theory and/or methods in a branch of optimization such as continuous


optimization, DC programming and DCA, discrete optimization and network


optimization, multiobjective programming, optimization under uncertainty, or
models and optimization methods in a specific application area including data
science, economics and finance, energy and water management, engineering sys-
tems, transportation, logistics, resource allocation and production management. We
hope that the researchers and practitioners working in nonconvex optimization and
several application areas can find here many inspiring ideas and useful tools and
techniques for their works.
We would like to thank the chairs and the members of International Program
Committee as well as the reviewers for their hard work in the review process, which
helped us to guarantee the highest quality of the selected papers for the conference.
We cordially thank the organizers and chairs of special sessions for their contri-
butions to the success of the conference. Thanks are also due to the plenary lec-
turers for their interesting and informative talks of a world-class standard.
The conference was organized by the Computer Science and Applications
Department, LGIPM, University of Lorraine, France. We wish to especially thank
all members of the Organizing Committee for their excellent work to make the
conference a success. The conference would not have been possible without their
considerable effort.
We would like to express our sincere thanks to our main sponsors: Réseau de
Transport d’Électricité (France), Conseil régional du Grand Est (France), Metz
Métropole (France), Conseil départemental de la Moselle (France), Université de
Lorraine (France), Laboratoire de Génie Informatique, de Production et de
Maintenance (LGIPM) - Université de Lorraine, UFR Mathématique Informatique
Mécanique Automatique - Université de Lorraine, and DCA Solutions (Vietnam).
Our special thanks go to all the authors for their valuable contributions, and to
the other participants, who contributed to the success of the conference.
Finally, we cordially thank Springer for their help in publishing this book.

July 2019 Hoai An Le Thi


Hoai Minh Le
Tao Pham Dinh
Organization

WCGO 2019 was organized by the Computer Science and Applications Department,
LGIPM, University of Lorraine, France.

Conference Chair

Hoai An Le Thi University of Lorraine, France

Program Chairs

Hoai An Le Thi University of Lorraine, France


Tao Pham Dinh National Institute for Applied Sciences - Rouen
Normandie, France
Yaroslav D. Sergeyev University of Calabria, Italy

Publicity Chair

Hoai Minh Le University of Lorraine, France


International Program Committee Members

Paula Amaral University NOVA de Lisboa, Portugal


Adil Bagirov Federation University, Australia
Balabhaskar Balasundaram Oklahoma State University, USA
Paul I. Barton Massachusetts Institute of Technology, USA
Aharon Ben-Tal Technion - Israel Institute of Technology
Immanuel M. Bomze University of Vienna, Austria
Radu Ioan Bot University of Vienna, Austria
Sergiy Butenko Texas A&M University Engineering, USA
Stéphane Canu National Institute for Applied Sciences-Rouen, France
Emilio Carrizosa University de Seville, Spain
Leocadio-G. Casado University de Almería, Spain
Tibor Csendes University of Szeged, Hungary
Yu-Hong Dai Chinese Academy of Sciences, China
Gianni Di Pillo University Rome La Sapienza, Italy
Ding-zhu Du University of Texas at Dallas, USA
Matthias Ehrgott University of Auckland, Australia
Shu-Cherng Fang North Carolina State University, USA
José Fernández-Hernández University de Murcia, Spain
Dalila B. M. M. Fontes University of Porto, Portugal
Masao Fukushima Nanzan University, Japan
Vladimir Grishagin N.I. Lobachevsky State University of Nizhny Novgorod,
Russia
Ignacio E. Grossmann Carnegie Mellon University, USA
Yann Guermeur LORIA, France
Mounir Haddou National Institute for Applied Sciences-Rennes, France
Milan Hladík Charles University, Czech Republic
Joaquim Judice University Coimbra, Portugal
Oleg Khamisov Energy Systems Institute, Russian Academy of Sciences,
Irkutsk, Russia
Diethard Klatte University of Zurich, Switzerland
Pavlo Krokhmal University of Arizona, USA
Dmitri Kvasov University of Calabria, Italy
Carlile Lavor University of Campinas, Brazil
Dung Muu Le Institute of Mathematics, Hanoi, Vietnam
Hoai Minh Le University of Lorraine, France
Gue Myung Lee Pukyong National University, Korea

Jon Lee University of Michigan, USA


Vincent Lefieux RTE, France
Duan Li City University of Hong Kong, Hong Kong, China
Leo Liberti Ecole Polytechnique, France
Hsuan-Tien Lin National Taiwan University, Taiwan
Abdel Lisser Paris-Sud University, France
Angelo Lucia University of Rhode Island, USA
Stefano Lucidi University Roma “La Sapienza,” Italia
Andreas Lundell Abo Akademi University in Turku, Finland
Lina Mallozzi University of Naples Federico II, Italy
Pierre Maréchal University of Toulouse - Paul Sabatier, France
Kaisa Miettinen University of Jyvaskyla, Finland
Michel Minoux Sorbonne University, France
Shashi Kant Mishra Banaras Hindu University, India
Dolores Romero Morales Copenhagen Business School, Denmark
Ngoc Thanh Nguyen Wroclaw University of Science and Technology, Poland
Viet Hung Nguyen Sorbonne University, France
Yi-Shuai Niu Shanghai Jiao Tong University, China
Ivo Nowak Hamburg University of Applied Sciences, Germany
Jong-Shi Pang University of Southern California, USA
Panos Pardalos University of Florida, USA
Hoang Pham Rutgers University, USA
Janos D. Pinter Lehigh University, USA
Efstratios N. Pistikopoulos Texas A&M University, USA
Oleg Prokopyev University of Pittsburgh, USA
Stefan Ratschan Academy of Sciences of the Czech Republic, Czech
Republic
Steffen Rebennack Karlsruhe Institute of Technology, Germany
Franz Rendl University Klagenfurt, Austria
Ana Maria Rocha University of Minho, Braga, Portugal
Ruey-Lin Sheu National Cheng-Kung University, Taiwan
Jianming Shi Tokyo University of Science, Japan
Christine A. Shoemaker National University of Singapore
Eduardo Souza de Cursi National Institute for Applied Sciences - Rouen, France
Alexander Strekalovsky Russian Academy of Sciences, Irkutsk, Russia
Jie Sun Curtin University, Australia
Akiko Takeda University of Tokyo, Japan
Michael Ulbrich Technical University of Munich, Germany
Luis Nunes Vicente Lehigh University, USA
Stefan Vigerske Zuse Institute Berlin, Germany

Gerhard-Wilhelm Weber Poznan University of Technology, Poland
Yichao Wu University of Illinois at Chicago, USA
Jack Xin University of California, Irvine, USA
Fengqi You Cornell University, USA
Wuyi Yue Konan University, Japan
Ahmed Zidna University of Lorraine, France
Antanas Zilinskas Vilnius University, Lithuania

External Reviewers

Manuel Arana-Jimenez University of Cádiz, Spain
Abdessamad Amir University of Mostaganem, Algeria
Domingo Barrera University of Granada, Spain
Victor Blanco University of Granada, Spain
Miguel A. Fortes University of Granada, Spain
Olivier Gallay University of Lausanne, Switzerland
Pedro González-Rodelas University of Granada, Spain
Luo Hezhi Zhejiang University of Technology, China
Vinh Thanh Ho University of Lorraine, France
Baktagul Imasheva International University of Information Technology,
Kazakhstan
Amodeo Lionel University of Technology of Troyes, France
Aiman Moldagulova Al-Farabi Kazakh National University, Kazakhstan
Samat Mukhanov International University of Information Technology,
Kazakhstan
Canh Nam Nguyen Hanoi University of Science and Technology, Vietnam
Duc Manh Nguyen Hanoi National University of Education, Vietnam
Manh Cuong Nguyen Hanoi University of Industry, Vietnam
Thi Bich Thuy Nguyen VNU University of Science, Vietnam
Thi Minh Tam Nguyen Vietnam National University of Agriculture, Vietnam
Viet Anh Nguyen University of Lorraine, France
Miguel Pasadas University of Granada, Spain
Thi Hoai Pham Hanoi University of Science and Technology, Vietnam
Duy Nhat Phan University of Lorraine, France
Jakob Puchinger Paris-Saclay University, France
Lopez Rafael Federal University of Santa Catarina, Brazil

Sabina Rakhmetulayeva International University of Information Technology, Kazakhstan
Hagen Salewski University of Kaiserslautern, Germany
Daniel Schermer University of Kaiserslautern, Germany
Ryskhan Satybaldiyeva International University of Information Technology, Kazakhstan
Bach Tran University of Lorraine, France
Thi Thuy Tran FPT University, Vietnam
Yong Xia Beihang University, China
Xuan Thanh Vo Ho Chi Minh City University of Science, Vietnam
Baiyi Wu Guangdong University of Foreign Studies
Xiaojin Zheng Tongji University, China

Plenary Lecturers

Aharon Ben-Tal Israel Institute of Technology, Israel


Immanuel M. Bomze University of Vienna, Austria
Masao Fukushima Nanzan University, Japan
Anna Nagurney University of Massachusetts Amherst, USA
Panos M. Pardalos University of Florida, USA
Anatoly Zhigljavsky Cardiff University, UK

Special Session Organizers

1. Combinatorial Optimization: Viet Hung Nguyen (Sorbonne University, France),


Kim Thang Nguyen (Paris-Saclay University, France), and Ha Duong Phan
(Institute of Mathematics, Vietnam)
2. Recent Advances in DC programming and DCA: Theory, Algorithms and
Applications: Hoai An Le Thi and Hoai Minh Le (University of Lorraine,
France)
3. Mixed-Integer Optimization: Yi-Shuai Niu (Shanghai Jiao Tong University,
China)
4. Quadratically constrained quadratic programming & QCQP: Duan Li (City
University of Hong Kong, Hong Kong, China) and Rujun Jiang (Fudan
University, Shanghai, China)
5. Uncertainty Quantification and Optimization: Eduardo Souza de Cursi
(National Institute for Applied Sciences - Rouen, France) and Rafael Holdorf
(Federal University of Santa Catarina, Brazil)
6. Computing, Engineering and Data Science: Raissa Uskenbayeva and Sabina
Rakhmetulayeva (International Information Technology University,
Kazakhstan)

7. Complementarity Problems: Applications, Theory and Algorithms: Mounir


Haddou (National Institute for Applied Sciences - Rennes, France), Ibtihel Ben
Gharbia, and Quang Huy Tran (IFP Energies nouvelles, France)
8. Optimization Methods under Uncertainty: Manuel Arana Jimenez (University
of Cádiz, Spain)
9. Spline Approximation & Optimization with Applications: Ahmed Zidna and
Dominique Michel (University of Lorraine, France)
10. Surrogate Global Optimization for Expensive Multimodal Functions: Christine
Shoemaker (National University of Singapore, Singapore)
11. Novel Technologies and Optimization for Last-Mile Logistics: Mahdi Moeini
and Hagen Salewski (Technische Universität Kaiserslautern, Germany)
12. Sustainable Supply Chains and Logistics Networks: Daniel Roy and Sophie
Hennequin (University of Lorraine, France)

Organizing Committee Members

Hoai Minh Le University of Lorraine, France


Vinh Thanh Ho University of Lorraine, France
Bach Tran University of Lorraine, France
Viet Anh Nguyen University of Lorraine, France
Aurélie Lallemand University of Lorraine, France

Sponsoring Institutions

Réseau de Transport d’Électricité, France


Conseil régional du Grand Est, France
Metz Métropole, France
Conseil départemental de la Moselle, France
Université de Lorraine, France
Laboratoire de Génie Informatique, de Production et de Maintenance (LGIPM) -
Université de Lorraine
UFR Mathématique Informatique Mécanique Automatique - Université de Lorraine
DCA Solutions, Vietnam
Springer
Contents

Continuous Optimization
A Hybrid Simplex Search for Global Optimization with
Representation Formula and Genetic Algorithm . . . . . . . . . . . . . . . . . 3
Hafid Zidani, Rachid Ellaia, and Eduardo Souza de Cursi
A Population-Based Stochastic Coordinate Descent Method . . . . . . . . . 16
Ana Maria A. C. Rocha, M. Fernanda P. Costa, and Edite M.
G. P. Fernandes
A Sequential Linear Programming Algorithm for Continuous
and Mixed-Integer Nonconvex Quadratic Programming . . . . . . . . . . . 26
Mohand Bentobache, Mohamed Telli, and Abdelkader Mokhtari
A Survey of Surrogate Approaches for Expensive Constrained
Black-Box Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Rommel G. Regis
Adaptive Global Optimization Based on Nested Dimensionality
Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Konstantin Barkalov and Ilya Lebedev
A B-Spline Global Optimization Algorithm for Optimal Power
Flow Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Deepak D. Gawali, Bhagyesh V. Patil, Ahmed Zidna,
and Paluri S. V. Nataraj
Concurrent Topological Optimization of a Multi-component Arm
for a Tube Bending Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Federico Ballo, Massimiliano Gobbi, and Giorgio Previati
Discrete Interval Adjoints in Unconstrained Global Optimization . . . . 78
Jens Deussen and Uwe Naumann


Diving for Sparse Partially-Reflexive Generalized Inverses . . . . . . . . . . 89


Victor K. Fuentes, Marcia Fampa, and Jon Lee
Filtering Domains of Factorable Functions Using Interval
Contractors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Laurent Granvilliers
Leveraging Local Optima Network Properties for Memetic
Differential Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Viktor Homolya and Tamás Vinkó
Maximization of a Convex Quadratic Form on a Polytope:
Factorization and the Chebyshev Norm Bounds . . . . . . . . . . . . . . . . . . 119
Milan Hladík and David Hartman
New Dynamic Programming Approach to Global Optimization . . . . . . 128
Anna Kaźmierczak and Andrzej Nowakowski
On Chebyshev Center of the Intersection of Two Ellipsoids . . . . . . . . . 135
Xiaoli Cen, Yong Xia, Runxuan Gao, and Tianzhi Yang
On Conic Relaxations of Generalization of the Extended Trust
Region Subproblem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Rujun Jiang and Duan Li
On Constrained Optimization Problems Solved Using
the Canonical Duality Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Constantin Zălinescu
On Controlled Variational Inequalities Involving Convex
Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Savin Treanţă
On Lagrange Duality for Several Classes of Nonconvex
Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Ewa M. Bednarczuk and Monika Syga
On Monotone Maps: Semidifferentiable Case . . . . . . . . . . . . . . . . . . . . 182
Shashi Kant Mishra, Sanjeev Kumar Singh, and Avanish Shahi
Parallel Multi-memetic Global Optimization Algorithm for Optimal
Control of Polyarylenephthalide’s Thermally-Stimulated
Luminescence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Maxim Sakharov and Anatoly Karpenko
Proper Choice of Control Parameters for CoDE Algorithm . . . . . . . . . 202
Petr Bujok, Daniela Einšpiglová, and Hana Zámečníková
Semidefinite Programming Based Convex Relaxation for Nonconvex
Quadratically Constrained Quadratic Programming . . . . . . . . . . . . . . 213
Rujun Jiang and Duan Li

Solving a Type of the Tikhonov Regularization of the Total


Least Squares by a New S-Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Huu-Quang Nguyen, Ruey-Lin Sheu, and Yong Xia
Solving Mathematical Programs with Complementarity
Constraints with a Penalization Approach . . . . . . . . . . . . . . . . . . . . . . 228
Lina Abdallah, Tangi Migot, and Mounir Haddou
Stochastic Tunneling for Improving the Efficiency of Stochastic
Efficient Global Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Fábio Nascentes, Rafael Holdorf Lopez, Rubens Sampaio,
and Eduardo Souza de Cursi
The Bernstein Polynomials Based Globally Optimal Nonlinear
Model Predictive Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Bhagyesh V. Patil, Ashok Krishnan, Foo Y. S. Eddy, and Ahmed Zidna
Towards the Biconjugate of Bivariate Piecewise Quadratic
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Deepak Kumar and Yves Lucet
Tractable Relaxations for the Cubic One-Spherical Optimization
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Christoph Buchheim, Marcia Fampa, and Orlando Sarmiento

DC Programming and DCA


A DC Algorithm for Solving Multiobjective Stochastic Problem
via Exponential Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Ramzi Kasri and Fatima Bellahcene
A DCA-Based Approach for Outage Constrained Robust Secure
Power-Splitting SWIPT MISO System . . . . . . . . . . . . . . . . . . . . . . . . . 289
Phuong Anh Nguyen and Hoai An Le Thi
DCA-Like, GA and MBO: A Novel Hybrid Approach for Binary
Quadratic Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Sara Samir, Hoai An Le Thi, and Mohammed Yagouni
Low-Rank Matrix Recovery with Ky Fan 2-k-Norm . . . . . . . . . . . . . . 310
Xuan Vinh Doan and Stephen Vavasis
Online DCA for Times Series Forecasting Using Artificial Neural
Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Viet Anh Nguyen and Hoai An Le Thi
Parallel DC Cutting Plane Algorithms for Mixed Binary Linear
Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Yi-Shuai Niu, Yu You, and Wen-Zhuo Liu

Sentence Compression via DC Programming Approach . . . . . . . . . . . . 341


Yi-Shuai Niu, Xi-Wei Hu, Yu You, Faouzi Mohamed Benammour,
and Hu Zhang

Discrete Optimization and Network Optimization


A Horizontal Method of Localizing Values of a Linear Function
in Permutation-Based Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Liudmyla Koliechkina and Oksana Pichugina
An Experimental Comparison of Heuristic Coloring Algorithms
in Terms of Found Color Classes on Random Graphs . . . . . . . . . . . . . 365
Deniss Kumlander and Aleksei Kulitškov
Cliques for Multi-term Linearization of 0–1 Multilinear Program
for Boolean Logical Pattern Generation . . . . . . . . . . . . . . . . . . . . . . . . 376
Kedong Yan and Hong Seo Ryoo
Gaining or Losing Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Jon Lee, Daphne Skipper, and Emily Speakman
Game Equilibria and Transition Dynamics with Networks
Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
Alexei Korolev and Ilia Garmashov
Local Search Approaches with Different Problem-Specific Steps
for Sensor Network Coverage Optimization . . . . . . . . . . . . . . . . . . . . . 407
Krzysztof Trojanowski and Artur Mikitiuk
Modelling Dynamic Programming-Based Global Constraints
in Constraint Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Andrea Visentin, Steven D. Prestwich, Roberto Rossi, and Armagan Tarim
Modified Extended Cutting Plane Algorithm for Mixed Integer
Nonlinear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
Wendel Melo, Marcia Fampa, and Fernanda Raupp
On Proximity for k-Regular Mixed-Integer Linear Optimization . . . . . 438
Luze Xu and Jon Lee
On Solving Nonconvex MINLP Problems with SHOT . . . . . . . . . . . . . 448
Andreas Lundell and Jan Kronqvist
Reversed Search Maximum Clique Algorithm Based
on Recoloring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Deniss Kumlander and Aleksandr Porošin
Sifting Edges to Accelerate the Computation of Absolute 1-Center
in Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Wei Ding and Ke Qiu

Solving an MINLP with Chance Constraint Using a Zhang’s


Copula Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Adriano Delfino
Stochastic Greedy Algorithm Is Still Good: Maximizing
Submodular + Supermodular Functions . . . . . . . . . . . . . . . . . . . . . . . . 488
Sai Ji, Dachuan Xu, Min Li, Yishui Wang, and Dongmei Zhang
Towards Multi-tree Methods for Large-Scale Global Optimization . . . 498
Pavlo Muts and Ivo Nowak

Optimization under Uncertainty


Fuzzy Pareto Solutions in Fully Fuzzy Multiobjective Linear
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
Manuel Arana-Jiménez
Minimax Inequalities and Variational Equations . . . . . . . . . . . . . . . . . 518
Maria Isabel Berenguer, Domingo Gámez, A. I. Garralda–Guillem,
and M. Ruiz Galán
Optimization of Real-Life Integrated Solar Desalination Water
Supply System with Probability Functions . . . . . . . . . . . . . . . . . . . . . . 526
Bayrammyrat Myradov
Social Strategy of Particles in Optimization Problems . . . . . . . . . . . . . 537
Bożena Borowska
Statistics of Pareto Fronts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
Mohamed Bassi, E. Pagnacco, Eduardo Souza de Cursi, and R. Ellaia
Uncertainty Quantification in Optimization . . . . . . . . . . . . . . . . . . . . . 557
Eduardo Souza de Cursi and Rafael Holdorf Lopez
Uncertainty Quantification in Serviceability of Impacted Steel
Pipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
Renata Troian, Didier Lemosse, Leila Khalij, Christophe Gautrelet,
and Eduardo Souza de Cursi

Multiobjective Programming
A Global Optimization Algorithm for the Solution of Tri-Level
Mixed-Integer Quadratic Programming Problems . . . . . . . . . . . . . . . . 579
Styliani Avraamidou and Efstratios N. Pistikopoulos
A Method for Solving Some Class of Multilevel Multi-leader
Multi-follower Programming Problems . . . . . . . . . . . . . . . . . . . . . . . . . 589
Addis Belete Zewde and Semu Mitiku Kassa

A Mixture Design of Experiments Approach for Genetic Algorithm


Tuning Applied to Multi-objective Optimization . . . . . . . . . . . . . . . . . . 600
Taynara Incerti de Paula, Guilherme Ferreira Gomes,
José Henrique de Freitas Gomes, and Anderson Paulo de Paiva
A Numerical Study on MIP Approaches over the Efficient Set . . . . . . . 611
Kuan Lu, Shinji Mizuno, and Jianming Shi
Analytics-Based Decomposition of a Class of Bilevel Problems . . . . . . . 617
Adejuyigbe Fajemisin, Laura Climent, and Steven D. Prestwich
KMCGO: Kriging-Assisted Multi-objective Constrained Global
Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
Yaohui Li, Yizhong Wu, Yuanmin Zhang, and Shuting Wang
Multistage Global Search Using Various Scalarization Schemes
in Multicriteria Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . 638
Victor Gergel and Evgeniy Kozinov
Necessary Optimality Condition for Nonlinear Interval Vector
Programming Problem Under B-Arcwise Connected Functions . . . . . . 649
Mohan Bir Subba and Vinay Singh
On the Applications of Nonsmooth Vector Optimization Problems
to Solve Generalized Vector Variational Inequalities Using
Convexificators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
Balendu Bhooshan Upadhyay, Priyanka Mishra, Ram N. Mohapatra,
and Shashi Kant Mishra
SOP-Hybrid: A Parallel Surrogate-Based Candidate Search
Algorithm for Expensive Optimization on Large Parallel Clusters . . . . 672
Taimoor Akhtar and Christine A. Shoemaker
Surrogate Many Objective Optimization: Combining Evolutionary
Search, ε-Dominance and Connected Restarts . . . . . . . . . . . . . . . . . . . . 681
Taimoor Akhtar, Christine A. Shoemaker, and Wenyu Wang
Tropical Analogues of a Dempe-Franke Bilevel Optimization
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 691
Sergeĭ Sergeev and Zhengliang Liu
U Weak Slater Constraint Qualification in Nonsmooth
Multiobjective Semi-infinite Programming . . . . . . . . . . . . . . . . . . . . . . 702
Ali Sadeghieh, David Barilla, Giuseppe Caristi, and Nader Kanzi

Data science: Machine Learning, Data Analysis, Big Data


and Computer Vision
A Discretization Algorithm for k-Means with Capacity Constraints . . . 713
Yicheng Xu, Dachuan Xu, Dongmei Zhang, and Yong Zhang
A Gray-Box Approach for Curriculum Learning . . . . . . . . . . . . . . . . . 720
Francesco Foglino, Matteo Leonetti, Simone Sagratella,
and Ruggiero Seccia
A Study on Graph-Structured Recurrent Neural Networks
and Sparsification with Application to Epidemic Forecasting . . . . . . . . 730
Zhijian Li, Xiyang Luo, Bao Wang, Andrea L. Bertozzi, and Jack Xin
Automatic Identification of Intracranial Hemorrhage on CT/MRI
Image Using Meta-Architectures Improved from Region-Based
CNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
Thi-Hoang-Yen Le, Anh-Cang Phan, Hung-Phi Cao,
and Thuong-Cang Phan
Bayesian Optimization for Recommender System . . . . . . . . . . . . . . . . . 751
Bruno Giovanni Galuzzi, Ilaria Giordani, A. Candelieri, Riccardo Perego,
and Francesco Archetti
Creation of Data Classification System for Local Administration . . . . . 761
Raissa Uskenbayeva, Aiman Moldagulova, and Nurzhan K. Mukazhanov
Face Recognition Using Gabor Wavelet in MapReduce and Spark . . . 769
Anh-Cang Phan, Hung-Phi Cao, Ho-Dat Tran, and Thuong-Cang Phan
Globally Optimal Parsimoniously Lifting a Fuzzy Query Set
Over a Taxonomy Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 779
Dmitry Frolov, Boris Mirkin, Susana Nascimento, and Trevor Fenner
K-Medoids Clustering Is Solvable in Polynomial Time for a 2d
Pareto Front . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
Nicolas Dupin, Frank Nielsen, and El-Ghazali Talbi
Learning Sparse Neural Networks via ℓ0 and Tℓ1 by a Relaxed
Variable Splitting Method with Application to Multi-scale Curve
Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 800
Fanghui Xue and Jack Xin
Pattern Recognition with Using Effective Algorithms and Methods
of Computer Vision Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 810
S. B. Mukhanov and Raissa Uskenbayeva

The Practice of Moving to Big Data on the Case of the NoSQL


Database, Clickhouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820
Baktagul Imasheva, Azamat Nakispekov, Andrey Sidelkovskaya,
and Ainur Sidelkovskiy

Economics and Finance


Asymptotically Exact Minimizations for Optimal Management
of Public Finances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
Jean Koudi, Babacar Mbaye Ndiaye, and Guy Degla
Features of Administrative and Management Processes Modeling . . . . 842
Ryskhan Satybaldiyeva, Raissa Uskenbayeva, Aiman Moldagulova,
Zuldyz Kalpeyeva, and Aygerim Aitim
Optimization Problems of Economic Structural Adjustment
and Problem of Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 850
Abdykappar Ashimov, Yuriy Borovskiy, and Mukhit Onalbekov
Research of the Relationship Between Business Processes
in Production and Logistics Based on Local Models . . . . . . . . . . . . . . . 861
Raissa Uskenbayeva, Kuandykov Abu, Rakhmetulayeva Sabina,
and Bolshibayeva Aigerim
Sparsity and Performance Enhanced Markowitz Portfolios
Using Second-Order Cone Programming . . . . . . . . . . . . . . . . . . . . . . . 871
Noam Goldberg and Ishy Zagdoun
Managing Business Process Based on the Tonality of the Output
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 882
Raissa Uskenbayeva, Rakhmetulayeva Sabina, and Bolshibayeva Aigerim

Energy and Water Management


Customer Clustering of French Transmission System Operator
(RTE) Based on Their Electricity Consumption . . . . . . . . . . . . . . . . . . 893
Gabriel Da Silva, Hoai Minh Le, Hoai An Le Thi, Vincent Lefieux,
and Bach Tran
Data-Driven Beetle Antennae Search Algorithm for Electrical
Power Modeling of a Combined Cycle Power Plant . . . . . . . . . . . . . . . 906
Tamal Ghosh, Kristian Martinsen, and Pranab K Dan
Finding Global-Optimal Gearbox Designs for Battery Electric
Vehicles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 916
Philipp Leise, Lena C. Altherr, Nicolai Simon, and Peter F. Pelz

Location Optimization of Gas Power Plants by a Z-Number


Data Envelopment Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926
Farnoosh Fakhari, R. Tavakkoli-Moghaddam, M. Tohidifard,
and Seyed Farid Ghaderi
Optimization of Power Plant Operation via Stochastic Programming
with Recourse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 937
Tomoki Fukuba, Takayuki Shiina, Ken-ichi Tokoro, and Tetsuya Sato
Randomized-Variants Lower Bounds for Gas Turbines Aircraft
Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 949
Mahdi Jemmali, Loai Kayed B. Melhim, and Mafawez Alharbi
Robust Design of Pumping Stations in Water Distribution
Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 957
Gratien Bonvin, Sophie Demassey, and Welington de Oliveira

Engineering Systems
Application of PLS Technique to Optimization of the Formulation
of a Geo-Eco-Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 971
S. Imanzadeh, Armelle Jarno, and S. Taibi
Databases Coupling for Morphed-Mesh Simulations and Application
on Fan Optimal Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 981
Zebin Zhang, Martin Buisson, Pascal Ferrand, and Manuel Henner
Kriging-Based Reliability-Based Design Optimization Using
Single Loop Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 991
Hongbo Zhang, Younes Aoues, Hao Bai, Didier Lemosse,
and Eduardo Souza de Cursi
Sensitivity Analysis of Load Application Methods for Shell Finite
Element Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1001
Wilson Javier Veloz Parra, Younes Aoues, and Didier Lemosse

Transportation, Logistics, Resource Allocation and Production


Management
A Continuous Competitive Facility Location and Design Problem
for Firm Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1013
Boglárka G.-Tóth, Laura Anton-Sanchez, José Fernández,
Juana L. Redondo, and Pilar M. Ortigosa
A Genetic Algorithm for Solving the Truck-Drone-ATV Routing
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1023
Mahdi Moeini and Hagen Salewski

A Planning Problem with Resource Constraints in Health


Simulation Center . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1033
Simon Caillard, Laure Brisoux Devendeville, and Corinne Lucet
Edges Elimination for Traveling Salesman Problem Based
on Frequency K5 s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1043
Yong Wang
Industrial Symbioses: Bi-objective Model and Solution Method . . . . . . 1054
Sophie Hennequin, Vinh Thanh Ho, Hoai An Le Thi, Hajar Nouinou,
and Daniel Roy
Intelligent Solution System Towards Parts Logistics Optimization . . . . 1067
Yaoting Huang, Boyu Chen, Wenlian Lu, Zhong-Xiao Jin, and Ren Zheng
Optimal Air Traffic Flow Management with Carbon Emissions
Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1078
Sadeque Hamdan, Oualid Jouini, Ali Cheaitou, Zied Jemai,
Imad Alsyouf, and Maamar Bettayeb
Scheduling Three Identical Parallel Machines with Capacity
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1089
Jian Sun, Dachuan Xu, Ran Ma, and Xiaoyan Zhang
Solving the Problem of Coordination and Control of Multiple
UAVs by Using the Column Generation Method . . . . . . . . . . . . . . . . . 1097
Duc Manh Nguyen, Frédéric Dambreville, Abdelmalek Toumi,
Jean-Christophe Cexus, and Ali Khenchaf
Spare Parts Management in the Automotive Industry Considering
Sustainability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1109
David Alejandro Baez Diaz, Sophie Hennequin, and Daniel Roy
The Method for Managing Inventory Accounting . . . . . . . . . . . . . . . . 1119
Duisebekova Kulanda, Kuandykov Abu, Rakhmetulayeva Sabina,
and Kozhamzharova Dinara
The Traveling Salesman Drone Station Location Problem . . . . . . . . . . 1129
Daniel Schermer, Mahdi Moeini, and Oliver Wendt
Two-Machine Flow Shop with a Dynamic Storage Space
and UET Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1139
Joanna Berlińska, Alexander Kononov, and Yakov Zinder
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1149
Continuous Optimization
A Hybrid Simplex Search for Global
Optimization with Representation
Formula and Genetic Algorithm

Hafid Zidani1,2(B) , Rachid Ellaia1 , and Eduardo Souza de Cursi2


1 LERMA, Mohammed V University - Engineering Mohammedia School, Rabat, BP. 765 Ibn Sina avenue, Agdal, Morocco
hafidzidani@yahoo.fr, h.zidani@insa-rouen.fr, ellaia@emi.ac.ma
2 Laboratory of Mechanics of Normandy, National Institute for Applied Sciences - Rouen, BP. 08, université avenue, 76801 St Etienne du Rouvray Cedex, France
eduardo.souza@insa-rouen.fr

Abstract. We consider the problem of minimizing a given function
f : R^n → R on a regular non-empty closed set S. When f attains its
global minimum at exactly one point x∗ ∈ S, this minimizer can be
characterized by a representation formula involving a convenient random
variable X and a convenient function g : R^2 → R. In this paper, we
propose to use this Representation Formula (RF) to numerically gener-
ate an initial population. In order to obtain more accurate results, the
Representation Formula has been coupled with other algorithms:
• Classical Genetic Algorithm (GA). We obtain a new algorithm called
(RFGA),
• Genetic Algorithm using Nelder Mead algorithm at the mutation
stage (GANM). We obtain a new algorithm called (RFGANM),
• Nelder Mead Algorithm. We obtain a new algorithm called (RFNM).
All six algorithms (RF, GA, RFGA, GANM, RFGANM, RFNM)
were tested on 21 benchmark functions, with a complete analysis of the
effect of the different parameters of the methods. The experiments show
that RFNM is the most successful algorithm: its performance was com-
pared with that of the other algorithms and observed to be more effective,
robust, and stable.

Keywords: Global optimization · Genetic algorithm · Representation


formula · Nelder Mead algorithm

1 Introduction
In the context of the resolution of engineering problems, many optimization algo-
rithms have been proposed, tested and analyzed in the last decades. However,
optimization in engineering remains an active research field, since many real-
world engineering optimization problems remain very complex in nature and
quite difficult to be solved by the existing algorithms. The existing literature
presents intensive research efforts on some difficult points, which remain

still incompletely solved and for which only partial responses have been obtained.
Among these, we may cite: handling of non-convexity - especially when optimiza-
tion restrictions are involved, working with incomplete or erroneous evaluations
of the functions and restrictions, increasing the number of optimization variables
up to those of realistic designs in practical situations, dealing with non-regular
(discontinuous or non-differentiable) functions, and determining convenient starting
points for iterative methods (see, e.g., Floudas [5]). We observe that the difficulties con-
cerning non-convexity and the determination of starting points are connected:
efficient methods for the optimization of regular functions are often deterministic
and involve gradients, but depend strongly on the initial point - they can be
trapped by local minima if an inconvenient initial guess is used. Alternatively,
methods based on the exploration of the space of the design variables usually
involve a stochastic aspect - thus, a significant increase in the computational cost
- and are less dependent on the initial choice, but improvements in their perfor-
mance require combination with deterministic methods and may introduce a
dependence on the initial choice. This leads to the use of hybrid procedures
involving both approaches, which try to benefit from the best of each method -
for these reasons, the literature on mixed stochastic/deterministic methods has
grown in recent years [2]. Such hybrid algorithms perform better if the initial
point belongs to an attraction area of the optimum, which shows the importance
of the initial guess in optimization algorithms [8]. Hence, we would
like in this paper to use a representation formula to provide a convenient initial
guess of the solution. Let S denote a closed bounded regular domain of the n-
dimensional Euclidean space Rn , and let f be a continuous function defined on
S and taking its values on R. An unconstrained optimization problem can be
formulated, in general, as follows:

x^* = \arg\min_{x \in S} f(x), \qquad (1)

In the literature, representation formulas have been introduced in order to char-


acterize explicitly solutions of problem (1). In general, these representations
assume that S contains a single optimal point x∗ (but many local minima may
exist on S). For instance, Pincus [9] has proposed the representation formula:

x^* = \lim_{\lambda \to +\infty} \frac{\int_S x \, e^{-\lambda f(x)} \, dx}{\int_S e^{-\lambda f(x)} \, dx}.

More recently, the original representation proposed by Pincus has been refor-
mulated by Souza de Cursi [3] as follows: let X be a random variable taking its
values on S and g : R2 −→ R be a function. If these elements are conveniently
chosen, then
x^* = \lim_{\lambda \to +\infty} \frac{E\big(X \, g(\lambda, f(X))\big)}{E\big(g(\lambda, f(X))\big)} \qquad (2)

The formulation of Pincus corresponds to g(λ, s) = e^{-λs}, which is a convenient


choice. The general properties of X and g are detailed, for instance, in [4]. An
extension to infinite dimensional situations can be found in [4]. In this work,

we propose the use of the representation given by Eq. (3) hybridized with the
Nelder Mead algorithm and a genetic algorithm, for the global optimization of
multimodal functions.

2 Hybrid Simplex Search with Representation Formula and Genetic Algorithm

Hybrid methods have been introduced to keep the flexibility of the stochastic
methods and the efficiency of the deterministic ones. In our paper, the hybrid
method for solving optimization problems couples the representation formula
proposed by Pincus [9] with the Nelder Mead algorithm and a genetic algo-
rithm. The representation formula is used first to find the region containing the
global solution, based on generating finite samples of the random variable
involved in the expression and approximating the limit. For instance, we
may choose λ large enough and generate a sample by using standard random
number generators.
The generation of points can be done using either a uniform distribution or a
Gaussian one. In the case of the Gaussian distribution, when a generated trial
point lies outside S, it is projected back onto S in order to obtain an admissible point.
In order to obtain more accurate results, it is convenient to refine this first
estimate with the following algorithms:

– Classical Genetic Algorithm (GA). We obtain a new algorithm called (RFGA).
– Genetic Algorithm using Nelder Mead algorithm at the mutation stage
(GANM). We obtain a new algorithm called (RFGANM).
– Nelder Mead Algorithm. We obtain a new algorithm called (RFNM).

2.1 Representation Formula

As previously observed, if f attains its global minimum at exactly one point x∗


on S, we have
x^* = \lim_{\lambda \to +\infty} \frac{E\big(X \, g(\lambda, f(X))\big)}{E\big(g(\lambda, f(X))\big)}, \qquad (3)

where g : R^2 → R is continuous and strictly positive, s ↦ g(λ, s) is
strictly decreasing for any s ∈ f(S) and λ > 0, while X is a convenient ran-
dom variable. These conditions are fulfilled, for instance, when X is uniformly
distributed or Gaussian and g(λ, s) = e^{-λs} (which corresponds to the classical
choice of Pincus). We use these particular choices in the sequel.
A numerical implementation can be performed by taking a large fixed value
of λ in order to represent the limit λ → +∞. In order to prevent an overflow, λ
should be increased gradually up to the desired value and it may be convenient
to use positive functions f (for instance, by adding a constant to the original f ).
A finite sample of X is generated, according to a probability P - this consists

simply in generating N admissible points (x_1, x_2, ..., x_N) ∈ S - and estimations
of the means are used to approximate the exact means, which leads to

x^* \approx x^*_c = \frac{\sum_{i=1}^{N} x_i \, g(\lambda, f(x_i))}{\sum_{i=1}^{N} g(\lambda, f(x_i))}
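A minimal numerical sketch of this estimator, assuming the classical Pincus kernel g(λ, s) = e^{-λs}, a box-shaped S sampled uniformly, and a shift of f to nonnegative values in place of a gradual increase of λ, could read as follows (the names rf_estimate, n_samples and lam are illustrative, not taken from the original implementation):

```python
import numpy as np

def rf_estimate(f, lower, upper, n_samples=500, lam=50.0, rng=None):
    """Monte Carlo estimate x*_c of the representation formula, with the
    classical Pincus kernel g(lam, s) = exp(-lam * s)."""
    rng = np.random.default_rng(rng)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Sample of X uniformly distributed on the box S; a Gaussian sample
    # projected back onto S is the other option mentioned in Sect. 2.
    x = rng.uniform(lower, upper, size=(n_samples, lower.size))
    fx = np.array([f(xi) for xi in x])
    # Shift f so that it is nonnegative; together with a moderate lambda this
    # plays the role of the gradual increase of lambda advocated above.
    w = np.exp(-lam * (fx - fx.min()))        # g(lambda, f(x_i))
    return (x * w[:, None]).sum(axis=0) / w.sum()

# Usage: rough location of the global minimizer of the 2-D Rastrigin function.
if __name__ == "__main__":
    rastrigin = lambda v: 10 * v.size + np.sum(v**2 - 10 * np.cos(2 * np.pi * v))
    print(rf_estimate(rastrigin, [-5.12, -5.12], [5.12, 5.12], n_samples=2000, lam=20.0))
```

The Gaussian variant mentioned in Sect. 2 would replace the uniform draw by a Gaussian one, projecting points that fall outside S back onto the box.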

3 Test Bed
To demonstrate the efficiency and the accuracy of the hybrid algorithms, 21
typical benchmark functions of different levels of complexity and multimodality
were chosen from the global optimization literature [6].
One hundred runs have been performed for each test function to estimate
the probability of success of the used methods. The used test functions are:
Bohachevsky 1 BO1, Bohachevsky 2 BO2, Branin function BR, Camel func-
tion CA, Cosine mixture function CO, DeJoung function DE, Goldstein and
price function GO, Griewank function GR, Hansen function HN, Hartman 3
function HR3, Hartman 6 function HR6, Rastrigin function RA, Rosenbrock
function RO, Shekel 5 SK5, Shekel 7 SK7, Shekel 10 SK10, Shubert 1 func-
tion SH1, Shubert 2 function SH2, Shubert 3 function SH3, Shubert 4 function
SH4 and Wolfe nondifferentiable function WO.

4 Numerical Results
In this section we focus on the efficiency of the six algorithms, i.e. Representation
Formula (RF), Classical Genetic Algorithm (GA), Representation Formula with
GA (RFGA), Genetic Algorithm using Nelder Mead algorithm at the mutation
stage (GANM), Representation Formula with GA and Nelder Mead (RFGANM),
and Representation Formula with Nelder Mead (RFNM).
A series of experiments has been carried out to analyze their performance.
To avoid attributing the optimization results to a particular choice of conditions
and to conduct fair comparisons, we have performed each test 100 times, starting
from various randomly selected points in the hyper-rectangular search domain.
The parameters used in the genetic algorithm are: population size from 2 to 50,
mutation rate 0.2, and rank-weighting selection; the stopping criteria are the
maximum number of iterations (set to 2000 for GA) and the maximum number
of consecutive iterations without improvement of the solution (set to 1000 for GA).
Concerning NM, we adopted the standard parameters recommended by the
authors; the stopping criteria used are: maximum number of function evaluations
maxfun = 50000, maximum number of iterations maxiter = 10000, termination
tolerance on the function value tolf = 10^{-5}, and termination tolerance on x,
tolx = 10^{-6}.
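For concreteness, these stopping parameters map directly onto the options of an off-the-shelf simplex search; the sketch below uses SciPy's Nelder-Mead implementation as a stand-in for the one used in the paper (maxfev, maxiter, fatol and xatol are SciPy's option names; the values are those quoted above). Started from the RF estimate of Sect. 2.1, this refinement step is essentially the RFNM scheme.

```python
from scipy.optimize import minimize

def nm_refine(f, x0):
    """Refine a starting point (e.g. the RF estimate x*_c) with a Nelder-Mead search."""
    return minimize(
        f, x0, method="Nelder-Mead",
        options={
            "maxfev": 50000,   # maximum number of function evaluations (maxfun)
            "maxiter": 10000,  # maximum number of iterations
            "fatol": 1e-5,     # termination tolerance on the function value (tolf)
            "xatol": 1e-6,     # termination tolerance on x (tolx)
        },
    )

# RFNM, in essence: x0 = rf_estimate(...) from the Sect. 2.1 sketch, then nm_refine(f, x0).
```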

Extensive experiments concerning the effect of different parameters have
been performed: influence of the Pincus function, influence of the population size
for GA, influence of the sample size for RF, and comparison with other methods.
Because of the limitation of space, only a few experiments are presented, chosen
in order to illustrate significant aspects.
The following abbreviations are introduced to lighten the text:
TestF: Test Function; Dim: Dimension; PopGA: Genetic algorithm popula-
tion size; SS: Sample size used in RF; SR: Success rate; SD: Standard deviation;
CPUT: CPU time (in seconds); NEvalF: number of function evaluations. The
reported results are in terms of the rate of successful optimizations (SR), the
standard deviation, the CPU time and the average number of function evaluations.
The term SR is the number of successful runs, i.e. runs in which the algorithm
generates a solution with the required accuracy, where the 'required accuracy' is a
given maximum value calculated as the absolute difference between the solution
found and the global optimum, divided by the global optimum (when it is nonzero).
The chosen accuracy for all the tests is 10^{-4}.
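Read literally, a run is counted as successful when the following check holds (a sketch; the 10^{-4} threshold is the one stated above, and the names f_found and f_star are illustrative):

```python
def is_success(f_found, f_star, tol=1e-4):
    """Success criterion for SR: relative error w.r.t. the known global optimum,
    or absolute error when the global optimum is zero."""
    if f_star != 0:
        return abs(f_found - f_star) / abs(f_star) <= tol
    return abs(f_found) <= tol
```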

4.1 Influence of the Pincus Function


In the representation formula, the Pincus expression corresponds to g(λ, s) =
e^{-λs}, which is a convenient choice. In our experiments, four other descent func-
tions (continuous and strictly decreasing) have been used for solving the bench-
mark functions with the proposed algorithms: \frac{1}{\lambda s^3}, \frac{1}{\lambda \ln(s)}, e^{-\lambda/s^3}, and 10^{-\lambda s}.
The tests show that the choice of the function has no significant effect on
the solution quality or the algorithms' performance; only a small difference in
terms of execution time has been observed.
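In the estimator sketched after Sect. 2.1, the kernel g is simply a pluggable callable, so such a comparison amounts to swapping one positive, strictly decreasing function for another. The candidates below are representative choices in the spirit of those listed above; the exact expressions are only partially legible in our source, so they should be read as illustrative rather than as the authors' exact definitions:

```python
import numpy as np

# Candidate kernels g(lam, s); each is positive and strictly decreasing in s
# on the relevant range (f is shifted to be positive, as noted in Sect. 2.1).
KERNELS = {
    "pincus": lambda lam, s: np.exp(-lam * s),          # classical Pincus choice
    "power":  lambda lam, s: 1.0 / (lam * s**3),        # requires s > 0
    "log":    lambda lam, s: 1.0 / (lam * np.log(s)),   # requires s > 1
    "exp10":  lambda lam, s: 10.0 ** (-lam * s),
}
```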

4.2 Influence of the Population Size Used in GA


To examine the effect of the population size on the solution quality as well as
the computational efforts for acquiring the optimal solution using GA, GANM,
RFGA and RFGANM, eight levels of size (2, 4, 6, 8, 12, 16, 20 and 50) are exam-
ined and the experimental results are reported in Table 1 for GA and GANM
methods and Table 2 for RFGA and RFGANM. The sample size for RF depends
on the function dimension: SS = 60 for Dim = 1 or 2, SS = 100 for Dim = 3,
SS = 300 for Dim = 4 or 5, and SS = 500 for Dim = 6 or 10. One general
observation is that as the population size increases, all of the methods require more
time (a larger number of function evaluations) and tend to find better solutions, as
indicated by a smaller standard deviation and a higher success rate. In the case of
a more complex problem (the Rosenbrock function), GA and RFGA failed to find
a solution with the required accuracy (SR = 0%) even for PopGA = 200; only a
slight improvement is obtained by the representation formula (RFGA).
A worsening of the solution obtained by GA is observed for GANM (except
for SK10, BO2, GR and RO). The results obtained by RFGANM are the best in
terms of success rate (SR = 100% for PopGA = 50), at the cost of additional
function evaluations.

Table 1. Influence of the population size for GA and GANM

TestF Dim PopGA | GA: SR SD CPUT NEvalF | GANM: SR SD CPUT NEvalF
SH4 4 12 60% 2,13E-02 0,35 23 613 4% 2,68E-01 14,88 213 782
SH4 4 20 94% 2,13E-02 0,43 39 860 19% 3,16E-01 25,05 360 290
SH4 4 50 100% 5,77E-06 0,72 98 557 77% 1,11E-01 74,01 1 060 096
SK10 4 12 27% 3,52E-01 0,80 23 812 83% 2,08E-01 68,77 218 926
SK10 4 20 33% 3,33E-01 0,88 39 762 94% 1,22E-01 120,73 381 753
SK10 4 50 61% 3,31E-01 1,12 95 898 100% 5,64E-08 341,22 1 079 693
BO2 2 12 24% 1,07E-01 0,23 20 562 100% 5,72E-11 0,29 6 359
BO2 2 20 53% 1,06E-01 0,25 31 919 100% 2,68E-11 0,48 10 824
BO2 2 50 85% 7,61E-02 0,37 72 035 100% 3,47E-12 1,24 28 147
CA 2 12 93% 1,00E-04 0,24 20 489 98% 1,11E-01 3,39 64 233
CA 2 20 100% 1,39E-05 0,28 33 515 100% 3,72E-08 5,74 109 435
CA 2 50 100% 6,06E-06 0,43 77 729 100% 3,72E-08 15,85 303 313
CO 4 12 100% 1,16E-05 0,30 23 818 29% 2,46E-01 1,87 35 774
CO 4 20 100% 1,22E-06 0,35 39 336 65% 2,02E-01 1,91 38 478
CO 4 50 100% 1,09E-06 0,56 93 199 100% 6,88E-08 2,41 54 340
GR 5 12 0% 1,87E-01 0,33 23 998 93% 4,17E-02 24,97 376 071
GR 5 20 1% 1,75E-01 0,39 39 884 99% 1,46E-02 42,64 642 752
GR 5 50 44% 1,18E-01 0,67 102 050 100% 1,95E-13 113,94 1 716 703
HN 2 12 81% 5,06E-02 0,27 21 712 52% 2,62E-01 4,69 80 671
HN 2 20 94% 3,85E-02 0,32 36 018 76% 1,29E-01 8,29 142 888
HN 2 50 100% 3,19E-06 0,47 82 607 99% 1,76E-02 22,37 384 416
RA 2 12 27% 5,05E-03 0,22 19 169 51% 1,24E+00 1,35 26 337
RA 2 20 64% 3,62E-04 0,26 32 278 67% 6,11E-01 1,57 31 274
RA 2 50 93% 1,11E-04 0,34 67 520 98% 1,40E-01 1,61 35 270
RO 10 12 0% 3,45E+01 0,33 24 012 100% 7,88E-12 44,78 683 973
RO 10 20 0% 3,37E+01 0,40 40 020 100% 5,49E-12 75,83 1 161 545
RO 10 50 0% 2,92E+01 0,69 102 050 100% 2,99E-12 200,54 3 071 508
WO 2 12 1% 4,63E-02 0,23 20 512 100% 0,00E+00 2,04 39 624
WO 2 20 6% 2,59E-02 0,25 31 105 100% 0,00E+00 3,55 69 076
WO 2 50 43% 1,07E-02 0,37 70 768 100% 0,00E+00 9,42 183 422

4.3 Influence of the Sample Size for RF

To investigate the effect of the sample size used in the representation formula on the search quality of the hybrid algorithms, we choose six levels of sample size (SS = 60, 100, 500, 1000, 2000 and 5000) and set the population size to 6. The experimental results are reported in Table 3. Because of space limitations, only 10 test functions are presented, chosen to illustrate the significant aspects.
We observe that the success rate and the number of function evaluations increase with the sample size. The RF method failed in almost all the tests (SR = 0% in most cases) except for the HR3 function (SR = 100% for SS =

Table 2. Influence of the population size for RFGA and RFGANM

TestF Dim PopGA RFGA RFGANM


SR SD CPUT NEvalF SR SD CPUT NEvalF
SH4 4 12 50% 5,08E-02 0,57 41 613 90% 6,42E-02 14,82 228 329
SH4 4 20 89% 2,13E-02 0,80 69 809 98% 3,00E-02 25,57 391 947
SH4 4 50 100% 7,11E-06 1,63 172 398 100% 5,05E-08 70,56 1 072 064
SK10 4 12 45% 2,64E-01 1,12 41 626 98% 7,16E-02 68,23 234 259
SK10 4 20 84% 1,95E-01 1,41 69 439 100% 5,64E-08 117,89 401 052
SK10 4 50 94% 1,22E-01 2,45 170 297 100% 5,64E-08 342,74 1 155 194
BO2 2 12 34% 9,73E-02 0,26 24 310 98% 3,07E-02 0,36 10 864
BO2 2 20 62% 8,04E-02 0,29 38 068 100% 3,02E-11 0,50 16 495
BO2 2 50 92% 3,74E-02 0,43 82 245 100% 4,28E-12 1,27 41 913
CA 2 12 96% 4,03E-05 0,27 23 890 100% 3,72E-08 3,35 66 514
CA 2 20 99% 2,33E-05 0,31 38 013 100% 3,72E-08 5,72 114 142
CA 2 50 100% 4,52E-06 0,49 84 487 100% 3,72E-08 15,76 314 430
CO 4 12 100% 1,52E-05 0,41 41 790 100% 6,88E-08 0,58 28 515
CO 4 20 100% 1,36E-06 0,56 69 246 100% 6,88E-08 0,97 47 799
CO 4 50 100% 4,89E-07 1,07 170 920 100% 6,88E-08 2,52 122 363
GR 5 12 0% 1,76E-01 0,47 42 012 74% 9,71E-02 25,04 392 890
GR 5 20 0% 1,62E-01 0,64 70 011 88% 5,05E-02 42,92 673 123
GR 5 50 46% 1,24E-01 1,27 177 050 100% 2,13E-13 114,24 1 787 308
HN 2 12 97% 4,53E-05 0,32 25 478 100% 6,37E-08 4,69 83 688
HN 2 20 100% 1,31E-05 0,39 40 485 100% 6,37E-08 8,08 144 040
HN 2 50 100% 1,68E-06 0,65 93 454 100% 6,37E-08 22,25 393 771
RA 2 12 26% 1,43E-03 0,25 23 131 95% 2,18E-01 0,39 11 430
RA 2 20 53% 5,02E-04 0,29 36 968 98% 1,40E-01 0,56 17 506
RA 2 50 95% 7,22E-05 0,44 83 305 100% 1,41E-11 1,28 42 039
RO 10 12 0% 1,97E+00 0,60 54 012 100% 8,42E-12 43,90 696 293
RO 10 20 0% 9,84E-01 0,85 90 020 100% 4,57E-12 74,82 1 188 783
RO 10 50 0% 6,08E-01 1,80 227 050 100% 2,36E-12 197,37 3 130 515
WO 2 12 0% 6,09E-02 0,27 23 663 100% 0,00E+00 2,05 42 744
WO 2 20 6% 2,96E-02 0,30 35 050 100% 0,00E+00 3,52 73 376
WO 2 50 46% 6,06E-03 0,50 82 856 100% 0,00E+00 9,55 198 171

5000), CA (SR = 97% for SS = 5000), SH1 (SR = 100% for SS = 60) and SH2 (SR = 100% for SS = 5000) with an accuracy of 10^{-3}.
Regarding the RFGANM algorithm, the results are improved (SR = 100%) as soon as SS ≥ 1000, except for SH4, SK10 and GR. A success rate of 100% is obtained for RFNM even for the sample size SS = 60.

4.4 Comparison Between Methods

The results presented in Tables 4 and 5 are based on PopGA = 12 for the GA and GANM algorithms, and PopGA = 6 for the others. The sample size chosen for

Table 3. Influence of the sample size used in RF

TestF Dim RF Size RF SR RF Neval RFGA RFGA RFGANM RFGANM RFNM RFNM
SR Neval SR Neval SR Neval
SH4 4 60 0% 1 800 7% 15 460 38% 103 489 43% 7 981
SH4 4 100 0% 3 000 9% 16 892 34% 104 957 47% 9 137
SH4 4 500 0% 15 000 8% 28 885 74% 117 804 100% 21 010
SH4 4 5000 0% 150 000 10% 163 800 97% 252 327 100% 155 899
SK10 4 60 0% 1 800 0% 15 470 78% 108 783 100% 8 002
SK10 4 100 0% 3 000 2% 16 669 72% 110 233 100% 9 209
SK10 4 500 0% 15 000 4% 28 783 91% 121 826 100% 21 198
SK10 4 5000 0% 150 000 1% 163 758 99% 255 924 100% 155 946
BO2 2 60 0% 1 800 10% 14 320 83% 9 338 100% 4 421
BO2 2 100 0% 3 000 9% 15 477 78% 11 953 100% 5 532
BO2 2 500 0% 15 000 14% 26 791 92% 20 139 100% 17 332
BO2 2 5000 0% 150 000 17% 160 866 100% 152 901 100% 152 160
CA 2 60 1% 1 800 68% 13 957 100% 32 760 100% 4 338
CA 2 100 1% 3 000 71% 15 106 100% 33 982 100% 5 565
CA 2 500 1% 15 000 71% 27 101 100% 45 952 100% 17 481
CA 2 5000 11% 150 000 73% 160 307 100% 180 812 100% 152 405
CO 4 60 0% 1 800 58% 15 421 92% 8 142 96% 8 990
CO 4 100 0% 3 000 50% 16 752 93% 9 167 96% 10 219
CO 4 500 0% 15 000 43% 28 507 100% 20 130 100% 22 540
CO 4 5000 0% 150 000 52% 163 426 100% 155 101 100% 157 744
GR 5 60 0% 1 800 0% 15 685 43% 183 996 97% 27 227
GR 5 100 0% 3 000 0% 16 866 32% 185 066 100% 28 212
GR 5 500 0% 15 000 0% 28 872 49% 196 576 100% 40 228
GR 5 5000 0% 150 000 0% 163 962 33% 330 770 100% 172 940
HN 2 60 1% 1 800 63% 14 338 100% 41 534 100% 4 008
HN 2 100 2% 3 000 63% 15 387 100% 42 604 100% 5 252
HN 2 500 1% 15 000 57% 27 103 100% 54 571 100% 17 107
HN 2 5000 14% 150 000 72% 161 323 100% 189 857 100% 152 137
RA 2 60 0% 1 800 7% 13 856 64% 10 726 93% 4 054
RA 2 100 0% 3 000 13% 15 105 79% 9 321 100% 5 256
RA 2 500 0% 15 000 8% 26 663 96% 18 566 100% 17 306
RA 2 5000 1% 150 000 5% 160 487 100% 152 868 100% 152 329
RO 10 60 0% 1 800 0% 15 806 99% 326 193 96% 133 508
RO 10 100 0% 3 000 0% 16 979 99% 328 651 100% 133 634
RO 10 500 0% 15 000 0% 29 006 100% 341 868 100% 146 056
RO 10 5000 0% 150 000 0% 163 858 100% 477 806 100% 281 391
WO 2 60 0% 1 800 0% 14 399 100% 21 077 100% 6 467
WO 2 100 0% 3 000 0% 15 090 100% 22 260 100% 7 567
WO 2 500 0% 15 000 0% 27 260 100% 34 235 100% 19 385
WO 2 5000 0% 150 000 0% 160 978 100% 169 121 100% 154 137

the test functions depends on their dimension: SS = 60 for Dim = 1 or 2, 100 for Dim = 3, 500 for Dim = 4 or 5, 1000 for Dim = 6 and 2000 for Dim = 10.
Tables 4 and 5 summarize the results (i.e. SR, SD, CPUT and NEvalF) obtained from the 100 runs of the six algorithms on the 21 benchmark functions.
Table 4. Comparison between methods: GA, GANM and RF

Num FTest Dim GA GANM RF


SR SD CPUT NEvalF SR SD CPUT NEvalF SR SD CPUT NEvalF
1 BO1 2 52% 5,09E-04 0,22 19 653 100% 4,97E-11 0,29 6 422 0% 2,66E-01 0,01 1 800
2 BO2 2 24% 1,07E-01 0,23 20 562 100% 5,72E-11 0,29 6 359 0% 1,74E-01 0,01 1 800
3 BR 2 50% 6,22E-04 0,25 22 138 100% 5,11E-08 4,49 86 543 0% 6,27E-02 0,01 1 800
4 CA 2 93% 1,00E-04 0,24 20 489 98% 1,11E-01 3,39 64 233 1% 4,65E-02 0,01 1 800
5 CO 4 100% 1,16E-05 0,30 23 818 29% 2,46E-01 1,87 35 774 0% 1,05E-01 0,10 15 000
6 DE 3 99% 2,30E-05 0,27 23 214 100% 2,76E-12 0,36 8 530 0% 1,02E-01 0,02 3 000
7 GO 2 62% 2,23E-04 0,33 22 190 98% 1,27E+00 2,60 39 874 0% 3,79E-01 0,01 1 800
8 GR 5 0% 1,87E-01 0,33 23 998 93% 4,17E-02 24,97 376 071 0% 2,74E+01 0,14 15 000
9 HN 2 81% 5,06E-02 0,27 21 712 52% 2,62E-01 4,69 80 671 1% 5,52E-02 0,02 1 800
10 HR3 3 100% 1,58E-06 0,31 22 829 100% 6,43E-08 9,22 131 457 0% 4,22E-03 0,02 3 000
11 HR6 6 66% 1,71E-02 0,33 24 012 98% 5,05E-03 32,29 397 117 0% 2,02E-02 0,27 30 000
12 RA 2 27% 5,05E-03 0,22 19 169 51% 1,24E+00 1,35 26 337 0% 6,81E-01 0,01 1 800
13 RO 10 0% 3,45E+01 0,33 24 012 100% 7,88E-12 44,78 683 973 0% 3,69E+00 2,69 60 000
14 SH1 1 99% 1,97E-05 0,16 17 519 96% 6,66E-02 1,99 41 092 67% 1,38E-04 0,03 1 800
15 SH2 2 95% 3,81E-05 0,26 20 779 62% 2,69E-01 4,77 81 145 0% 5,03E-02 0,02 1 800
16 SH3 3 69% 5,46E-02 0,32 23 017 25% 3,28E-01 9,75 149 699 0% 1,41E-01 0,05 3 000
17 SH4 4 60% 2,13E-02 0,35 23 613 4% 2,68E-01 14,88 213 782 0% 1,40E-01 0,15 15 000
18 SK10 4 27% 3,52E-01 0,80 23 812 83% 2,08E-01 68,77 218 926 0% 1,05E-01 0,20 15 000
19 SK5 4 25% 3,39E-01 0,54 23 623 96% 1,12E-01 40,24 217 139 0% 8,64E-02 0,23 15 000
20 SK7 4 21% 3,29E-01 0,65 23 726 90% 1,58E-01 51,66 217 461 0% 1,21E-01 0,23 15 000
21 WO 2 1% 4,63E-02 0,23 20 512 100% 0,00E+00 2,04 39 624 0% 3,14E+00 0,02 1 800
Table 5. Comparison between methods: RFGA, RFGANM and RFNM

Num FTest Dim RFGA RFGANM RFNM


SR SD CPUT NEvalF SR SD CPUT NEvalF SR SD CPUT NEvalF
1 BO1 2 19% 6,64E-03 0,23 13 842 82% 1,65E-01 0,29 7 460 99% 4,13E-02 0,15 4 349
2 BO2 2 10% 1,09E-01 0,23 14 320 83% 8,88E-02 0,39 9 338 100% 4,49E-11 0,15 4 421
3 BR 2 27% 2,63E-03 0,22 13 463 100% 5,11E-08 2,19 43 712 100% 5,11E-08 0,14 4 202
4 CA 2 68% 2,99E-04 0,23 13 957 100% 3,72E-08 1,65 32 760 100% 3,72E-08 0,15 4 338
5 CO 4 43% 2,20E-04 0,36 28 507 100% 6,88E-08 0,33 20 130 100% 6,88E-08 0,53 22 540
6 DE 3 78% 5,48E-05 0,25 16 260 100% 4,99E-12 0,19 7 073 100% 2,25E-12 0,29 8 271
7 GO 2 25% 1,11E-03 0,32 14 649 100% 0,00E+00 1,27 20 873 100% 0,00E+00 0,17 4 212
8 GR 5 0% 2,41E-01 0,43 28 872 49% 2,04E-01 12,12 196 576 90% 5,05E-02 1,77 40 228
9 HN 2 63% 1,76E-02 0,27 14 338 100% 6,37E-08 2,33 41 534 100% 6,37E-08 0,16 4 008
10 HR3 3 100% 1,34E-05 0,30 16 290 100% 6,43E-08 4,37 65 699 100% 6,43E-08 0,33 7 570
11 HR6 6 35% 1,74E-02 0,58 43 874 73% 1,60E-02 16,81 231 152 100% 2,00E-08 1,76 48 337
12 RA 2 7% 9,79E-03 0,22 13 856 64% 5,03E-01 0,46 10 726 93% 2,55E-01 0,13 4 054
13 RO 10 0% 8,02E-01 2,98 73 994 99% 3,99E-01 23,75 386 100 98% 2,41E-03 11,24 190 569
14 SH1 1 88% 9,12E-05 0,15 9 942 100% 4,91E-08 1,00 21 925 100% 4,91E-08 0,08 2 756
15 SH2 2 51% 4,18E-04 0,27 14 308 100% 3,48E-08 2,29 40 642 100% 3,48E-08 0,16 4 047
16 SH3 3 36% 3,66E-02 0,33 16 512 93% 7,48E-02 4,72 75 502 99% 3,38E-02 0,29 6 771
17 SH4 4 8% 7,18E-02 0,46 28 885 74% 1,38E-01 7,27 117 804 83% 1,05E-01 0,56 21 010
18 SK10 4 4% 2,57E-01 0,96 28 783 91% 1,51E-01 34,07 121 826 100% 5,64E-08 2,16 21 198
19 SK5 4 3% 2,57E-01 0,74 28 858 83% 1,88E-01 20,33 122 398 100% 0,00E+00 1,37 21 129
20 SK7 4 2% 2,72E-01 0,84 28 693 91% 1,51E-01 25,71 121 771 100% 7,07E-08 1,70 21 206
21 WO 2 0% 1,17E-01 0,24 14 399 100% 0,00E+00 1,01 21 077 100% 0,00E+00 0,26 6 467

From Tables 4 and 5, we notice that:

– The success rates for GA are generally modest (4% to 70%, and SR = 0% for GR and RO), except in the case of SH1 and HR3 (80% and 90% respectively).
– The results are improved for GANM, for which the success rate is 100% for 7 test functions and SR ≥ 90% for 6 test functions. These results are similar to those obtained in [1].
– The representation formula RF has failed for almost all tests, except for SH1 where SR = 67% (SR = 100% for an accuracy of 10^{-3}). We notice that the number of function evaluations increased considerably.
– The accuracy of RFGA is lower than that of GANM, with a larger number of function evaluations. Total success is obtained only for 5 functions (SR ≥ 99%). The number of function evaluations is similar to that of GA.
– In view of the overall effectiveness and efficiency of the algorithms, the RFGANM hybrid approach remains the closest competitor to RFNM among the other methods. Indeed, the SR is 100% for 12 test functions, and SR ≥ 82% for 18 functions (SR = 100% for PopGA = 50, see Table 2 for all the test functions). The number of function evaluations is similar to that of the RFGA method.
– Concerning RFNM, the experimental data obtained from the 21 test functions show high accuracy for all the test problems, with a 100% rate of successful performance on all examples and a smaller number of function evaluations than RFGANM and RFGA.

4.5 Comparison with Other Methods

In this section, the experiments aim to compare the performance of RFNM against the five global optimization methods listed below (Table 6). In order to make the results comparable, all the conditions are set to the same values (100 runs, and the accuracy for SR is set to 10^{-4}).
In the previous tests, we used the same parameters for all test functions. The results showed that the RFNM method is robust (SR = 100%). To compare it with the algorithms listed in the table, we took appropriate settings for each function.

Table 6. Comparison with other methods, list of methods

Methods References
Representation Formula with Nelder Mead (RFNM) This work
Enhanced Continuous Tabu Search (ECTS) [2]
Staged Continuous Tabu Search (SCTS) [7]
Continuous Hybrid Algorithm (CHA) [7]
Differential Evolution (DE) [7]
LPτ NM Algorithm (LPτ NM) [7]

Table 7. Comparison with other methods in terms of success rate and number of
function evaluations

TestF(Dim) LPτ NM SCTS CHA DE RFNM


NEvalF(SR) NEvalF(SR) NEvalF(SR) NEvalF(SR) NEvalF(SR)
SH2(2) 303(85%) 370(100%) 345(100%) 4498(95%) 677(100%)
GO(2) 182(100%) 231(100%) 259(100%) 595(100%) 286(100%)
BR(2) 247(100%) 245(100%) 295(100%) 807(100%) 165(100%)
HR3(3) 292(100%) 548(100%) 492(100%) 679(100%) 312(100%)
SK10(4) 1079(96%) 898(75%) 635(85%) -(-%) 1680(100%)
SK7(4) 837(100%) 910(80%) 620(85%) 3064(100%) 1059(100%)
SK5(4) 839(100%) 825(75%) 698(85%) 3920(100%) 874(100%)
HR6(6) 1552(100%) 1520(100%) 930(100%) -(-%) 2551(100%)
RO(10) 9188(88%) 15720(85%) 14532(83%) 54134(100%) 42572(100%)

Table 7 shows that RFNM performs better than the other algorithms for three functions (BR, SK10, RO), with a success rate of 100%, and requires some additional function evaluations in some other cases.

5 Conclusion

In this paper, we proposed a new approach based on a representation formula for solving global optimization problems. Simulated experiments on the optimization of nonlinear multimodal and nondifferentiable functions using this representation formula, hybridized with the genetic algorithm and the Nelder–Mead algorithm, showed that RFNM is superior to the other methods in finding the global optimum. RFGANM remains robust for high values of the population size in GA, at the cost of a larger number of function evaluations.
Extensive experiments concerning the effect of different parameters have been performed: influence of the choice of the probability distribution, influence of the Pincus function, influence of the population size for GA, and influence of the sample size for RF. A comparison with other algorithms suggested in the literature shows that RFNM is in general much superior in efficiency. Further research will include combining the representation formula with other stochastic algorithms, such as the particle swarm optimization algorithm.

References
1. Chelouah, R., Siarry, P.: Genetic and Nelder-Mead algorithms hybridized for a more
accurate global optimization of continuous multiminima functions. Eur. J. Oper.
Res. 148(2), 335–348 (2003)
2. Chelouah, R., Siarry, P.: A hybrid method combining continuous tabu search and Nelder–Mead simplex algorithms for the global optimization of multiminima functions. Eur. J. Oper. Res. 161(3), 636–654 (2005)

3. Souza de Cursi, J.: Representation of solutions in variational calculus. In: Variational


Formulations in Mechanics: Theory and Applications, pp. 87–106 (2007)
4. Souza de Cursi, J., El Hami, A.: Representation of solutions in continuous optimiza-
tion. Int. J. Simul. Multidiscip. Design Optim. (2009)
5. Floudas, C.A., Pardalos, P.M. (eds.): Encyclopedia of Optimization. Springer,
Boston (2009)
6. Gaviano, M., Kvasov, D., Lera, D., Sergeyev, Y.D.: Software for generation of classes
of test functions with known local and global minima for global optimization. ACM
Trans. Math. Softw. 29(4), 469–480 (2003)
7. Georgieva, A., Jordanov, I.: A hybrid meta-heuristic for global optimisation using low-discrepancy sequences of points. Comput. Oper. Res. 37(3), 456–469 (2010)
8. Ivorra, B., Mohammadi, B., Ramos, A.M., Redont, I.: Optimizing initial guesses
to improve global minimization. In: Pre-Publication of the Department of Applied
Mathematics MA-UCM-UCM-No 2008-06 - Universidad Complutense de Madrid, 3
Plaza de Ciencias, 28040, Madrid, Spain. p. 17 (2008)
9. Pincus, M.: A closed formula solution of certain programming problems. Oper. Res.
16(3), 690–694 (1968)
A Population-Based Stochastic
Coordinate Descent Method

Ana Maria A. C. Rocha1,2(B) , M. Fernanda P. Costa3,4 , and


Edite M. G. P. Fernandes1
1 ALGORITMI Center, University of Minho, 4710-057 Braga, Portugal
{arocha,emgpf}@dps.uminho.pt
2 Department of Production and Systems, University of Minho, Braga, Portugal
3 Centre of Mathematics, University of Minho, 4710-057 Braga, Portugal
mfc@math.uminho.pt
4 Department of Mathematics, 4800-058 Guimarães, Portugal

Abstract. This paper addresses the problem of solving a bound con-


strained global optimization problem by a population-based stochastic
coordinate descent method. To improve efficiency, a small subpopula-
tion of points is randomly selected from the original population, at each
iteration. The coordinate descent directions are based on the gradient
computed at a special point of the subpopulation. This point could be
the best point, the center point or the point with highest score. Prelimi-
nary numerical experiments are carried out to compare the performance
of the tested variants. Based on the results obtained with the selected
problems, we may conclude that the variants based on the point with the highest score are the most robust and the variants based on the best point the least robust, although the latter win on efficiency, but only for the simpler and easier to solve problems.

Keywords: Global optimization · Stochastic coordinate descent

1 Introduction
Optimization methods for solving problems that involve large amounts of data, as in large-scale machine learning, can make use of classical gradient-based methods, namely the full gradient, the accelerated gradient and the conjugate gradient methods, classified as batch approaches [1]. Using intuitive schemes to reduce the amount of information processed at each step, stochastic gradient approaches have been shown to be more efficient than batch methods. An appropriate approach to solve this type of problem is through coordinate descent methods. Despite being among the first optimization methods to appear in the literature, they have received much attention recently. Although the global optimization (GO) problem addressed in this paper does not involve a large amount of data, the solution method proposed herein is iterative, stochastic and relies on a population of candidate solutions at each iteration. Thus, a large amount of computation may be required at each iteration. To improve efficiency, we borrow some of the ideas that are present in

machine learning techniques and propose a population-based stochastic coordinate descent method. This paper is a follow-up to the work presented in [2].
We consider the problem of finding a global solution of a bound constrained
nonlinear optimization problem in the following form:

min f(x)  subject to  x ∈ Ω,    (1)

where f : Rn → R is a nonlinear function and Ω = {x ∈ Rn : −∞ < li ≤


xi ≤ ui < ∞, i = 1, . . . , n} is a bounded feasible region. We assume that the
objective function f is differentiable, nonconvex and may possess many local
minima in the set Ω. We assume that the optimal set X ∗ of the problem (1) is
nonempty and bounded, x∗ is a global minimizer and f ∗ represents the global
optimal value. To solve the GO problem (1), a stochastic or a deterministic
method may be selected. A stochastic method provides a solution, in general in
a short CPU time, although it may not be globally optimal. On the other hand,
a deterministic method is able to compute an interval that contains the global
optimal solution, but requires a much larger computational effort [3]. To generate
good solutions with less computational effort and time, approximate methods
or heuristics may be used. Some heuristics use random procedures to generate
candidate solutions and perform a series of operations on those solutions in order
to find different and hopefully better solutions. They are known as stochastic
heuristics. A method for GO has two main goals. One intends to explore the
search domain for the region where the global optimal solution lies, the other
intensifies the search in a vicinity of a promising region in order to compute a
high quality approximation.
This paper aims to present a practical study involving several variants of a
population-based stochastic method for solving the GO problem (1). Since our
goal is to make the method robust and as efficient as possible, a strategy based
on coordinate descent directions is applied. Although a population of candidate
solutions/points of large size is initially generated, only a very small subset of
those points is randomly selected, at each iteration – henceforward denoted as
subpopulation – to provide an appropriate approximation, at the end of each
iteration. Since robustness of the method is to be privileged, the point of each
subpopulation that is used to define the search direction to move each point
of the subpopulation is carefully chosen in order to potentiate both exploration
and exploitation abilities of the method. The point with the highest score of the
subpopulation is proposed. A comparison with the best and the center points is
also carried out.
This paper is organized as follows. Section 2 briefly presents the coordinate
descent method and Sect. 3 describes the herein proposed stochastic coordinate
descent method when applied to a population of points. Finally, Sect. 4 contains
the results of our preliminary numerical experiments and we conclude the paper
with the Sect. 5.

2 Coordinate Descent Method


This section briefly presents the coordinate descent method (CDM) and its
stochastic variant. The CDM operates by taking steps along the coordinate direc-
tions [1,4]. Hence, the search direction for minimizing f from the iterate xk , at
iteration k, is defined as
dk = −∇ik f (xk )eik (2)
where ∇ik f (·) represents the component ik of the gradient of f , eik represents
the ik th coordinate vector for some index ik , usually chosen by cycling through
{1, 2, . . . , n}, and xik is the ik th component of the vector x ∈ Rn . For a positive
step size, αk , the new approximation, xk+1 , differs from xk only in the ik th
component and is computed by xk+1 = xk + αk dk . Note that the direction
shown in (2) might not be a negative directional derivative for f at xk . When
the index ik to define the search direction and the component of xk to be adjusted
is chosen randomly by Uniform distribution (U) on {1, 2, . . . , n}, with or without
replacement, the CDM is known as a stochastic CDM. This type of method has
attracted the attention of the scientific community because of its usefulness
in data analysis and machine learning. Applications are varied, in particular in
support vector machine problems [5].
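To make the update concrete, a single stochastic coordinate descent step can be sketched in MATLAB as follows; the function handles f and gradf, the halving backtracking rule and the projection onto the box bounds are illustrative assumptions of this sketch rather than prescriptions of this section.

```matlab
% Minimal sketch of one stochastic coordinate descent step (notation of Sect. 2).
function x = cd_step(f, gradf, x, l, u)
    n  = numel(x);
    ik = randi(n);                          % coordinate index chosen by U on {1,...,n}
    g  = gradf(x);
    d  = zeros(size(x));
    d(ik) = -g(ik);                         % d_k = -grad_{i_k} f(x_k) e_{i_k}, Eq. (2)
    alpha = 1;  fx = f(x);
    while alpha > 1e-8
        xt = min(max(x + alpha*d, l), u);   % trial point x_k + alpha_k d_k, projected onto the box
        if f(xt) < fx
            x = xt;                         % accept the first improving step length
            return;
        end
        alpha = alpha/2;                    % simple halving backtracking (assumption of this sketch)
    end
end
```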

3 A Population-Based Stochastic Coordinate Descent


Method
At each iteration of a population-based algorithm, a set of points is generated
aiming to explore the feasible region for a global optimum. Let |P | denote the
number of points in the population, and let xi ∈ Rn represent the point with index i of the population, i ∈ P = {1, 2, . . . , |P |}. The larger |P| is, the better the exploration feature of the algorithm is likely to be. However, handling and evaluating the objective f for a large number of points is time consuming.
In order to improve the efficiency of the method, the number of function
evaluations must be reduced. Thus, the method is based on a random selection
of points from the original population, at each iteration k – herein designated
by the subpopulation k. Thus, at each iteration, a subpopulation of points (of
small size) is selected to be evaluated and potentially moved in direction to the
global optimum. This random selection uses the U either with or without replace-
ment to select the indices for the subpopulation from the set {1, 2, . . . , |P |}. Let
P1 , P2 , . . . , Pk , . . . be the sets of indices of the subpopulation randomly chosen
from P .
At each iteration k there is a special point that is maintained for the next
iteration. This point is the best point of the current subpopulation. This way,
for k > 1, the randomly selected set Pk does not include the index of the best
point from the previous subpopulation and the size of the subpopulation k is
|Pk | + 1. We note that the size of the subpopulation at the first iteration is |P1 |.
The subsets of indices when generating the subpopulation satisfy the following

conditions: (i) P1 ⊂ P and Pk+1 ⊂ P \ {kb} for k ≥ 1; (ii) |P1| ≪ |P|; (iii) |P2| + 1 ≤ |P1| and |Pk+1| ≤ |Pk| for k > 1; where kb is the index of the best point of the subpopulation k. Onwards, P1+ = P1 and Pk+1+ = Pk+1 ∪ {kb} for k ≥ 1 are used for simplicity [2].
We now show how each point xkj (j = 1, . . . , |Pk+ |) of the subpopulation k is
moved. For each point, a search direction is generated. Thus, the point xkj may
be moved along the direction, dkj , as follows:

xkj = xkj + αkj dkj (3)

where 0 < αkj ≤ 1 is the step length computed by a backtracking strategy.


The direction dkj used to move the point xkj is defined by

dkj = −∇i f (xkH )ei (4)

where ei represents the ith coordinate vector for some index i, randomly selected
from the set {1, 2, . . . , n}. We note that the search direction is along a component
of the gradient computed at a special point of the subpopulation k, xkH , further
on denoted by the point with the highest score. Since dkj might not be a descent
direction for f at xkj , the movement according to (3) is applied only if dkj is
descent for f at xkj . Otherwise, the point xkj is not moved. Whenever the new
position of the point falls outside the bounds, a projection onto Ω is carried out.
The index of the point with highest score, kH, at iteration k, satisfies

kH = arg max_{j=1,...,|Pk+|} s(xkj),  where  s(xki) = D̂(xki) − f̂(xki)    (5)

is the score of the point xki [6]. The normalized distance D̂(xki), from xki to the center point of the subpopulation k, and the normalized objective function value f̂(xki) at xki are defined by

D̂(xki) = [D(xki) − min_{j=1,...,|Pk+|} D(xkj)] / [max_{j=1,...,|Pk+|} D(xkj) − min_{j=1,...,|Pk+|} D(xkj)]    (6)

and

f̂(xki) = [f(xki) − min_{j=1,...,|Pk+|} f(xkj)] / [max_{j=1,...,|Pk+|} f(xkj) − min_{j=1,...,|Pk+|} f(xkj)]    (7)

respectively. The distance function D(xki) (to the center point x̄k) is measured by ‖xki − x̄k‖2, and the center point is evaluated as follows:

x̄k = (1/|Pk+|) Σ_{j=1}^{|Pk+|} xkj.    (8)

We note here that the point with the highest score in each subpopulation is
the point that lies far away from the center of the region defined by its points
(translated by x̄) that has the lowest function value. This way, looking for the
largest distance to x̄, the algorithm potentiates its exploration ability, and by choosing the one with the lowest f value, the algorithm reinforces its local exploitation capability. For each point with index kj, j = 1, . . . , |Pk+|, the gradient coordinate index i may be randomly selected by U on the set {1, 2, . . . , n}, one at a time for each kj, with replacement. However, the random choice may also be done using U on {1, 2, . . . , n} but without replacement. In this latter case, when all indices have been chosen, the set {1, 2, . . . , n} is shuffled [5].
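As an illustration of Eqs. (5)–(8), the MATLAB sketch below selects the highest-score point of a subpopulation stored column-wise; the helper name, the row-vector layout of the objective values and the eps safeguards in the normalizations are assumptions made only for this example.

```matlab
% Sketch of the highest-score point selection, Eqs. (5)-(8). X is n-by-m with the
% points of P_k^+ as columns; fvals is a 1-by-m row vector of objective values.
function [xH, iH] = highest_score_point(X, fvals)
    xbar = mean(X, 2);                                    % center point x_bar_k, Eq. (8)
    D    = vecnorm(X - xbar, 2, 1);                       % distances ||x_{k_i} - x_bar_k||_2
    Dhat = (D - min(D)) / max(max(D) - min(D), eps);      % normalized distance, Eq. (6)
    fhat = (fvals - min(fvals)) / max(max(fvals) - min(fvals), eps);  % normalized f, Eq. (7)
    [~, iH] = max(Dhat - fhat);                           % score s = Dhat - fhat, Eq. (5)
    xH = X(:, iH);
end
```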
The stopping condition of our population-based stochastic coordinate descent
algorithm aims to guarantee a solution in the vicinity of f ∗ . Thus, if

|f(xkb) − f∗| ≤ ε|f∗| + ε²,    (9)

where xkb is the best point of the subpopulation k and f∗ is the known global optimum, is satisfied for a given tolerance ε > 0, the algorithm stops. Otherwise,
the algorithm runs until a specified number of function evaluations, nfmax , is
reached. The main steps of the algorithm are shown in Algorithm 1.

Randomly generate the population in Ω
repeat
    Randomly select a subpopulation for iteration k and select xkH
    for each point xkj in the subpopulation do
        Randomly select i ∈ {1, . . . , n} to choose the component of ∇f at xkH
        Compute the search direction dkj according to (4)
        if dkj is descent for f at xkj then
            Move xkj according to (3)
    Select the best point xkb of the subpopulation
until (9) is satisfied or the number of function evaluations exceeds nfmax

Algorithm 1. Population-based stochastic coordinate descent algorithm
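For concreteness, Algorithm 1 can be rendered compactly in MATLAB as sketched below; it reuses the highest_score_point helper from the previous snippet, takes f, gradf, the box bounds and the parameters of Sect. 4 as inputs, and replaces the backtracking step-length rule and the descent test by a simple accept-if-improved trial step, so it is an illustrative sketch rather than the implementation evaluated in this paper.

```matlab
% Illustrative sketch of Algorithm 1 (population-based stochastic coordinate descent).
function [xb, fb, nf] = pbscd(f, gradf, l, u, fstar, epsl, nfmax, Psize, P1size)
    n  = numel(l);
    P  = l + rand(n, Psize) .* (u - l);        % population randomly generated in Omega
    nf = 0;  kb = [];                          % kb: index of the best point kept between iterations
    while true
        if isempty(kb)
            idx = randperm(Psize, P1size);                        % first iteration: |P_1| points
        else
            cand = setdiff(1:Psize, kb);
            idx  = [kb, cand(randperm(numel(cand), P1size - 1))]; % previous best + |P_k| new points
        end
        X  = P(:, idx);
        fX = arrayfun(@(j) f(X(:, j)), 1:numel(idx));  nf = nf + numel(idx);
        xH = highest_score_point(X, fX);       % special point x_{k_H}, Eq. (5)
        g  = gradf(xH);
        for j = 1:numel(idx)
            i = randi(n);                      % coordinate index (with replacement)
            d = zeros(n, 1);  d(i) = -g(i);    % direction d_{k_j}, Eq. (4)
            xt = min(max(X(:, j) + d, l), u);  % unit step + projection; the paper uses backtracking
            ft = f(xt);  nf = nf + 1;
            if ft < fX(j)                      % accept only improving moves (a simplification of the
                X(:, j) = xt;  fX(j) = ft;     % descent test and step-length rule of Sect. 3)
            end
        end
        P(:, idx) = X;
        [fb, jb] = min(fX);  xb = X(:, jb);  kb = idx(jb);
        if abs(fb - fstar) <= epsl*abs(fstar) + epsl^2 || nf >= nfmax
            break;                             % stopping condition (9) or evaluation budget exhausted
        end
    end
end
```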

4 Numerical Experiments

During the preliminary numerical experiments, well-known benchmark problems


are used: BO (Booth, n = 2), BP (Branin, n = 2), CB6 (Camel6, n = 2), DA
(Dekkers & Aarts, n = 2), GP (Goldstein & Price, n = 2), HSK (Hosaki, n = 2),
MT (Matyas, n = 2), MC (McCormick, n = 2), MHB (Modified Himmelblau,
n = 2), NF2 (Neumaier2, n = 4), PWQ (Powell Quadratic, n = 4), RG-2, RG-5, RG-10 (Rastrigin, n = 2, n = 5, n = 10), RB (Rosenbrock, n = 2), WF (Wood, n = 4); see the full description in [7]. The MatlabTM (Matlab is a registered trademark of the MathWorks, Inc.) programming language is used to code the algorithm and the tested problems. The parameter values are set as follows: |P| = 500, |P1| = 0.01|P|, |Pk| = |P1| − 1 for all k > 1, ε = 1E−04 and nfmax = 50000.

In our previous work [2], we have used the gradient computed at x̄. Besides
this variant, we have also tested a variant where the gradient is computed at
the best point of the subpopulation. These variants are now compared with the
new strategy based on the gradient computed at the point with highest score,
summarized in the previous section. All the tested variants are termed as follows:

– best w (best wout): gradient computed at the best point and the coordinate
index i (see (4)) is randomly selected by U with (without) replacement;
– center w (center wout): gradient computed at x̄ and the coordinate index i is
randomly selected by U with (without) replacement;
– hscore w (hscore wout): gradient computed at the point with highest score and
the coordinate index i is randomly selected by U with (without) replacement;
– best full g (center full g / hscore full g): using the full gradient computed at
the best point (x̄/the point with highest score) to define the search direction.

Each variant was run 30 times with each problem. Tables 1 and 2 show the
average of the obtained f solution values over the 30 runs, favg , the minimum f
solution value obtained after the 30 runs, fmin , the average number of function
evaluations, nfavg , and the percentage of successful runs, %s, for the variants
best w, center w, hscore w and best wout, center wout, hscore wout. A successful run is a run that stops with the stopping condition satisfied for the specified ε, see (9). The other statistics also reported in the tables are: (i) the % of problems
with 100% of successful runs (% prob 100%); (ii) the average nf in problems
with 100% of successful runs (nfavg 100%); (iii) average nf in problems with
100% of successful runs simultaneously in the 3 tested variants (for each table)
(nfavg all100%). A result printed in ‘bold’ refers to the best variant shown and
compared in that particular table. From the results, we may conclude that using
with or without replacement to choose the coordinate index i (see (4)) has no
influence on the robustness and efficiency of the variant based on the gradient
computed at the best point. Variants best w and best wout are the less robust
and variants center w, hscore w and hscore wout are the most robust.
When computing the average number of function evaluations for the problems
that have 100% of successful runs in all the 3 tested variants, best w wins, fol-
lowed by hscore w and then by center w (same is true for best wout, hscore wout
and center wout). We remark that these average numbers of evaluations correspond to the simpler and easier to solve problems. For the most difficult problems
and yet larger problems, the variants hscore w (75% against 50% and 69%) and
hscore wout (69% against 50% and 63%) win as far as robustness is concerned.
This justifies their larger nfavg 100% values.
The results reported in Table 3 aim to show that robustness has not been
improved when the full gradient is used. All the values and statistics have the
same meaning as in the previous tables. Similarly, the variant based on gra-
dient computed at the best point reports the lowest nfavg all100% but also
reaches the lowest % prob 100%. The use of the full gradient has deterio-
rated the results mostly on the variant center full g when compared with both
center w and center wout.

Table 1. Results based on the use of one coordinate of the gradient, randomly selected with replacement.

best w center w hscore w


favg fmin nfavg % s favg fmin nfavg % s favg fmin nfavg %s
BO 6.772E-09 4.354E-09 495 100 6.616E-09 1.496E-11 2082 100 6.543E-09 2.061E-09 1555 100
BP 3.979E-01 3.979E-01 96 100 3.979E-01 3.979E-01 681 100 3.979E-01 3.979E-01 239 100
CB6 −1.032E+00 -1.032E+00 935 100 −1.032E+00 −1.032E+00 385 100 −1.032E+00 −1.032E+00 512 100
DA −2.478E+04 −2.478E+04 786 100 −2.478E+04 −2.478E+04 1251 100 −2.478E+04 −2.478E+04 1020 100
GP 3.000E+00 3.000E+00 828 100 3.000E+00 3.000E+00 1262 100 3.000E+00 3.000E+00 1564 100
HSK −2.346E+00 −2.346E+00 81 100 −2.346E+00 −2.346E+00 305 100 −2.346E+00 −2.346E+00 110 100
MT 9.652E-09 9.006E-09 1542 100 8.650E-09 5.902E-09 2255 100 8.556E-09 1.157E-09 2159 100
MC −1.913E+00 −1.913E+00 93 100 −1.913E+00 −1.913E+00 318 100 −1.913E+00 −1.913E+00 172 100
MHB 3.510E-01 2.662E-10 12144 77 5.525E-09 7.487E-10 1721 100 4.254E-09 5.141E-10 1450 100
NF2 3.728E-03 3.690E-05 50009 0 1.023E-02 6.221E-06 50020 0 4.403E-03 6.601E-05 50020 0
PWQ 6.514E-03 1.936E-07 50013 0 5.965E-03 5.386E-06 50019 0 6.655E-03 1.723E-05 50021 0
RG-2 4.643E-01 3.165E-09 18947 63 4.568E-09 1.868E-11 1505 100 4.160E-09 1.600E-10 2074 100
RG-5 3.283E+00 5.984E-09 43593 13 4.026E-09 8.058E-12 5918 100 3.855E-09 3.368E-12 6981 100
RG-10 5.373E+00 1.990E+00 50007 0 3.317E-02 1.994E-10 13911 97 3.157E-09 2.200E-11 20202 100
RB 5.346E-04 3.505E-08 50008 0 6.533E-03 1.224E-06 50027 0 5.449E-03 2.052E-07 50023 0
WF 8.587E-04 1.706E-05 50012 0 1.490E-01 6.966E-06 50018 0 2.425E-01 4.423E-03 50021 0
% prob 100% 50 69 75
nfavg 100% 607 1607 3170
nfavg all100% 607 1067 916
Table 2. Results based on the use of one coordinate of the gradient, randomly selected without replacement.

best wout center wout hscore wout


favg fmin nfavg % s favg fmin nfavg % s favg fmin nfavg % s
BO 7.221E-09 4.119E-09 521 100 6.865E-09 3.723E-10 1723 100 6.306E-09 9.252E-10 1641 100
BP 3.979E-01 3.979E-01 126 100 3.979E-01 3.979E-01 794 100 3.979E-01 3.979E-01 196 100
CB6 −1.032E+00 −1.032E+00 883 100 −1.032E+00 −1.032E+00 358 100 -1.032E+00 −1.032E+00 487 100
DA −2.478E+04 −2.478E+04 753 100 −2.478E+04 −2.478E+04 1272 100 −2.478E+04 −2.478E+04 1032 100
GP 3.000E+00 3.000E+00 904 100 3.000E+00 3.000E+00 1375 100 3.000E+00 3.000E+00 1518 100
HSK −2.346E+00 −2.346E+00 69 100 −2.346E+00 −2.346E+00 295 100 −2.346E+00 −2.346E+00 113 100
MT 9.701E-09 9.103E-09 1500 100 8.286E-09 4.319E-09 2190 100 8.199E-09 2.762E-09 2127 100
MC −1.913E+00 −1.913E+00 103 100 −1.913E+00 −1.913E+00 300 100 −1.913E+00 −1.913E+00 154 100
MHB 3.168E-01 8.286E-11 9682 83 5.692E-09 1.428E-10 1975 100 3.787E-09 2.452E-10 1357 100
NF2 3.016E-03 4.736E-05 50010 0 1.033E-02 7.602E-05 50017 0 5.700E-03 8.673E-05 50024 0
PWQ 6.014E-03 1.635E-05 50014 0 5.501E-03 9.561E-07 50025 0 5.293E-03 1.100E-06 50026 0
RG-2 4.975E-01 3.109E-09 22302 57 3.245E-09 4.320E-11 1846 100 4.137E-09 3.149E-11 2082 100
RG-5 1.957E+00 4.262E-09 41991 17 3.317E-02 3.006E-12 7943 97 3.762E-09 2.160E-11 7759 100
RG-10 6.567E+00 7.386E-09 46919 7 3.317E-02 3.264E-11 15331 97 5.592E-08 1.516E-11 22677 97
RB 2.924E-04 9.952E-09 49438 7 6.361E-03 9.378E-07 50028 0 5.017E-03 1.505E-08 50021 0
WF 9.078E-04 3.213E-06 50008 0 1.287E-01 7.751E-05 50019 0 1.794E-01 1.618E-04 50027 0
% prob 100% 50 63 69
nfavg 100% 607 1213 1679
nfavg all100% 607 1038 908



Table 3. Results based on the use of the full gradient.

best full g center full g hscore full g


favg nfavg % s favg nfavg % s favg nfavg %s
BO 6.331E-09 208 100 5.862E-09 4841 100 5.397E-09 883 100
BP 3.979E-01 185 100 3.979E-01 2142 100 3.979E-01 294 100
CB6 −1.032E+00 116 100 −1.032E+00 465 100 −1.032E+00 558 100
DA −2.477E+04 3398 100 −2.477E+04 33403 77 −2.477E+04 4220 100
GP 3.000E+00 658 100 3.000E+00 1701 100 3.000E+00 1235 100
HSK −2.346E+00 55 100 −2.346E+00 636 100 −2.346E+00 72 100
MT 9.615E-09 756 100 5.753E-09 5153 100 4.442E-09 709 100
MC −1.913E+00 41 100 −1.913E+00 1320 100 −1.913E+00 78 100
MHB 5.123E-01 10005 83 5.116E-09 2743 100 4.665E-09 1526 100
NF2 6.756E-03 50009 0 3.470E-02 50014 0 9.037E-03 50027 0
PWQ 1.082E-02 50011 0 6.890E-02 50020 0 5.892E-03 50020 0
RG-2 1.194E+00 39004 23 5.076E-09 6722 100 5.181E-09 6118 100
RG-5 1.718E+01 50013 0 4.245E+00 50018 0 4.669E+00 50015 0
RG-10 6.179E+01 50016 0 3.333E+01 50023 0 3.254E+01 50019 0
RB 5.162E-03 50006 0 2.809E-05 44037 47 3.643E-04 47933 17
WF 4.247E-01 50006 0 3.000E-01 50019 0 7.109E-01 50024 0
% prob 100% 50 56 63
nfavg 100% 677 2858 1569
nfavg all100% 288 2323 547

Table 4 compares the results obtained with five of the above mentioned prob-
lems with those presented in [2]. The comparison involves the three tested vari-
ants center w, hscore w and hscore wout, which provided the highest percentages
of successful runs, 69%, 75% and 69% respectively. This table reports the values
of favg and nfavg , after 30 runs. We note that the herein stopping condition is
the same as that of [2]. All reported variants have 100% of successful runs when
solving GP, MHB, RG-2 and RG-5. However, only the variant hscore w reaches
100% success when solving RG-10 (see last row in the table).

Table 4. Comparative results.

Results in [2] center w hscore w hscore wout


favg nfavg favg nfavg favg nfavg favg nfavg
GP 3.00E+00 833 3.00E+00 1262 3.00E+00 1564 3.00E+00 1518
MHB 5.10E-09 1229 5.53E-09 1721 4.25E-09 1450 3.79E-09 1357
RG-2 3.40E-09 1502 4.57E-09 1505 4.16E-09 2074 4.14E-09 2082
RG-5 3.52E-09 13576 4.03E-09 5918 3.86E-09 6981 3.76E-09 7759
RG-10 2.65E-01 30104 3.32E-02 13911 3.16E-09 20202 5.59E-08 22677
(% s) (77) (97) (100) (97)

5 Conclusions
In this paper, we present a population-based stochastic coordinate descent
method for bound constrained GO problems. Several variants are compared in
order to find the most robust, especially when difficult and larger problems are considered. The idea of using the point with the highest score to generate the coordinate descent directions used to move all the points of the subpopulation has been shown to be more robust than the other tested ideas and is worth pursuing.
Future work will be directed towards including, in the set of tested problems, instances with varied dimensions to analyze the influence of the dimension n on the performance of the algorithm. Another matter is related to choosing a specified (yet small) number of gradient coordinate indices (rather than just one) by the uniform distribution on the set {1, 2, . . . , n} to move each point of the subpopulation.

Acknowledgments. This work has been supported by FCT – Fundação para


a Ciência e Tecnologia within the Projects Scope: UID/CEC/00319/2019 and
UID/MAT/00013/2013.

References
1. Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine
learning. Technical Report arXiv:1606.04838v3, Computer Sciences Department,
University of Wisconsin-Madison (2018)
2. Rocha, A.M.A.C., Costa, M.F.P., Fernandes, E.M.G.P.: A stochastic coordinate
descent for bound constrained global optimization. AIP Conf. Proc. 2070, 020014
(2019)
3. Kvasov, D.E., Mukhametzhanov, M.S.: Metaheuristic vs. deterministic global opti-
mization algorithms: the univariate case. Appl. Math. Comput. 318, 245–259 (2018)
4. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization
problems. SIAM J. Optim. 22(2), 341–362 (2012)
5. Wright, S.J.: Coordinate descent algorithms. Math. Program. Series B 151(1), 3–34
(2015)
6. Liu, H., Xu, S., Chen, X., Wang, X., Ma, Q.: Constrained global optimization via a
DIRECT-type constraint-handling technique and an adaptive metamodeling strat-
egy. Struct. Multidisc. Optim. 55(1), 155–177 (2017)
7. Ali, M.M., Khompatraporn, C., Zabinsky, Z.B.: A numerical evaluation of several
stochastic algorithms on selected continuous global optimization test problems. J.
Glob. Optim. 31(4), 635–672 (2005)
A Sequential Linear Programming
Algorithm for Continuous
and Mixed-Integer Nonconvex Quadratic
Programming

Mohand Bentobache(B) , Mohamed Telli, and Abdelkader Mokhtari

Laboratory of Pure and Applied Mathematics, University Amar Telidji of Laghouat,


BP 37G, Ghardaïa Road, 03000 Laghouat, Algeria
m.bentobache@lagh-univ.dz, mohamed.telli@yahoo.com,
abedelkadermokhtari@gmail.com

Abstract. In this work, we propose a new approach called “Sequen-


tial Linear Programming (SLP) algorithm” for finding an approximate
global minimum of continuous and mixed-integer nonconvex quadratic
programs (qps). In order to compare our algorithm with the exist-
ing approaches, we have developed an implementation with MATLAB
and we presented some numerical experiments which compare the per-
formance of our algorithm with the branch and cut algorithm imple-
mented in CPLEX12.8 on 28 concave quadratic test problems, 64 non-
convex quadratic test problems and 12 mixed-integer nonconvex qps. The
numerical results show that our algorithm has successfully found similar
global objective values as CPLEX12.8 in almost all the considered test
problems and it is competitive with CPLEX12.8, particularly in solving
large problems (number of variables greater that 50 and less than 1000).

Keywords: Concave quadratic programming · Nonconvex quadratic


programming · Mixed-integer quadratic programming · Linear
programming · Approximate global optimum · Extreme point ·
Numerical experiments

1 Introduction
Nonconvex quadratic programming is a very important branch in optimization.
No polynomial-time algorithm is known for finding the global optimum of nonconvex quadratic programs, so they are considered NP-hard optimization problems. Several approaches have been proposed for finding local optimal solutions (DCA [16], interior-point methods [2], the simplex algorithm for the concave quadratic case [4], etc.) and approximate global solutions (branch and cut [18], branch and bound [10,12,13], DC combined with branch and bound [1], integer linear programming reformulation approaches [21], approximation set and linear programming (LP) approaches [3,5,17], etc.).

In [5], a new and very interesting approach based on the concept of approx-
imation set and LP for finding an approximate global solution of a strictly con-
cave quadratic program with inequality and nonnegativity constraints is pro-
posed. This approach computes a finite number of feasible points in the level
line passing through the initial feasible solution, then it solves a sequence of
linear programs. After that, the current point is improved by using the global
optimality criterion proposed in [9].
In [3,17], the previous approach is adapted and extended to solve concave
quadratic programs written in general form (the matrix of the quadratic form
is negative semi-definite, the problem can contain equality and inequality con-
straints, the bounds of the variables can take finite or infinite values). In order to
improve the current solution, the global optimality criterion proposed in [14] was
used. However, these global optimality criteria [9,14] rely on the concavity assumption; thus, they cannot be used for the general nonconvex case.
In this work, we generalize the algorithms proposed in [3,5,17] for solving
nonconvex quadratic programming problems written in general form. Hence a
new approach called “sequential linear programming algorithm” is proposed.
This algorithm starts with an initial extreme point, which is the solution of
the linear program corresponding to the minimization of the linear part of the
objective function over the feasible set of the quadratic problem, then it moves
from a current extreme point to a new one with a better objective function value
by solving a sequence of LP problems. The algorithm stops when no improvement
is possible. Our algorithm finds a good approximate global extreme point for
continuous as well as mixed-integer quadratic programs; it is easy to implement and has polynomial average complexity.
In order to compare our algorithm with the existing approaches, we devel-
oped an efficient implementation with MATLAB2018a [11]. Then, we presented
some numerical experiments which compare the performance of the developed
nonconvex solver (SLPqp) with the branch and cut algorithm implemented in
CPLEX12.8 [6] on a collection of 104 test problems: 64 nonconvex quadratic
test problems and 20 concave quadratic test problems of the library Globallib
[8], 8 concave test problems randomly generated with the algorithm proposed
in [15] and 12 mixed-integer concave quadratic test problems constructed using
the continuous qps [7,15], by considering 50% of their variables as integers.
This paper is organized as follows. In Sect. 2, we state the problem and recall
some definitions and results of nonconvex quadratic programming. In Sect. 3, we
describe the proposed algorithm and we illustrate it with two numerical examples
(a continuous nonconvex qp [19] and an integer concave qp [20]). In Sect. 4, we
present some numerical experiments which compare our solver with the branch
and cut solver of CPLEX12.8. Finally, we conclude the paper and give some
future works.

2 Presentation of the Problem and Definitions


We consider the nonconvex quadratic programming problem presented in the
following general form:
min f(x) = (1/2) x^T Dx + c^T x,    (1)
A1 x ≤ b1 , (2)
A2 x = b2 , (3)
l ≤ x ≤ u, (4)
xj ∈ Z, j = 1, 2, . . . , n1 , (5)
xj ∈ R, j = n1 + 1, . . . , n, (6)

where D is a square symmetric matrix which can be negative semi-definite or


indefinite; c, x, l, u are vectors in Rn , the components of l and u can take ±∞;
A1 is a real matrix of dimension m1 × n, A2 is a real matrix of dimension m2 × n,
b1 , b2 are vectors in Rm1 and Rm2 respectively.
• The set of n-vectors satisfying constraints (2)–(6) is called the feasible set of
the problem (1)–(6) and it is denoted by S. Any vector x ∈ S is called a feasible
solution of the problem (1)–(6).
• The vector x∗ ∈ S is called a global optimal solution of the problem (1)–(6),
if ∀x ∈ S, f (x∗ ) ≤ f (x).
• Let f be a function defined from Rn to R and z ∈ Rn . The level line of the
function f passing through z is defined by

Ef (z) (f ) = {y ∈ Rn : f (y) = f (z)}.

Let z be an initial feasible point. In [5], the authors proposed an algorithm


for solving a continuous strictly concave quadratic program, which is based on
the construction of a finite set of points y j , j = 1, 2 . . . , r belonging to the
set Ef (z) (f ) ∩ S. The following lemma [3,5,17] allows us to construct points
belonging to the set Ef (z) (f ), which are not necessarily feasible.
Lemma 1. Let h ∈ Rn be such that h^T Dh ≠ 0, and consider the real number γ calculated as follows:

γ = −2h^T(Dz + c) / (h^T Dh).

Then the point yγ = z + γh ∈ E_{f(z)}(f).
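The construction of Lemma 1 is immediate to code; the MATLAB sketch below (with illustrative names only) returns the level-line point yγ for given problem data D, c, a current point z and a direction h.

```matlab
% Sketch of Lemma 1: build a point on the level line E_{f(z)}(f) from a direction h,
% where f(x) = 0.5*x'*D*x + c'*x.
function y = level_line_point(D, c, z, h)
    hDh = h' * D * h;
    assert(hDh ~= 0, 'Lemma 1 requires h''*D*h ~= 0');
    gamma = -2 * (h' * (D*z + c)) / hDh;   % gamma = -2 h'(Dz + c) / (h' D h)
    y = z + gamma * h;                     % by construction, f(y) = f(z), i.e., y lies on E_{f(z)}(f)
end
```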

3 Steps of the SLP Algorithm


Let z 0 be an initial feasible point and ∇f (z 0 ) = Dz 0 + c be the gradient of
f at the point z 0 , such that ∇f (z 0 ) = 0. The scheme of the sequential linear
programming algorithm for finding an approximate global extreme point for the
nonconvex quadratic programming problem (1)–(6) is described in the following
steps:

Algorithm 1. (SLP algorithm)

Step 1. Choose a number r ∈ N∗ and set k = 0;
Step 2. Choose the (n×r)-matrix H = (h^j, j = 1, 2, . . . , r), h^j ∈ Rn, and set J = ∅;
Step 3. Calculate the points y^j: for j = 1, 2, . . . , r,
    if (h^j)^T Dh^j ≠ 0, then γj = −2(h^j)^T(Dz^k + c) / ((h^j)^T Dh^j), y^j = z^k + γj h^j, J = J ∪ {j};
Step 4. Solve the linear programs min_{x∈S} x^T ∇f(y^j), j ∈ J.
    Let u^j, j ∈ J, be the optimal solutions of these LP problems;
Step 5. Calculate the index p ∈ J such that f(u^p) = min_{j∈J} f(u^j);
Step 6. If f(u^p) < f(z^k), then set k = k + 1, z^k = u^p and go to Step 2.
    Else z^k is an approximate global minimizer for the problem (1)–(6).
In order to solve qp (1)–(6), we propose the following two-phase approach:
Algorithm 2. (SLPqp)
Phase I:
Step 1. Solve the linear program min_{x∈S} c^T x. Let z^0 be the obtained optimal
solution; (in [1], it is shown that a good starting point can accelerate the conver-
gence to a global solution.)
Step 2. Apply the SLP algorithm (Algorithm 1) with r = n and H = I_n, starting from the initial feasible extreme point z^0 (I_n is the identity matrix of order n). Let z^1 be the obtained approximate global minimizer;
Phase II:
Step 3. Apply the SLP algorithm with r = 50n if n < 500, and r = n if n ≥ 500; the elements of H are randomly generated with the uniform distribution in the interval [−1, 1], starting from the point z^1 found in the first phase. The obtained point z^∗ is an approximate global extreme point for qp (1)–(6).
Remark 1. Since the number of extreme points of the set S is finite and the new point z^{k+1} satisfies f(z^{k+1}) < f(z^k), SLPqp finds an approximate global extreme point in a finite number of iterations.
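For illustration, one pass over Steps 2–6 of Algorithm 1 for the continuous case can be sketched in MATLAB as follows; linprog is used here in place of the CPLEX routines purely for the example, and the function name slp_pass and the keep-the-best-improvement logic are assumptions of this sketch.

```matlab
% Illustrative sketch of one pass of the SLP algorithm (Steps 2-6 of Algorithm 1).
% H is an n-by-r matrix of directions; A, b, Aeq, beq, l, u describe the feasible set S.
function [z, improved] = slp_pass(D, c, z, H, A, b, Aeq, beq, l, u)
    fobj = @(x) 0.5*(x'*D*x) + c'*x;
    opts = optimoptions('linprog', 'Display', 'off');
    best = z;  fbest = fobj(z);
    for j = 1:size(H, 2)
        h = H(:, j);  hDh = h'*D*h;
        if hDh == 0, continue; end                          % Step 3: requires h'*D*h ~= 0
        y  = z + (-2*(h'*(D*z + c))/hDh) * h;               % y^j on the level line E_{f(z)}(f)
        uj = linprog(D*y + c, A, b, Aeq, beq, l, u, opts);  % Step 4: min_{x in S} x'*grad f(y^j)
        if ~isempty(uj) && fobj(uj) < fbest                 % Steps 5-6: keep the best improving point
            best = uj;  fbest = fobj(uj);
        end
    end
    improved = fbest < fobj(z);   % the caller repeats the pass while improved is true
    z = best;
end
```

SLPqp would then call such a pass repeatedly, with H = I_n in Phase I and a randomly generated H in Phase II, until no further improvement is obtained.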
Let us solve the following nonconvex quadratic programs by SLPqp.
Example 1. Consider the continuous nonconvex quadratic program [19]:
min f (x) = x1 − 10x2 + 10x3 + x8
−x1^2 − x2^2 − x3^2 − x4^2 − 7x5^2 − 4x6^2 − x7^2 − 2x8^2
+2x1 x2 + 6x1 x5 + 6x2 x5 + 2x3 x4 ,
s.t. x1 + 2x2 + x3 + x4 + x5 + x6 + x7 + x8 ≤ 8,
2x1 + x2 + x3 ≤ 9,
x3 + x4 + x5 ≤ 5,
0.5x5 + 0.5x6 + x7 + 2x8 ≤ 3,
2x2 − x3 − 0.5x4 ≤ 5,
x1 ≤ 6, xj ≥ 0, j = 1, 2, . . . , 8.

The global minimizer is x∗ = (0, 0, 0, 0, 5, 1, 0, 0)T with f (x∗ ) = −179 [19]. The
current point and its corresponding objective value at each iteration of SLPqp
are shown in the left side of Table 1.

Example 2. Consider the integer concave quadratic program [20]:

min f(x) = −5x1^2 + 8x1 − 3x2^2 + 7x2,


s.t. −9x1 + 5x2 ≤ 9,
x1 − 6x2 ≤ 6,
3x1 + x2 ≤ 9,
1 ≤ xj ≤ 7, xj integer, j = 1, 2.

The optimal solution is x∗ = (2, 3)T with f (x∗ ) = −10 [20]. The current point
and its corresponding objective value at each iteration of SLPqp are shown in
the right side of Table 1.

Table 1. Results of Examples 1 and 2.

Phase Iteration(k) z k f (z k ) Phase Iteration(k) z k f (z k )


I 0 (0, 3, 0, 2, 0, 0, 0, 0)T −43 I 0 (1, 1)T 7
1 (0, 1.5, 0, 0, 5, 0, 0, 0)T −147.25 1 (2, 3)T −10
2 (0, 0, 0, 0, 5, 0, 0, 0)T −175 2 (2, 3)T −10
II 3 (0, 0, 0, 0, 5, 1, 0, 0)T −179 II 3 (2, 3)T −10
4 (0, 0, 0, 0, 5, 1, 0, 0)T −179

4 Numerical Experiments
In order to compare our algorithm (SLPqp) with the branch and cut solver of CPLEX12.8 (CPLEX), namely the “globalqpex1” function with the parameter “optimalitytarget” set to 3, i.e., global, we developed an implementation with MATLAB2018a. In this implementation, we used the barrier interior-point algorithm of CPLEX12.8 (the “cplexlp” function with the parameter “lpmethod” set to 4) to solve the intermediate continuous linear programs, and we used the “cplexmilp” function for solving the intermediate mixed-integer LPs. In the comparison, the different solvers are executed on a PC with a Core i7-4790 CPU running at 3.60 GHz, 8 GB of RAM and the Windows 10 operating system. We have considered 104 nonconvex quadratic test problems (these qps can be downloaded from “https://www.sciencedz.net/perso.php?id=mbentobache&p=253”):
(A) Twelve mixed-integer concave quadratic test problems obtained by con-
sidering 50% of the variables of the following problems as integers: nine qps
taken from [7] (the problems miqp7, miqp8, miqp9 are obtained by setting
in Problem 7, page 12 [7] (λi , αi ) = (1, 2), (λi , αi ) = (1, −5), (λi , αi ) = (1, 8)

respectively). The three last problems: miqp10, miqp11, miqp12 are obtained
from the first three generated continuous qps: Rosen-qp1, Rosen-qp2 and
Rosen-qp3 shown in Table 6. The results of the two solvers for these miqps
are shown in Table 2.
(B) Sixty-four nonconvex quadratic test problems of the library Globallib [8].
Results are shown in Tables 3 and 4.
(C) Twenty concave quadratic test problems of the library Globallib [8]. Results
are shown in Table 5.
(D) Eight concave quadratic test problems randomly generated with the algo-
rithm proposed in [15]. These qps are written in the form:

min 0.5xT Dx + cT x, s.t. Ax ≤ b, x ≥ 0,

with A an (n + 1) × n−matrix, c ∈ Rn , b ∈ Rn+1 , D ∈ Rn×n a symmetric


negative semi-definite matrix. See Table 6.
In the different tables, f∗, It1, CPU1, It, CPU and Error designate, respectively,
the approximate global value, the number of phase 1 iterations of SLPqp, the
CPU time of the phase 1 of SLPqp in seconds, the total number of iterations of
SLPqp, the total CPU time and the absolute value of the difference between the
approximate global minimum and the known global minimum.
Note that the obtained results are quite encouraging:
• Our algorithm has successfully found the same global objective values as
CPLEX in 77 test problems, better objective values than CPLEX for 23 problems and worse ones for 4 nonconvex qps (ex2-1-9, st-e23, st-glmp-ss1, st-jcbpafex, see Table 3). Probably the global solutions of these 4 problems are not extreme points.

Table 2. Mixed-integer concave quadratic test problems

QP Problem m n n1 SLPqp CPLEX


f∗ It1 CPU1 It CPU f∗ CPU
miqp1 1 page 5 1 5 3 −17,000 4 0,20 6 3,47 −17,000 0,14
miqp2 2 page 6 2 6 3 −361,500 1 0,09 2 2,08 −361,500 0,04
miqp3 3 page 7 9 13 6 −195,000 5 0,19 7 9,48 −195,000 0,03
miqp4 4 page 8 5 6 3 −14,800 1 0,07 2 3,46 −14,800 0,05
miqp5 5 page 10 11 10 5 −217,959 1 0,13 2 5,80 −217,959 0,05
miqp6 6 page 11 5 10 5 −39,000 2 0,24 4 9,65 −38,999 0,04
miqp7 7 page 12 10 20 10 −335,841 3 0,76 6 40,25 −335,841 0,58
miqp8 7 page 12 10 20 10 −615,841 2 0,47 5 37,09 −615,841 6,40
miqp9 7 page 12 10 20 10 −99,647 2 0,51 5 35,40 −99,647 0,10
miqp10 11 10 5 693,453 2 0,27 4 10,98 – >10800
miqp11 21 20 10 1447,480 2 0,65 4 29,04 – >10800
miqp12 31 30 15 3949,075 2 1,96 4 88,69 – >10800

Table 3. Nonconvex quadratic qps of Globallib [8]

No QP m n SLPqp CPLEX
f∗ It1 CPU1 It CPU f∗ CPU
1 ex2-1-1 1 5 −17,000 4 0,07 6 0,31 −17,000 0,46
2 ex2-1-10 10 20 −498345,482 4 0,10 6 1,02 −498345,482 0,11
3 ex2-1-2 2 6 −213,000 1 0,06 2 0,22 −213,000 0,03
4 ex2-1-3 9 13 −15,000 5 0,07 7 0,58 −15,000 0,10
5 ex2-1-4 5 6 −11,000 1 0,06 2 0,24 −11,000 0,06
6 ex2-1-6 5 10 −39,000 2 0,07 5 0,67 −39,000 0,07
7 ex2-1-9 1 10 0,000 1 0,06 2 0,28 −0,375 0,28
8 nemhaus 5 5 31,000 1 0,06 2 0,20 31,000 0,03
9 qp1 2 50 0,063 2 0,10 4 2,20 – > 14400
10 qp2 2 50 0,104 2 0,10 4 2,17 – > 14400
11 qp3 52 100 0,006 1 0,06 3 7,44 – > 14400
12 st-bpaf1a 10 10 −45,380 1 0,06 3 0,49 −45,380 0,07
13 st-bpaf1b 10 10 −42,963 1 0,06 3 0,50 −42,963 0,23
14 st-bpk1 6 4 −13,000 1 0,06 3 0,25 −13,000 0,07
15 st-bpk2 6 4 −13,000 1 0,06 3 0,25 −13,000 0,10
16 st-bpv2 5 4 −8,000 1 0,06 3 0,26 −7,999 0,12
17 st-bsj2 5 3 1,000 1 0,06 2 0,17 1,000 0,55
18 st-bsj3 1 6 −86768,550 5 0,07 7 0,32 −86768,550 0,03
19 st-bsj4 4 6 −70262,050 4 0,07 6 0,33 −70262,050 0,06
20 st-e22 5 2 −85,000 2 0,06 4 0,19 −85,000 0,04
21 st-e23 2 2 −0,750 1 0,06 2 0,15 −1,083 0,05
22 st-e24 4 2 8,000 1 0,06 3 0,19 8,000 0,04
23 st-e25 8 4 0,870 1 0,06 2 0,19 0,870 0,05
24 st-e26 4 2 −185,779 2 0,06 4 0,19 −185,779 0,04
25 st-fp1 1 5 −17,000 4 0,07 6 0,29 −17,000 0,05
26 st-fp2 2 6 −213,000 1 0,06 2 0,22 −213,000 0,03
27 st-fp3 10 13 −15,000 5 0,07 7 0,59 −15,000 0,06
28 st-fp4 5 6 −11,000 1 0,06 2 0,23 −11,000 0,04
29 st-fp5 11 10 −268,015 1 0,06 2 0,30 −268,015 0,06
30 st-fp6 5 10 −39,000 2 0,07 5 0,67 −39,000 0,06
31 st-glmp-fp1 8 4 10,000 1 0,06 3 0,27 10,000 0,05
32 st-glmp-fp2 9 4 7,345 1 0,06 3 0,28 7,345 0,14
33 st-glmp-fp3 8 4 −12,000 1 0,06 3 0,27 −12,000 0,04
34 st-glmp-kk90 7 5 3,000 1 0,06 3 0,31 3,000 0,05
35 st-glmp-kk92 8 4 −12,000 1 0,06 3 0,27 −12,000 0,03
36 st-glmp-kky 13 7 −2,500 1 0,06 2 0,25 −2,500 0,06
37 st-glmp-ss1 11 5 −24,000 1 0,06 3 0,31 −24,571 0,07
38 st-glmp-ss2 8 5 3,000 1 0,06 3 0,30 3,000 0,06
39 st-ht 3 2 −1,600 3 0,06 5 0,19 −1,600 0,11
40 st-iqpbk1 7 8 −621,488 3 0,07 5 0,40 −621,488 0,07
41 st-iqpbk2 7 8 −1195,226 3 0,07 5 0,40 −1195,226 0,09
42 st-jcbpaf2 13 10 −794,856 1 0,06 4 0,68 −794,856 0,07
43 st-jcbpafex 2 2 −0,750 1 0,06 2 0,15 −1,083 0,05
44 st-kr 5 2 −85,000 2 0,06 4 0,19 −85,000 0,07
45 st-pan1 4 3 −5,284 3 0,06 5 0,23 −5,284 0,05
46 st-pan2 1 5 −17,000 4 0,07 6 0,29 −17,000 0,05
47 st-ph1 5 6 −230,117 3 0,07 5 0,34 −230,117 0,05
48 st-ph10 4 2 −10,500 1 0,06 2 0,15 −10,500 0,03

Table 4. Nonconvex quadratic qps of Globallib [8]

NO QP m n SLPqp CPLEX
f∗ It1 CPU1 It CPU f ∗ CPU
49 st-ph11 4 3 −11,281 4 0,06 6 0,22 −11,281 0,04
50 st-ph12 4 3 −22,625 4 0,06 6 0,22 −22,625 0,04
51 st-ph13 10 3 −11,281 4 0,06 6 0,23 −11,281 0,04
52 st-ph14 10 3 −229,722 2 0,06 4 0,23 −229,125 0,04
53 st-ph15 4 4 −392,704 3 0,06 5 0,27 −392,704 0,06
54 st-ph2 5 6 −1028,117 3 0,07 5 0,34 −1028,117 0,04
55 st-ph20 9 3 −158,000 3 0,06 5 0,24 −158,000 0,04
56 st-ph3 5 6 −420,235 3 0,06 5 0,33 −420,235 0,05
57 st-phex 5 2 −85,000 2 0,06 4 0,19 −85,000 0,05
58 st-qpc-m0 2 2 −5,000 3 0,06 5 0,19 −5,000 0,04
59 st-qpc-m3a 10 10 −382,695 2 0,07 4 0,49 −382,695 0,03
60 st-qpk1 4 2 −3,000 1 0,06 3 0,19 −3,000 0,05
61 st-qpk2 12 6 −12,250 2 0,06 4 0,35 −12,250 0,07
62 st-qpk3 22 11 −36,000 3 0,07 5 0,59 −36,000 0,08
63 st-z 5 3 0,000 1 0,06 2 0,17 0,000 0,05
64 stat 5 3 0,000 1 0,06 2 0,17 0,000 0,05

• In terms of CPU time, CPLEX is slightly faster than SLPqp on almost all the
Globallib test problems, except for the nonconvex problems qp1, qp2 and qp3
(see Table 3), for which CPLEX failed to obtain a solution within 4 h, while
SLPqp found an approximate global extreme point in less than 8 s.
• SLPqp outperforms CPLEX in solving all the generated test qps shown in
Table 6: our algorithm has found the known global minimum of all the gener-
ated problems with good accuracy (2.91 × 10−11 ≤ Error ≤ 1.86 × 10−5 ).
Moreover, SLPqp solved the problem Rosen-qp3 of dimension 31 × 30 in 1.67 s,
while CPLEX found the solution in 542.39 s (9 min); SLPqp solved Rosen-qp4
of dimension 41 × 40 in 2.69 s, while CPLEX found the solution in 63120.91 s
(17.53 h); SLPqp solved Rosen-qp5 of dimension 51 × 50 in 4.26 s, while CPLEX
failed to find the solution after 238482.58 s (66.25 h). Finally, for problems of
dimension 201 × 200, 401 × 400 and 1001 × 1000, SLPqp found the global opti-
mal values in less than 5142.77 s (1.43 h), while we interrupted the execution of
CPLEX after 4 h.
• Since the global optimal solution of a concave quadratic program is an extreme
point, SLPqp gives the global optimum with good accuracy for this type of
problem.

Table 5. Concave quadratic qps of Globallib [8]

NO QP m n SLPqp CPLEX
Error It1 CPU1 It CPU Error CPU
1 ex2-1-5 11 10 4,74E-10 1 0,07 2 0,32 4,74E-10 0,23
2 ex2-1-7 10 20 1,74E-09 4 0,09 6 0,93 1,74E-09 0,17
3 ex2-1-8 0 24 0 2 0,08 4 1,02 0 0,08
4 st-fp7a 10 20 6,12E-04 2 0,08 5 1,31 6,12E-04 0,13
5 st-fp7b 10 20 6,12E-04 5 0,10 8 1,34 6,12E-04 0,13
6 st-fp7c 10 20 2,68E+03 4 0,09 7 1,32 3,18E-04 0,12
7 st-fp7d 10 20 6,12E-04 3 0,08 6 1,31 6,12E-04 0,18
8 st-fp7e 10 20 1,34E-04 4 0,09 7 1,31 1,34E-04 0,16
9 st-m1 11 20 0 1 0,07 2 0,55 8,89E-02 0,10
10 st-m2 21 30 0 1 0,07 2 0,89 9,67E-01 0,14
11 st-qpc-m1 5 5 0 2 0,06 4 0,30 4,50E-09 0,04
12 st-qpc-m3b 10 10 0 1 0,06 2 0,31 0 0,04
13 st-qpc-m3c 10 10 0 1 0,06 2 0,31 0 0,03
14 st-qpc-m4 10 10 0 1 0,06 2 0,30 0 0,04
15 st-rv1 5 10 0 2 0,07 4 0,51 1,42E-14 0,07
16 st-rv2 10 20 0 1 0,07 2 0,55 1,42E-14 0,08
17 st-rv3 20 20 0 2 0,08 4 1,02 0 0,21
18 st-rv7 20 30 0 2 0,09 4 1,50 7,87E-10 0,26
19 st-rv8 20 40 0 3 0,12 5 2,09 0 0,17
20 st-rv9 20 50 0 2 0,11 5 3,99 0 0,68

Table 6. Randomly generated concave qps [15]

QP n SLPqp CPLEX
Error It1 CPU1 It CPU Error CPU
Rosen-qp1 10 5,38E-09 2 0,07 4 0,50 3,20E-04 0,78
Rosen-qp2 20 2,91E-11 2 0,08 4 0,99 7,06E-04 5,01
Rosen-qp3 30 5,82E-11 2 0,09 4 1,66 2,61E-03 542,39
Rosen-qp4 40 7,33E-09 2 0,11 4 2,69 1,30E-05 63120,91
Rosen-qp5 50 1,16E-08 2 0,15 4 4,26 Failure 238482.58
Rosen-qp6 200 9,69E-08 1 1,97 3 274,77 – >14400
Rosen-qp7 400 1,86E-05 1 23,17 3 4782,68 – >14400
Rosen-qp8 1000 4,77E-06 1 677,29 3 5142,77 – >14400

• SLPqp has successfully found the same global optimum as CPLEX for problems
miqp1, . . ., miqp9 (see Table 2). For test problems miqp10, miqp11 and miqp12,
our algorithm has found mixed-integer approximate global optimal solutions in
less than 89 s, while we interrupted the execution of CPLEX after 3 h.

5 Conclusion

In this work, we have proposed a sequential linear programming algorithm to
find an approximate global extreme point for nonconvex quadratic programming
problems. This approach is easy to implement, efficient and gives accurate global
optimal solutions for almost all the considered test problems. In future work, we
will test the performance of our approach on other collections of test problems.
Furthermore, we will combine it with the simplex algorithm [4], DCA [16] or
with branch and bound approaches in order to find better global approximate
solutions in less computational time.

References
1. An, L.T.H., Tao, P.D.: A branch and bound method via dc optimization algorithms
and ellipsoidal technique for box constrained nonconvex quadratic problems. J.
Global Optim. 13(2), 171–206 (1998)
2. Absil, P.-A., Tits, A.L.: Newton-KKT interior-point methods for indefinite
quadratic programming. Comput. Optim. Appl. 36(1), 5–41 (2007)
3. Bentobache, M., Telli, M., Mokhtari, A.: A global minimization algorithm for con-
cave quadratic programming. In: Proceedings of the 29th European Conference on
Operational Research, EURO 2018, p. 329, University of Valencia, 08–11 July 2018
4. Bentobache, M., Telli, M., Mokhtari, A.: A simplex algorithm with the small-
est index rule for concave quadratic programming. In: Proceedings of the Eighth
International Conference on Advanced Communications and Computation, INFO-
COMP 2018, pp. 88–93, Barcelona, Spain, 22–26 July 2018
5. Chinchuluun, A., Pardalos, P.M., Enkhbat, R.: Global minimization algorithms for
concave quadratic programming problems. Optimization 54(6), 627–639 (2005)
6. CPLEX12.8, IBM Ilog. Inc., NY (2017)
7. Floudas, C.A., Pardalos, P.M., Adjiman, C., Esposito, W.R., Gumus, Z.H., Hard-
ing, S.T., Klepeis, J.L., Meyer, C.A., Schweiger, C.A.: Handbook of Test Problems
in Local and Global Optimization. Nonconvex Optimization and its Applications.
Springer, Boston (1999)
8. Globallib: Gamsworld global optimization library. http://www.gamsworld.org/
global/globallib.htm. Accessed 15 Jan 2019
9. Hiriart-Urruty, J.B., Ledyaev, Y.S.: A note on the characterization of the global
maxima of a (tangentially) convex function over a convex set. J. Convex Anal. 3,
55–62 (1996)
10. Horst, R.: An algorithm for nonconvex programming problems. Math. Program.
10, 312–321 (1976)
11. Matlab2018a. Mathworks, Inc., NY (2018)

12. Pardalos, P.M., Rodgers, G.: Computational aspects of a branch and bound algo-
rithm for quadratic zero-one programming. Computing 45(2), 131–144 (1990)
13. Rusakov, A.I.: Concave programming under simplest linear constraints. Comput.
Math. Math. Phys. 43(7), 908–917 (2003)
14. Strekalovsky, A.S.: Global optimality conditions for nonconvex optimization. J.
Global Optim. 12(4), 415–434 (1998)
15. Sung, Y.Y., Rosen, J.B.: Global minimum test problem construction. Math. Pro-
gram. 24(1), 353–355 (1982)
16. Tao, P.D., An, L.T.H.: Convex analysis approach to DC programming: theory,
algorithms and applications. Acta Math. Vietnam. 22, 289–355 (1997)
17. Telli, M., Bentobache, M., Mokhtari, A.: A successive linear approximations
approach for the global minimization of a concave quadratic program. Submitted
to Computational and Applied Mathematics. Springer (2019)
18. Tuy, H.: Concave programming under linear constraints. Doklady Akademii Nauk
SSSR 159, 32–35 (1964)
19. Tuy, H.: DC optimization problems. In: Convex Analysis and Global Optimization.
Springer Optimization and Its Applications, vol. 110, pp. 167–228, 2nd edn.
Springer, Cham (2016)
20. Wang, F.: A new exact algorithm for concave knapsack problems with integer
variables. Int. J. Comput. Math. 96(1), 126–134 (2019)
21. Xia, W., Vera, J., Zuluaga, L. F.: Globally solving non-convex quadratic programs
via linear integer programming techniques. arXiv preprint, arXiv:1511.02423v3
(2018)
A Survey of Surrogate Approaches
for Expensive Constrained Black-Box
Optimization

Rommel G. Regis(B)

Department of Mathematics, Saint Joseph’s University,


Philadelphia, PA 19131, USA
rregis@sju.edu

Abstract. Numerous practical optimization problems involve black-box


functions whose values come from computationally expensive simula-
tions. For these problems, one can use surrogates that approximate the
expensive objective and constraint functions. This paper presents a sur-
vey of surrogate-based or surrogate-assisted methods for computation-
ally expensive constrained global optimization problems. The methods
can be classified by type of surrogate used (e.g., kriging or radial basis
function) or by the type of infill strategy. This survey also mentions
algorithms that can be parallelized and that can handle infeasible initial
points and high-dimensional problems.

Keywords: Global optimization · Black-box optimization ·


Constraints · Surrogates · Kriging · Radial basis functions

1 Introduction

Many engineering optimization problems involve black-box objective or con-


straint functions whose values are obtained from computationally expensive
finite element (FE) or computational fluid dynamics (CFD) simulations. More-
over, for some of these problems, the calculation of the objective or constraint
functions might fail at certain inputs, indicating the presence of hidden con-
straints. For such optimization problems, a natural approach involves using
surrogate models that approximate the expensive objective or constraint func-
tions. Commonly used surrogates include kriging or Gaussian process models
and Radial Basis Function (RBF) models. In the literature, various strategies
for selecting sample points where the objective and constraint functions are
evaluated (also known as infill strategies) have been proposed, including those
for problems with expensive black-box constraints and cheap explicitly defined
constraints.


This paper provides a survey of approaches for computationally expensive


constrained optimization problems of the following general form:

min f (x)
s.t. x ∈ Rd , ℓ ≤ x ≤ u
gi (x) ≤ 0, i = 1, . . . , m (1)
hj (x) = 0, j = 1, . . . , p
x ∈ X ⊆ Rd

where ℓ, u ∈ Rd , m ≥ 0, p ≥ 0, at least one of the objective or constraint


functions f, g1 , . . . , gm , h1 , . . . , hp is black-box and computationally expensive,
and X ⊆ Rd is meant to capture the region where hidden constraints are not
violated. Here, we allow for the possibility that m = 0 or p = 0 (no inequal-
ity or no equality constraints or both). Also, we allow X = Rd (no hidden
constraints). In general, hidden constraints cannot be relaxed (i.e., hard con-
straints) while the above inequality and equality constraints can be relaxed (i.e.,
soft constraints). Note that in a practical problem, there might be a mixture
of expensive black-box constraints and cheap explicitly defined constraints. It is
also possible that the objective may be cheap to evaluate and only some (or all)
of the constraints are black-box and expensive. Such problems are referred to as
grey-box models [7]. Most of the algorithms discussed here are meant for prob-
lems with expensive objective and inequality constraint functions (no equality
or hidden constraints).
Define the vector-valued functions G(x) := (g1 (x), . . . , gm (x)) and H(x) :=
(h1 (x), . . . , hp (x)), and let [ℓ, u] := {x ∈ Rd : ℓ ≤ x ≤ u}, and let D :=
{x ∈ [ℓ, u] ∩ X : G(x) ≤ 0, H(x) = 0} be the feasible region of (1).
Here, [ℓ, u] ⊆ Rd is the search space of problem (1), and one simulation for
a given input x ∈ [ℓ, u] ∩ X yields the values of f (x), G(x) and H(x). In the
computationally expensive setting, one wishes to find the global minimum of
(1) (or at least a feasible solution with good objective function value) given a
relatively limited number of simulations. If D ≠ ∅, f , g1 , . . . , gm , h1 , . . . , hp are
all continuous functions on [ℓ, u] and X = Rd (or X contains the region defined by
all the constraints), then D is a compact set and f is guaranteed to have a global
minimizer in D. Now some of the black-box objective and constraint functions
may not be continuous in practice, but it is helpful to consider situations when
a global minimizer is guaranteed to exist.
This paper is organized as follows. Section 2 provides the general structure of
surrogate methods for constrained optimization. Section 3 provides two widely
used surrogates, Radial Basis Functions and Kriging. Section 4 discusses some
of the infill strategies for constrained optimization. An infill strategy is a way to
select sample points for function evaluations. Finally, Sect. 5 provides a summary
and some future directions for surrogate-based constrained optimization.

2 General Structure of Surrogate Methods


for Constrained Optimization
Surrogate-based methods for expensive black-box optimization generally follow
the same structure. There is an initialization phase where the objective and
constraint functions are evaluated at initial points, typically from a space-filling
design such as a Latin hypercube design. After the initial sample points are
obtained, the initial surrogates are built. Here, a sample point refers to a point
in the search space (i.e., x ∈ [ℓ, u]) where the objective and constraint function
values (f (x), G(x) and H(x)) are known. Depending on the type of method, the
surrogates may be global surrogates in the sense that all sample points are used
or local surrogates in that only a subset of sample points are used.
After initialization, the method enters a sampling phase where the surro-
gates and possibly additional information from previous sample points are used
to select one or more new points where the simulations will take place. The
sampling phase is typically where various algorithms differ. Some methods solve
optimization subproblems to determine new points while others select points
from a randomly generated set of points, where the probability distribution is
chosen in a way that makes it more likely to generate feasible points with good
objective function values.
After the sampling phase, simulations are run on the chosen point(s), yielding
new data points. The surrogates are then updated with the new information
obtained and the process iterates until some termination condition is satisfied.
In practice, the method usually terminates when the computational budget, in
terms of maximum number of simulations allowed, is reached.
In the case of constrained black-box optimization, one major consideration
is finding feasible sample points to begin with. For problems where the feasible
region is relatively small in relation to the search space, it is not easy to obtain
a feasible initial point by uniform random sampling. In this situation, part of
the iterations could be devoted first to finding a good feasible point and then
the remaining iterations are used to improve on this feasible point.
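The loop just described is method-agnostic. The following minimal Python skeleton sketches this general structure under the assumption that the problem-specific pieces (an `evaluate` function running one expensive simulation, a `fit_surrogates` builder, and a `propose_infill` strategy) are supplied by the caller; these names are illustrative placeholders and do not come from any particular method in this survey.

```python
import numpy as np
from scipy.stats import qmc

def surrogate_optimize(evaluate, fit_surrogates, propose_infill,
                       lb, ub, n_init=10, budget=50, seed=0):
    """evaluate(x) -> (f, g): one expensive simulation; g is the vector of
    inequality constraint values (g <= 0 means feasible)."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    sampler = qmc.LatinHypercube(d=len(lb), seed=seed)
    X = qmc.scale(sampler.random(n_init), lb, ub)        # space-filling initial design
    evals = [evaluate(x) for x in X]
    F = np.array([e[0] for e in evals])
    G = np.array([e[1] for e in evals])
    for _ in range(budget - n_init):
        models = fit_surrogates(X, F, G)                  # e.g., RBF or kriging models
        x_new = propose_infill(models, X, F, G, lb, ub)   # infill strategy (Sect. 4)
        f_new, g_new = evaluate(x_new)                    # run the expensive simulation
        X = np.vstack([X, x_new])
        F = np.append(F, f_new)
        G = np.vstack([G, g_new])
    feas = np.all(G <= 0, axis=1)                         # best feasible point found so far
    return (X[feas][np.argmin(F[feas])], F[feas].min()) if feas.any() else (None, None)
```

A two-phase method such as COBRA would additionally switch the infill strategy depending on whether a feasible point has already been found.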

3 Surrogates for Constrained Black-Box Optimization


3.1 Radial Basis Function Model
The Radial Basis Function (RBF) interpolation model described in Powell [17]
has been successfully used in various surrogate-based methods for constrained
optimization, including some that can handle high-dimensional problems with
hundreds of decision variables and many black-box inequality constraints [18–
20]. Below we describe the procedure for building this RBF model.
Let u(x) be the objective function or one of the constraint functions gi or
hj for some i or j. Given n distinct sample points (x1 , u(x1 )), . . . , (xn , u(xn )) ∈
Rd × R, we use an interpolant of the form

s_n(x) = \sum_{i=1}^{n} \lambda_i \, \phi(\|x - x_i\|) + p(x), \quad x \in R^d.   (2)

Here, ‖·‖ is the Euclidean norm, λi ∈ R for i = 1, . . . , n and p(x) is a polynomial


in d variables. In some surrogate-based methods φ has the cubic form (φ(r) = r3 )
and p(x) is a linear polynomial. Other possible choices for φ include the thin plate
spline, multiquadric and Gaussian forms (see [17]).
To build the above RBF model in the case where the tail p(x) is a linear
polynomial, define the matrix Φ ∈ Rn×n by: Φij := φ(‖xi − xj‖), i, j = 1, . . . , n.
Also, define the matrix P ∈ Rn×(d+1) whose ith row is [1, xTi ]. Now, the RBF
model that interpolates the sample points (x1 , u(x1 )), . . . , (xn , u(xn )) is obtained
by solving the linear system
    
\begin{pmatrix} \Phi & P \\ P^T & 0_{(d+1)\times(d+1)} \end{pmatrix} \begin{pmatrix} \lambda \\ c \end{pmatrix} = \begin{pmatrix} U \\ 0_{d+1} \end{pmatrix},   (3)

where 0(d+1)×(d+1) ∈ R(d+1)×(d+1) is a matrix of zeros, U = (u(x1 ), . . . , u(xn ))T ,


0d+1 ∈ Rd+1 is a vector of zeros, λ = (λ1 , . . . , λn )T ∈ Rn and c = (c0 , c1 , . . . , cd )T
∈ Rd+1 consists of the coefficients for the linear function p(x). The coefficient
matrix in (3) is nonsingular if and only if rank(P ) = d + 1 (Powell [17]). This
condition is equivalent to having a subset of d + 1 affinely independent points
among the points {x1 , . . . , xn }.
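As a compact illustration (a sketch, not the code of [17] or of any specific method above), the cubic RBF model with a linear tail can be fitted by assembling and solving the linear system (3) directly:

```python
import numpy as np

def fit_cubic_rbf(X, u):
    """Fit the interpolant (2) with phi(r) = r^3 and a linear polynomial tail.
    X: (n, d) array of sample points, u: (n,) array of function values."""
    n, d = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = r ** 3                                         # cubic kernel matrix
    P = np.hstack([np.ones((n, 1)), X])                  # rows [1, x_i^T]
    A = np.block([[Phi, P], [P.T, np.zeros((d + 1, d + 1))]])
    rhs = np.concatenate([u, np.zeros(d + 1)])
    coef = np.linalg.solve(A, rhs)                       # nonsingular iff rank(P) = d + 1
    lam, c = coef[:n], coef[n:]

    def s(x):                                            # the interpolant s_n(x)
        rx = np.linalg.norm(X - np.asarray(x), axis=1)
        return rx ** 3 @ lam + c[0] + np.asarray(x) @ c[1:]
    return s
```

A surrogate of a constraint gi is obtained the same way by passing the constraint values gi(x_1), . . . , gi(x_n) as u.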

3.2 Kriging Model

A widely used kriging surrogate model is described in Jones et al. [13] and
Jones [12] (sometimes called the DACE model) where the values of the black-
box function f are assumed to be the outcomes of a stochastic process. That
is, before f is evaluated at any point, assume that f (x) is a realization of a
Gaussian random variable Y (x) ∼ N (μ, σ 2 ). Moreover, for any two points xi
and xj , the correlation between Y (xi ) and Y (xj ) is modeled by
 

Corr[Y(x_i), Y(x_j)] = \exp\Big( -\sum_{\ell=1}^{d} \theta_\ell \, |x_{i\ell} - x_{j\ell}|^{p_\ell} \Big),   (4)

where θℓ , pℓ (ℓ = 1, . . . , d) are parameters to be determined. This correlation


model is only one of many types of correlation functions that can be used in
kriging metamodels. Note that when xi and xj are close, Y (xi ) and Y (xj ) will
be highly correlated according to this model. As xi and xj become farther apart,
the correlation drops to 0.
Given n points x1 , . . . , xn ∈ Rd , the uncertainty about the values of f at these
points can be modeled by using the random vector Y = (Y (x1 ), . . . , Y (xn ))T .
Note that E(Y ) = Jμ, where J is the n×1 vector of all ones, and Cov(Y ) = σ 2 R,
where R is the n × n matrix whose (i, j) entry is given by (4).
Suppose the function f has been evaluated at the points x1 , . . . , xn ∈ Rd .
Let y1 = f (x1 ), . . . , yn = f (xn ) and let y = (y1 , . . . , yn )T be the vector of
observed function values. Fitting the kriging model in [12] through the data
points (x1 , y1 ), . . . , (xn , yn ) involves finding the maximum likelihood estimates

(MLEs) of the parameters μ, σ 2 , θ1 , . . . , θd , p1 , . . . , pd . The MLEs of these param-


eters are typically obtained by solving a numerical optimization problem. Now
the value of the kriging predictor at a new point x∗ is provided by the formula [12]

\hat{y}(x^*) = \hat{\mu} + r^T R^{-1} (y - J\hat{\mu}),   (5)

where \hat{\mu} = (J^T R^{-1} y)/(J^T R^{-1} J) and r = (Corr[Y(x^*), Y(x_1)], . . . , Corr[Y(x^*), Y(x_n)])^T.
Moreover, a measure of error of the kriging predictor at x^* is given by

s^2(x^*) = \hat{\sigma}^2 \left[ 1 - r^T R^{-1} r + \frac{(1 - J^T R^{-1} r)^2}{J^T R^{-1} J} \right],   (6)

where \hat{\sigma}^2 = \frac{1}{n} (y - J\hat{\mu})^T R^{-1} (y - J\hat{\mu}).
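The sketch below illustrates how the predictor (5) and the error estimate (6) can be evaluated, assuming the correlation parameters θℓ, pℓ have already been fitted by maximum likelihood (that fitting step, a numerical optimization of the likelihood, is omitted here); it is not the DACE implementation itself.

```python
import numpy as np

def kriging_predict(X, y, theta, p, x_star):
    """Kriging predictor (5) and error estimate (6) for fitted theta, p (length-d arrays)."""
    n = X.shape[0]
    def corr(a, b):                                      # correlation model (4)
        return np.exp(-np.sum(theta * np.abs(a - b) ** p))
    R = np.array([[corr(X[i], X[j]) for j in range(n)] for i in range(n)])
    J = np.ones(n)
    Ri = np.linalg.inv(R)                                # in practice, use a Cholesky solve
    mu = (J @ Ri @ y) / (J @ Ri @ J)                     # MLE of the constant mean
    sigma2 = (y - J * mu) @ Ri @ (y - J * mu) / n        # MLE of the process variance
    r = np.array([corr(x_star, X[i]) for i in range(n)])
    y_hat = mu + r @ Ri @ (y - J * mu)                   # predictor (5)
    s2 = sigma2 * (1.0 - r @ Ri @ r
                   + (1.0 - J @ Ri @ r) ** 2 / (J @ Ri @ J))   # error estimate (6)
    return y_hat, max(s2, 0.0)
```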

4 Infill Strategies for Constrained Optimization


4.1 Radial Basis Function Methods

One effective infill strategy for problems with expensive black-box objective and
inequality constraints (no equality constraints and no hidden constraints) is
provided by the COBRA algorithm [19]. COBRA uses the above RBF model
to approximate the objective and constraint functions though one can use
other types of surrogates with its infill strategy. It treats each inequality con-
straint individually instead of combining them into a penalty function and
builds/updates RBF surrogates for the objective and constraints in each iter-
ation. Moreover, it handles infeasible initial sample points using a two-phase
approach where Phase I finds a feasible point while Phase II improves on this
feasible point. In Phase I, the next iterate is a minimizer of the sum of the
squares of the predicted constraint violations (as predicted by the RBF surro-
gates) subject only to the bound constraints. In Phase II, the next iterate is a
minimizer of the RBF surrogate of the objective subject to RBF surrogates of
the inequality constraints within some small margin and also satisfying a dis-
tance requirement from previous iterates. That is, the next iterate xn+1 solves
the optimization subproblem:
(0)
minx sn (x)
s.t. x ∈ Rd ,  ≤ x ≤ u
(i) (i) (7)
sn (x) + n ≤ 0, i = 1, 2, . . . , m
x − xj ≥ ρn , j = 1, . . . , n
(0) (i)
Here, sn (x) is the RBF model of f (x) while sn (x) is the RBF model of gi (x)
(i)
for i = 1, . . . , m. Moreover, n > 0 is the margin for the ith constraint and ρn is
the distance requirement given the first n sample points. The margins are meant
to facilitate the generation of feasible iterates. The ρn ’s are allowed to cycle
from large values meant to enforce global search and small values that promote

local search. In the original implementation, the optimization subproblem in


(7) is solved using Matlab’s gradient-based fmincon solver from a good starting
point obtained by a global search scheme, but one can also combine this with a
multistart approach. COBRA performed well compared to alternatives on 20 test
problems and on the large-scale 124-D MOPTA08 benchmark with 68 black-box
inequality constraints [11].
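As an illustration of the Phase II subproblem (7), the sketch below solves it with scipy's SLSQP solver instead of the Matlab fmincon setup used in the original implementation; the surrogate callables, margins and distance requirement are passed in by the caller, and this is only a minimal stand-in, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize

def cobra_phase2_point(s_obj, s_cons, eps, rho, X_prev, lb, ub, x0):
    """Solve subproblem (7): s_obj and s_cons[i] are surrogate callables,
    eps[i] the constraint margins, rho the distance requirement, X_prev past points."""
    cons = [{'type': 'ineq', 'fun': lambda x, s=s, e=e: -(s(x) + e)}
            for s, e in zip(s_cons, eps)]                # s_n^(i)(x) + eps_n^(i) <= 0
    cons.append({'type': 'ineq',                         # ||x - x_j|| >= rho_n for all j
                 'fun': lambda x: np.min(np.linalg.norm(X_prev - x, axis=1)) - rho})
    res = minimize(s_obj, x0, method='SLSQP',
                   bounds=list(zip(lb, ub)), constraints=cons)
    return res.x
```

In practice a good starting point x0 (or a multistart strategy) matters, since the surrogate landscape is itself multimodal.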
One issue with COBRA [19] observed by Koch et al. [14] is that sometimes
the solution returned by the solver for the subproblem is infeasible. Hence, they
developed a variant called COBRA-R [14] that incorporates a repair mechanism
that guides slightly infeasible points to the feasible region (with respect to the
RBF constraints). Moreover, another issue with COBRA is that its performance
can be sensitive to the choice of the distance requirement cycle (DRC) that
specifies the ρn ’s in (7). To address this, Bagheri et al. [3] developed SACOBRA
(Self-Adjusting COBRA), which includes an automatic DRC adjustment and
selects appropriate ρn values based on the information obtained after initializa-
tion. In addition, SACOBRA re-scales the search space to [−1, 1]d , performs a
logarithmic transformation on the objective function, if necessary, and normal-
izes the constraint function values. Numerical experiments in [3] showed that
SACOBRA outperforms COBRA with different fixed parameter settings.
An alternative to COBRA [19] is the ConstrLMSRBF algorithm [18], which
also uses RBF models of the objective and constraint functions though one can
also use other types of surrogates. ConstrLMSRBF is a heuristic that selects
sample points from a set of randomly generated candidate points, typically from
a Gaussian distribution centered at the current best point. In each iteration,
the sample point is chosen to be the best candidate point according to two
criteria (predicted objective function value and minimum distance from previous
sample points) from among the candidate points with the minimum number of
predicted constraint violations. When it was first introduced at ISMP 2009,
ConstrLMSRBF was the best known algorithm for the MOPTA08 problem [11].
The original ConstrLMSRBF [18] assumes that there is a feasible point
among the initial points. Extended ConstrLMSRBF [19] was developed to
deal with infeasible initial points by following a similar two-phase structure as
COBRA [19] where Phase I searches for a feasible point while Phase II improves
the feasible point found. In Phase I of Extended ConstrLMSRBF, the next sam-
ple point is the one with the minimum number of predicted constraint violations
among the candidate points, with ties being broken by using the maximum pre-
dicted constraint violation as an additional criterion.
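The candidate-screening idea used by ConstrLMSRBF-type methods can be sketched as follows; this is only an illustration of the selection rule described above (the published algorithm uses cycling weights and further details not reproduced here), and sigma, n_cand and w_obj are illustrative parameters.

```python
import numpy as np

def screen_candidates(s_obj, s_cons, x_best, X_prev, lb, ub,
                      sigma=0.1, n_cand=1000, w_obj=0.5, seed=0):
    """Pick a sample point from random candidates around the current best point."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    C = np.clip(x_best + sigma * (ub - lb) * rng.standard_normal((n_cand, len(lb))),
                lb, ub)                                   # Gaussian candidate points
    n_viol = np.array([sum(s(x) > 0 for s in s_cons) for x in C])
    C = C[n_viol == n_viol.min()]                         # fewest predicted violations
    obj = np.array([s_obj(x) for x in C])
    dist = np.array([np.linalg.norm(X_prev - x, axis=1).min() for x in C])
    v_obj = (obj - obj.min()) / (np.ptp(obj) + 1e-12)     # small predicted objective: good
    v_dist = (dist.max() - dist) / (np.ptp(dist) + 1e-12) # large distance from data: good
    return C[np.argmin(w_obj * v_obj + (1.0 - w_obj) * v_dist)]
```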
Another way to generate infill points for constrained optimization using RBF
surrogates is to use the CONORBIT trust region approach [23], which is an
extension of the ORBIT algorithm [27]. CONORBIT uses only a subset of pre-
vious sample points that are close to current trust region center to build RBF
models for the objective and constraint functions. In a typical iteration, the
next sample point is obtained by minimizing the RBF model of the objective
subject to RBF models of the constraints within the current trust region. As
with COBRA [19], it uses a small margin for the RBF constraints.

4.2 Kriging-Based Methods


The most popular kriging-based infill strategy is the expected improvement crite-
rion [25] that forms the basis of the original Efficient Global Optimization (EGO)
method [9,13] for bound-constrained problems. Here, we use the notation from
Sect. 3.2. In this strategy, the next sample point is the point x that maximizes
the expected improvement function EI(x) over the search space where
   
EI(x) = (f_{\min} - \hat{y}(x)) \, \Phi\!\left( \frac{f_{\min} - \hat{y}(x)}{s(x)} \right) + s(x) \, \phi\!\left( \frac{f_{\min} - \hat{y}(x)}{s(x)} \right),   (8)
if s(x) > 0 and EI(x) = 0 if s(x) = 0. Here, Φ and φ are the cdf and pdf
of the standard normal distribution, respectively. Also, fmin is the current best
objective function value. Extensions and modifications to the EI criterion include
generalized expected improvement [25] and weighted expected improvement [26].
Moreover, alternatives to EI are given in [24], including the WB2 (locating the
regional extreme) criterion, which maximizes WB2(x) = −ŷ(x) + EI(x). This
attempts to minimize the kriging surrogate while also maximizing EI.
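Both criteria are direct to compute from the kriging prediction and its error estimate; the short sketch below evaluates (8) and the WB2 criterion for given values of ŷ(x), s(x) and f_min (it assumes these have been obtained, e.g., from the predictor of Sect. 3.2).

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(y_hat, s, f_min):
    """Expected improvement criterion (8) at a point with prediction y_hat and error s."""
    if s <= 0.0:
        return 0.0
    z = (f_min - y_hat) / s
    return (f_min - y_hat) * norm.cdf(z) + s * norm.pdf(z)

def wb2(y_hat, s, f_min):
    """WB2 ('locating the regional extreme') criterion of [24]: -y_hat(x) + EI(x)."""
    return -y_hat + expected_improvement(y_hat, s, f_min)
```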
When the problem has black-box inequality constraints gi (x) ≤ 0, i =
1, . . . , m, we fit a kriging surrogate gi (x) for each gi (x). That is, for each i and
a given x, assume that gi (x) is the realization of a Gaussian random variable
Gi (x) ∼ N(μ_{g_i}, σ²_{g_i}), where the parameters of this distribution are estimated by
maximum likelihood as in Sect. 3.2. A standard way to handle these inequality
constraints is to find the sample point x that maximizes a penalized expected
improvement function obtained by multiplying the EI with the probability that
x will be feasible (assuming the Gi (x)’s are independent) [25]:

EI_p(x) = EI(x) \prod_{i=1}^{m} P(G_i(x) \le 0) = EI(x) \prod_{i=1}^{m} \Phi\!\left( \frac{-\hat{\mu}_{g_i}}{\hat{\sigma}_{g_i}} \right),   (9)

where \hat{\mu}_{g_i} and \hat{\sigma}^2_{g_i} are the MLEs of the parameters of the random variable G_i(x)
and where fmin for the EI in (8) is the objective function value of the current
best feasible solution (or the point closest to being feasible if no feasible points
are available yet) [16].
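A minimal sketch of (9), assuming independent Gaussian constraint models and a precomputed EI value, is given below; the argument names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def penalized_ei(ei_value, mu_g, sigma_g):
    """Penalized EI (9): multiply EI by the predicted probability of feasibility,
    assuming independent constraint models G_i(x) ~ N(mu_g[i], sigma_g[i]**2)."""
    p_feas = 1.0
    for m, s in zip(mu_g, sigma_g):
        p_feas *= norm.cdf(-m / s) if s > 0 else float(m <= 0)
    return ei_value * p_feas
```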
Sasena et al. [24], Parr et al. [16], Basudhar et al. [4] and Bagheri et al. [2]
presented extensions of EGO for constrained optimization. Moreover, Bouhlel
et al. [6] developed SEGOKPLS+K, which is an extension of SuperEGO [24]
for constrained high-dimensional problems by using the KPLS+K (Kriging with
Partial Least Squares) model [5]. SEGOKPLS+K uses the WB2 (locating the
regional extreme) criterion described above where the surrogate is minimized
while also maximizing the EI criterion. Moreover, it replaces the kriging mod-
els by the KPLS(+K) models, which are more suitable for high-dimensional
problems.

4.3 Surrogate-Assisted Methods for Constrained Optimization


An alternative approach for constrained expensive black-box optimization is
to use surrogates to accelerate or enhance an existing method, typically a

metaheuristic. We refer to these as surrogate-assisted methods. For example,


CEP-RBF [20] is a Constrained Evolutionary Programming (EP) algorithm that
is assisted by RBF surrogates. Recall that in a standard Constrained (μ + μ)-
EP, each parent generates one offspring using only mutations (typically from
a Gaussian or Cauchy distribution) and does not perform any recombination.
Moreover, the offspring are compared using standard rules such as: between two
feasible solutions, the one with the better objective function value wins; between
a feasible solution and an infeasible solution, the feasible solution wins;
and between two infeasible solutions, the one with the smaller constraint viola-
tion (according to some metric) wins. In each generation of CEP-RBF, a large
number of trial offspring are generated by each parent. Then, RBF surrogates are
used to identify the most promising among these trial offspring for each parent,
and this becomes the sample point where the simulation will take place. Here,
a promising trial offspring is the one with the best predicted objective func-
tion value from among those with the minimum number of predicted constraint
violations. Once the simulation is performed and the objective and constraint
function values are known, the selection of the new parent population proceeds
as in a regular Constrained EP and the process iterates.
Another surrogate-assisted metaheuristic is the CONOPUS (CONstrained
Optimization by Particle swarm Using Surrogates) framework [21]. In each iter-
ation of CONOPUS, multiple trial positions for each particle in the swarm are
generated, and surrogates for the objective and constraint functions are used to
identify the most promising trial position where the simulations are performed.
Moreover, it includes a refinement step where the current overall best position
is replaced by the minimum of the surrogate of the objective within a neigh-
borhood of that position and subject to surrogate inequality constraints with a
small margin and with a distance requirement from all previous sample points.
In addition, one can also use surrogates to assist provably convergent algo-
rithms. For example, quadratic models have been used in the direct search
method NOMAD [8]. Moreover, CARS-RBF [15] is an RBF-assisted version
of Constrained Accelerated Random Search (CARS), which extends the Accel-
erated Random Search (ARS) algorithm [1] to constrained problems. In each
iteration of CARS, a sample point is chosen uniformly within a box centered at
the current best point. The initial size of the box is chosen so that it covers the
search space. If the sample point is worse than the current best point, then the
size of the box is reduced. Otherwise, if the sample point is an improvement over
the current best point, then the size of the box is reset to the initial value so that
the box again covers the search space. CARS [15] has been shown to converge
to the global minimum almost surely. Further, it was shown numerically to con-
verge faster than the constrained version of Pure Random Search on many test
problems. In CARS-RBF, a large number of trial points is generated uniformly
at random within the current box, and as before, RBF surrogates are used to
identify the most promising among these trial points using the same criteria used
by ConstrLMSRBF [18]. The simulations are then carried out at this promising
trial point and the algorithm proceeds in the same manner as CARS.
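The box mechanics just described can be sketched as a single iteration step; this is only an illustration of the idea (the published CARS-RBF handles constrained comparisons and candidate screening in more detail), and the helper names `pick_trial` and `evaluate` are assumptions supplied by the caller.

```python
import numpy as np

def cars_rbf_step(pick_trial, evaluate, x_best, f_best,
                  box, box_init, lb, ub, shrink=0.5):
    """One CARS-style iteration: pick_trial(lo, hi) returns a surrogate-screened
    point in the box [lo, hi]; evaluate(x) runs the simulation and returns (f, g)."""
    lo = np.maximum(x_best - box, lb)
    hi = np.minimum(x_best + box, ub)
    x_new = pick_trial(lo, hi)                       # e.g., a screen_candidates-style rule
    f_new, g_new = evaluate(x_new)                   # expensive simulation
    if np.all(np.asarray(g_new) <= 0) and f_new < f_best:
        return x_new, f_new, box_init                # improvement: reset box to cover D
    return x_best, f_best, box * shrink              # otherwise shrink the box
```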

4.4 Parallelization and Handling High Dimensions


To make it easier to find good solutions for computationally expensive problems,
one can generate multiple sample points that can be evaluated in parallel in
each iteration. Metaheuristics such as evolutionary and swarm algorithms are
naturally parallel, and so, the surrogate-assisted CEP-RBF [20] and CONOPUS
[21] algorithms are easy to parallelize. COBRA [19] and ConstrLMSRBF [18]
can be parallelized using ideas in [22]. Moreover, parallel EGO approaches are
described in [10,28] and these can be extended to constrained problems.
For high-dimensional problems, RBF methods have been shown to be effec-
tive (e.g., ConstrLMSRBF [18], COBRA [19], CEP-RBF [20] and CONOPUS-
RBF [21]). The standard constrained EGO, however, has difficulties in high
dimensions because of the computational overhead and numerical issues with fit-
ting the kriging model. To alleviate these issues, Bouhlel et al. [6] introduced
SEGOKPLS+K, which can handle problems with about 50 decision variables.

5 Summary and Future Directions


This paper gave a brief survey of some of the surrogate-based and surrogate-
assisted methods for constrained optimization. The methods discussed are based
on RBF and kriging models, though other types of surrogates and ensembles may
be used. Various infill strategies were discussed. Moreover, parallel surrogate-
based methods and algorithms that can handle high-dimensional problems were
mentioned. Possible future directions of research for constrained expensive black-
box optimization would be to develop methods that can handle black-box equal-
ity constraints. Relatively few such methods have been developed and one
approach is described in [3]. Another direction is to deal with hidden constraints.
Finally, it is important to develop more methods that can be proved to converge
to the global minimum, or at least to first order points.

References
1. Appel, M.J., LaBarre, R., Radulović, D.: On accelerated random search. SIAM J.
Optim. 14(3), 708–731 (2004)
2. Bagheri, S., Konen, W., Allmendinger, R., Branke, J., Deb, K., Fieldsend, J.,
Quagliarella, D., Sindhya, K.: Constraint handling in efficient global optimization.
In: Proceedings of the Genetic and Evolutionary Computation Conference, pp.
673–680. GECCO 2017, ACM, New York (2017)
3. Bagheri, S., Konen, W., Emmerich, M., Bäck, T.: Self-adjusting parameter control
for surrogate-assisted constrained optimization under limited budgets. Appl. Soft
Comput. 61, 377–393 (2017)
4. Basudhar, A., Dribusch, C., Lacaze, S., Missoum, S.: Constrained efficient global
optimization with support vector machines. Struct. Multidiscip. Optim. 46(2),
201–221 (2012)
5. Bouhlel, M.A., Bartoli, N., Otsmane, A., Morlier, J.: Improving kriging surrogates
of high-dimensional design models by partial least squares dimension reduction.
Struct. Multidiscip. Optim. 53(5), 935–952 (2016)

6. Bouhlel, M.A., Bartoli, N., Regis, R.G., Otsmane, A., Morlier, J.: Efficient global
optimization for high-dimensional constrained problems by using the kriging mod-
els combined with the partial least squares method. Eng. Optim. 50(12), 2038–2053
(2018)
7. Boukouvala, F., Hasan, M.M.F., Floudas, C.A.: Global optimization of general
constrained grey-box models: new method and its application to constrained PDEs
for pressure swing adsorption. J. Global Optim. 67(1), 3–42 (2017)
8. Conn, A.R., Le Digabel, S.: Use of quadratic models with mesh-adaptive direct
search for constrained black box optimization. Optim. Methods Softw. 28(1), 139–
158 (2013)
9. Forrester, A.I.J., Sobester, A., Keane, A.J.: Engineering Design Via Surrogate
Modelling: A Practical Guide. Wiley (2008)
10. Ginsbourger, D., Le Riche, R., Carraro, L.: Kriging Is Well-Suited to Parallelize
Optimization, pp. 131–162. Springer, Heidelberg (2010)
11. Jones, D.R.: Large-scale multi-disciplinary mass optimization in the auto industry.
In: MOPTA 2008, Modeling and Optimization: Theory and Applications Confer-
ence. MOPTA, Ontario, Canada, August 2008
12. Jones, D.R.: A taxonomy of global optimization methods based on response sur-
faces. J. Global Optim. 21(4), 345–383 (2001)
13. Jones, D., Schonlau, M., Welch, W.: Efficient global optimization of expensive
black-box functions. J. Global Optim. 13(4), 455–492 (1998)
14. Koch, P., Bagheri, S., Konen, W., Foussette, C., Krause, P., Bäck, T.: A new
repair method for constrained optimization. In: Proceedings of the Genetic and
Evolutionary Computation Conference (GECCO 2015), pp. 273–280 (2015)
15. Nuñez, L., Regis, R.G., Varela, K.: Accelerated random search for constrained
global optimization assisted by radial basis function surrogates. J. Comput. Appl.
Math. 340, 276–295 (2018)
16. Parr, J.M., Keane, A.J., Forrester, A.I., Holden, C.M.: Infill sampling criteria for
surrogate-based optimization with constraint handling. Eng. Optim. 44(10), 1147–
1166 (2012)
17. Powell, M.J.D.: The theory of radial basis function approximation in 1990. In:
Light, W. (ed.) Advances in Numerical Analysis, Volume 2: Wavelets, Subdivision
Algorithms and Radial Basis Functions, pp. 105–210. Oxford University Press,
Oxford (1992)
18. Regis, R.G.: Stochastic radial basis function algorithms for large-scale optimization
involving expensive black-box objective and constraint functions. Comput. Oper.
Res. 38(5), 837–853 (2011)
19. Regis, R.G.: Constrained optimization by radial basis function interpolation for
high-dimensional expensive black-box problems with infeasible initial points. Eng.
Optim. 46(2), 218–243 (2014)
20. Regis, R.G.: Evolutionary programming for high-dimensional constrained expen-
sive black-box optimization using radial basis functions. IEEE Trans. Evol. Com-
put. 18(3), 326–347 (2014)
21. Regis, R.G.: Surrogate-assisted particle swarm with local search for expensive con-
strained optimization. In: Korošec, P., Melab, N., Talbi, E.G. (eds.) Bioinspired
Optimization Methods and Their Applications, pp. 246–257. Springer International
Publishing, Cham (2018)
22. Regis, R.G., Shoemaker, C.A.: Parallel radial basis function methods for the global
optimization of expensive functions. Eur. J. Oper. Res. 182(2), 514–535 (2007)

23. Regis, R.G., Wild, S.M.: CONORBIT: constrained optimization by radial basis
function interpolation in trust regions. Optim. Methods Softw. 32(3), 552–580
(2017)
24. Sasena, M.J., Papalambros, P., Goovaerts, P.: Exploration of metamodeling sam-
pling criteria for constrained global optimization. Eng. Optim. 34(3), 263–278
(2002)
25. Schonlau, M.: Computer Experiments and Global Optimization. Ph.D. thesis, Uni-
versity of Waterloo, Canada (1997)
26. Sóbester, A., Leary, S.J., Keane, A.J.: On the design of optimization strategies
based on global response surface approximation models. J. Global Optim. 33(1),
31–59 (2005)
27. Wild, S.M., Regis, R.G., Shoemaker, C.A.: ORBIT: optimization by radial basis
function interpolation in trust-regions. SIAM J. Sci. Comput. 30(6), 3197–3219
(2008)
28. Zhan, D., Qian, J., Cheng, Y.: Pseudo expected improvement criterion for parallel
EGO algorithm. J. Global Optim. 68(3), 641–662 (2017)
Adaptive Global Optimization Based on
Nested Dimensionality Reduction

Konstantin Barkalov(B) and Ilya Lebedev

Lobachevsky State University of Nizhni Novgorod, Nizhni Novgorod, Russia


konstantin.barkalov@itmm.unn.ru

Abstract. In the present paper, multidimensional multiextremal optimization
problems and numerical methods for solving them are considered. The only
general assumption made on the objective function is that it satisfies the
Lipschitz condition with a Lipschitz constant that is not known a priori.
Problems of this type are frequent in applications. Two approaches to
dimensionality reduction for multidimensional optimization problems are
considered. The first one uses Peano-type space-filling curves, which map a
one-dimensional interval onto a multidimensional domain. The second one is
based on the nested optimization scheme, which reduces a multidimensional
problem to a family of one-dimensional subproblems. A generalized scheme
combining these two approaches is proposed. In this novel scheme, solving a
multidimensional problem is reduced to solving a family of problems of lower
dimensionality, in which the space-filling curves are used. An adaptive
algorithm, in which all arising subproblems are solved simultaneously, has been
implemented. Numerical experiments on several hundred test problems confirm
the efficiency of the proposed generalized scheme.

Keywords: Global optimization · Multiextremal functions ·


Dimensionality reduction · Peano curve · Nested optimization ·
Numerical methods

1 Introduction
This paper considers “black-box” global optimization problems of the following
form:

\varphi(y^*) = \min \{ \varphi(y) : y \in D \},   (1)

D = \{ y \in R^N : a_i \le y_i \le b_i, \; 1 \le i \le N \}.

This study was supported by the Russian Science Foundation, project No. 16-11-10150.


The objective function is assumed to satisfy the Lipschitz condition

|\varphi(y') - \varphi(y'')| \le L \|y' - y''\|, \quad y', y'' \in D, \; 0 < L < \infty,

with the constant L unknown a priori.


The multistart scheme is a well-known method for solving multiextremal
problems. In such schemes, a grid is seeded in the search domain, its points are
used as starting points for the search of extrema by some local method, and the
lowest of the found extrema is then chosen. The choice of the starting points,
performed as a rule on the basis of the Monte Carlo method, is a special problem
within this approach [25]. The approach works well for problems with a small
number of local minima, which have a wide field of application, but for problems
with essential multiextremality its efficiency falls drastically.
At present, genetic algorithms, which are in one way or another based on the
random search concept, are widely used for solving global optimization problems
(see, for example, [24]). Because of their simplicity of implementation and usage,
they have gained large popularity. However, their quality of work (for which the
number of problems from some set solved correctly can serve as a quantitative
measure) is essentially lower compared to deterministic algorithms [14,20].
As for deterministic global optimization methods, many algorithms of this
class are based on various ways of dividing the search domain into a system of
subdomains, followed by the selection of the most promising subdomain for
placing the next trial (a computation of the objective function). Results in this
direction are presented in [5,13,15,16,19].
Finally, the approach of reducing multidimensional problems to equivalent
one-dimensional ones (or to a system of one-dimensional subproblems), with the
one-dimensional problems then solved by efficient univariate optimization
algorithms, is widely used for the development of multidimensional optimization
methods. Two such schemes are used: the reduction based on Peano-type
space-filling curves (evolvents) [22,23], and the nested optimization scheme [18,23].
An adaptive reduction scheme generalizing the classical nested optimization
scheme has been proposed in [8]. The adaptive scheme essentially enhances the
optimization efficiency as compared to the base prototype [10]. In the present
work, a generalization of the adaptive dimensionality reduction scheme combining
the use of nested optimization and Peano curves is proposed. In this approach,
the nested subproblems in the adaptive scheme can be one-dimensional as well
as multidimensional. In the latter case, evolvents are used for the reduction of
the dimensionality of the nested subproblems.

2 The Global Search Algorithm

As a core problem we consider a one-dimensional multiextremal optimization


problem
ϕ∗ = ϕ(x∗ ) = min {ϕ(x) : x ∈ [0, 1]} (2)

with objective function satisfying the Lipschitz condition.


Let us give the description of the global search algorithm (GSA) applied
for solving the above problem (according to [23]). GSA involves constructing a
sequence of points xi , where the values of the objective function z i = ϕ(xi ) are
calculated. Let us call the process of calculating a function value a trial. According
to the algorithm, the first two trials are executed at the ends of the interval
[0, 1], i.e. x0 = 0, x1 = 1. The function values z 0 = ϕ(x0 ), z 1 = ϕ(x1 ) are
computed and the number k is set to 1. In order to select the point of a new
trial xk+1 , k ≥ 1, it is necessary to perform the following steps.
Step 1. Renumber by subscripts (beginning from zero) the points xi , 0 ≤ i ≤ k,
of the previous trials in increasing order, i.e.,

0 = x0 < x1 < . . . < xk = 1.

Juxtapose to the points xi , 0 ≤ i ≤ k, the function values zi = ϕ(xi ), 0 ≤ i ≤ k.


Step 2. Compute the maximum absolute value of the first divided differences

\mu = \max_{1 \le i \le k} \frac{|z_i - z_{i-1}|}{\Delta_i},   (3)
where Δi = xi − xi−1 . If the above formula yields a zero value, assume that
μ = 1.
Step 3. For each interval (xi−1 , xi ), 1 ≤ i ≤ k, calculate the characteristic

R(i) = r\mu\Delta_i + \frac{(z_i - z_{i-1})^2}{r\mu\Delta_i} - 2(z_i + z_{i-1}),   (4)
where r > 1 is a predefined parameter of the method.
Step 4. Find the interval (xt−1 , xt ) with the maximum characteristic

R(t) = \max_{1 \le i \le k} R(i).   (5)

Step 5. Execute the new trial at the point


x^{k+1} = \frac{1}{2}(x_{t-1} + x_t) - \frac{z_t - z_{t-1}}{2r\mu}.   (6)
The algorithm terminates if the condition Δt < ε is satisfied; here t is from
(5) and ε > 0 is the preset accuracy. For the estimation of the global solution, the values

z_k^* = \min_{0 \le i \le k} \varphi(x^i), \qquad x_k^* = \arg\min_{0 \le i \le k} \varphi(x^i)

are selected. The theory of algorithm convergence is presented in [23].
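Steps 1–5 translate directly into a short program. The sketch below is a minimal Python transcription of GSA on [0, 1]; the reliability parameter r and the accuracy eps are illustrative values, not the settings used in the experiments of this paper.

```python
import numpy as np

def gsa_1d(phi, r=2.0, eps=1e-4, max_iter=1000):
    """One-dimensional global search algorithm (Steps 1-5) on [0, 1]."""
    xs, zs = [0.0, 1.0], [phi(0.0), phi(1.0)]
    for _ in range(max_iter):
        order = np.argsort(xs)                            # Step 1: order the trial points
        x, z = np.array(xs)[order], np.array(zs)[order]
        dx, dz = np.diff(x), np.diff(z)
        mu = np.max(np.abs(dz) / dx)                      # Step 2: estimate (3)
        if mu == 0.0:
            mu = 1.0
        R = r * mu * dx + dz ** 2 / (r * mu * dx) - 2.0 * (z[1:] + z[:-1])   # Step 3: (4)
        t = int(np.argmax(R)) + 1                         # Step 4: best interval (5)
        if dx[t - 1] < eps:                               # stopping rule
            break
        x_new = 0.5 * (x[t - 1] + x[t]) - dz[t - 1] / (2.0 * r * mu)         # Step 5: (6)
        xs.append(x_new)
        zs.append(phi(x_new))
    k = int(np.argmin(zs))
    return xs[k], zs[k]
```

For the reduced multidimensional problem of Sect. 3.1, formulas (11) and (12) below would replace the interval lengths and the trial-point rule used in this sketch.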



3 Dimensionality Reduction
3.1 Dimensionality Reduction Using Peano-Type Space-Filling
Curves
The use of Peano curve y(x)
 
\{ y \in R^N : -2^{-1} \le y_i \le 2^{-1}, \; 1 \le i \le N \} = \{ y(x) : 0 \le x \le 1 \}   (7)

unambiguously mapping the interval of real axis [0, 1] onto an N -dimensional


cube is the first of the dimension reduction methods considered. Problems of
numerical construction of Peano-type space-filling curves and the corresponding
theory are considered in detail in [22,23]. Here we will note that a numerically
constructed curve (evolvent) is 2−m accurate approximation of the theoretical
Peano curve, where m is an evolvent construction parameter.
By using this kind of mapping it is possible to reduce the multidimensional
problem (1) to a univariate problem

ϕ(y ∗ ) = ϕ(y(x∗ )) = min {ϕ(y(x)) : x ∈ [0, 1]}. (8)

An important property of such mapping is preservation of boundedness of func-


tion relative differences (see [22,23]). If the function ϕ(y) in the domain D satis-
fies the Lipschitz condition, then the function ϕ(y(x)) on the interval [0, 1] will
satisfy a uniform Hölder condition
1/N
|ϕ(y(x1 )) − ϕ(y(x2 ))| ≤ H |x1 − x2 | , (9)

where the Hölder constant H is linked to the Lipschitz constant L by the relation

H = 2L \sqrt{N + 3}.   (10)

Condition (9) allows adapting the algorithm for solving the one-dimensional
problems presented in Sect. 2 for solving the multidimensional problems reduced
to the one-dimensional ones. For this, the lengths of intervals Δi involved into
rules (3), (4) of the algorithm are substituted by the lengths
\Delta_i = (x_i - x_{i-1})^{1/N}   (11)

and the following expression is introduced instead of formula (6):


x^{k+1} = \frac{x_t + x_{t-1}}{2} - \mathrm{sign}(z_t - z_{t-1}) \, \frac{1}{2r} \left[ \frac{|z_t - z_{t-1}|}{\mu} \right]^{N}.   (12)

3.2 Nested Optimization Scheme


The nested optimization scheme of dimensionality reduction is based on the
well-known relation (see, e.g., [4])

\min_{y \in D} \varphi(y) = \min_{a_1 \le y_1 \le b_1} \; \min_{a_2 \le y_2 \le b_2} \; \cdots \; \min_{a_N \le y_N \le b_N} \varphi(y),   (13)

which allows replacing the solving of multidimensional problem (1) by solving a


family of one-dimensional subproblems related to each other recursively.
In order to describe the scheme let us introduce a set of reduced functions
as follows:
\varphi_N(y_1, \ldots, y_N) = \varphi(y_1, \ldots, y_N),   (14)
\varphi_i(y_1, \ldots, y_i) = \min_{a_{i+1} \le y_{i+1} \le b_{i+1}} \varphi_{i+1}(y_1, \ldots, y_i, y_{i+1}), \quad 1 \le i \le N - 1.   (15)

Then, according to relation (13), solving of multidimensional problem (1) is


reduced to solving a one-dimensional problem

\varphi^* = \min_{a_1 \le y_1 \le b_1} \varphi_1(y_1).   (16)

But in order to evaluate the function ϕ1 at a fixed point y1 it is necessary to


solve the one-dimensional problem of the second level

\varphi_1(y_1) = \min_{a_2 \le y_2 \le b_2} \varphi_2(y_1, y_2),   (17)

and so on up to the univariate minimization of the function ϕN (y1 , ..., yN ) with


fixed coordinates y1 , ..., yN −1 at the N -th level of recursion.
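The recursion (14)–(17) can be expressed compactly in code. The sketch below is an illustration only: solve_1d stands for any univariate global method returning a pair (argmin, min), for example the GSA of Sect. 2 rescaled from [0, 1] to [a_i, b_i]; it is an assumed helper, not code from [23].

```python
def nested_minimize(phi, bounds, solve_1d, fixed=()):
    """Nested scheme (13)-(17): minimize phi over a box given as [(a_1, b_1), ...].
    phi takes a point given as a tuple of coordinates; solve_1d(f, a, b) returns
    (x_min, f_min) for a univariate function f on [a, b]."""
    i = len(fixed)
    a, b = bounds[i]
    if i == len(bounds) - 1:                     # level N: minimize phi over y_N directly
        return solve_1d(lambda y: phi((*fixed, y)), a, b)

    def reduced(y):                              # phi_i(y_1, ..., y_i), relation (15)
        return nested_minimize(phi, bounds, solve_1d, fixed + (y,))[1]

    return solve_1d(reduced, a, b)               # relation (16) at the top level
```

Note that every evaluation of the reduced function triggers a full optimization at the lower levels, which is precisely the cost that the adaptive scheme of Sect. 3.3 is designed to reduce.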
For the nested scheme presented above, a generalization (block nested opti-
mization scheme), which combines the use of evolvents and the nested scheme
has been elaborated in [2].
Let us consider vector y as a vector of block variables

y = (y1 , y2 , ..., yN ) = (u1 , u2 , ..., uM ), (18)

where the i-th block variable ui is a vector of vector y components, taken


serially, i.e., u1 = (y1 , y2 , ..., yN1 ), u2 = (yN1 +1 , yN1 +2 , ..., yN1 +N2 ),..., uM =
(yN −NM +1 , yN −NM +2 , ..., yN ), where N1 + N2 + ... + NM = N .
Using the new variables, the main relation of the nested scheme (13) can be
rewritten in the form

\min_{y \in D} \varphi(y) = \min_{u_1 \in D_1} \; \min_{u_2 \in D_2} \; \cdots \; \min_{u_M \in D_M} \varphi(y),   (19)

where the subdomains Di , 1 ≤ i ≤ M , are projections of initial search domain


D onto the subspaces corresponding to the variables ui , 1 ≤ i ≤ M .
The formulae defining the method for solving the problem (1) based on rela-
tion (19), in general, are the same as those of the nested scheme (14)–(16). It is
only necessary to substitute the original variables yi , 1 ≤ i ≤ N , by the block
variables ui , 1 ≤ i ≤ M .
At that, the nested subproblems

\varphi_i(u_1, \ldots, u_i) = \min_{u_{i+1} \in D_{i+1}} \varphi_{i+1}(u_1, \ldots, u_i, u_{i+1}), \quad 1 \le i \le M - 1,   (20)

in the block scheme are the multidimensional ones. The dimension reduction
method based on Peano curves can be applied for solving these ones. It is a
principal difference from the initial nested scheme.

3.3 Block Adaptive Optimization Scheme

The solving of the arising set of subproblems (15) (for the nested optimization
scheme) or (20) (for the block nested optimization scheme) can be organized in
various ways. A straightforward way (developed in details for the nested opti-
mization scheme [9,18] and for the block nested optimization scheme [2,3]) is
based on solving the subproblems according to the generation order. However,
here a loss of a considerable part of the information on the objective function
takes place when solving the multidimensional problem. Another approach is
the adaptive scheme, in which all subproblems are solved simultaneously, which
allows a more complete accounting of the information on the multidimensional
problem and accelerates the process of its solving.
For the case of the one-dimensional subproblems the adaptive scheme was
theoretically substantiated and tested in [8,10,11]. The present work proposes
a generalization of the adaptive scheme for the case of the multidimensional
subproblems. Let us give a brief description of its basic elements.
Let us assume the nested subproblems (20) to be solved with the use of a
multidimensional global search algorithm described in Sect. 3.1. Then, each sub-
problem (20) can be associated with a numerical value called the characteristic
of this problem. The value R(t) from (5) (i.e., the maximum characteristic of
the intervals formed within the subproblem) can be taken as such characteris-
tic. According to the rule of computing the characteristics (4), the higher the
value of the characteristic, the more promising the subproblem for continuing
the search of the global minimum of the initial problem (1). Therefore, the sub-
problem with the highest characteristic is selected for executing the next trial
at each iteration. This trial either computes the objective function value ϕ(y)
(if the selected subproblem belongs to the level j = M ) or generates new sub-
problems according to (20) when j ≤ M − 1. In the latter case, new generated
problems are added to current problem set, their characteristics are computed,
and the process is repeated. The optimization process is finished when the stop
condition is fulfilled for the algorithm solving the root problem.
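The selection rule just described can be summarized by a short skeleton. The Subproblem interface used below (characteristic, do_trial, converged, best_point) is an abstraction introduced only for illustration; it is not part of the original algorithm's implementation.

```python
def adaptive_scheme(root, max_trials=10000):
    """Simplified skeleton of the adaptive scheme: subproblems of all levels are
    kept in one pool; the one with the largest characteristic R(t) gets the next trial."""
    pool = [root]
    for _ in range(max_trials):
        best = max(pool, key=lambda sp: sp.characteristic())   # most promising subproblem
        children = best.do_trial()       # evaluates phi(y) at level M, or spawns
        pool.extend(children)            # new nested subproblems at levels j < M
        if root.converged():             # stop condition of the root problem
            break
    return root.best_point()
```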

4 Results of Numerical Experiments


One of the well-known approaches to investigating and comparing multiextremal
optimization algorithms is based on the application of these methods for solving
a set of test problems generated randomly. The comparison of the algorithms
has been carried out using the Grishagin test problems Fgr (see, for example,
[17], test function 4) and the GKLS generator [7].
In [1,11] the global search algorithm (GSA) with the use of the evolvents as
well as in combination with the adaptive dimensionality reduction scheme was
shown to overcome many well known optimization algorithms including DIRECT
[12] and DIRECTl [6]. Therefore, in the present study we will limit ourselves to
the comparison of the variants of GSA with different dimensionality reduction
schemes.

In order to compare the efficiencies of the algorithms, the two criteria were
used: the average number of trials and the operating characteristic. The oper-
ating characteristic of an algorithm is the function P (k) defined as the fraction
of the problems from the considered series, for solving of which not more than
k trials have been required. The problem was considered to be solved, if the
algorithm
 k generated
 a trial point y k in the vicinity of the global minimizer y ∗ ,
i.e. y − y  < δ b − a, where δ = 0.01, a and b are the boundaries of the

search domain D.
The first series of experiments has been carried out on the two-dimensional
problems from the classes Fgr , GKLS Simple, and GKLS Hard (100 functions
from each class). Table 1 presents the averaged numbers of trials executed by
GSA with the use of evolvents (Ke ), nested optimization scheme (Kn ), and adap-
tive nested optimization scheme (Kan ). Figures 1 and 2(a, b) present the operat-
ing characteristics of the algorithms obtained on the problem classes Fgr , GKLS
Simple, and GKLS Hard respectively. The solid line corresponds to the algo-
rithm using the evolvents (GSA-E), short dashed line – to the adaptive nested
optimization scheme (GSA-AN), and the long dashed line – to the nested opti-
mization scheme (GSA-N). The results of experiments demonstrate that GSA
with the use of the adaptive nested optimization scheme shows almost the same
speed as compared to GSA with the evolvents, and both algorithms considerably
exceed the algorithm using the nested optimization scheme. Therefore further
experiments were limited to the comparison of different variants of the adaptive
dimensionality reduction scheme.

Table 1. Average number of trials for 2D problems.

Fgr GKLS Simple GKLS Hard


Ke 180 252 674
Kn 341 697 1252
Kan 215 279 815


Fig. 1. Operating characteristics using Fgr class




Fig. 2. Operating characteristics using 2d GKLS Simple (a) and Hard (b) classes

The second series of experiments has been carried out on the four-dimensional
problems from the classes GKLS Simple and GKLS Hard (100 functions of each
class). Table 2 presents the averaged numbers of trials executed by GSA with the
use of the adaptive nested optimization scheme (Kan ) and block adaptive nested
optimization scheme (Kban ) with two levels of subproblems of equal dimension-
ality N1 = N2 = 2. Note that when solving a problem of dimensionality N = 4
using the initial variant of the adaptive scheme, four levels of one-dimensional
subproblems are formed, which complicates their processing.
Figure 3(a, b) presents the operating characteristics of the algorithms
obtained on the classes GKLS Simple and GKLS Hard respectively. The dashed
line corresponds to GSA using the adaptive nested optimization scheme (GSA-
AN), the solid line – the block adaptive nested optimization scheme (GSA-BAN).
The results of experiments demonstrate that the use of the block adaptive nested
optimization scheme provides a considerable gain in the number of trials (up to
35%) as compared to the initial adaptive nested optimization scheme.

Table 2. Average number of trials for 4D problems.

GKLS Simple GKLS Hard


Kan 21747 35633
Kban 13894 31620


Fig. 3. Operating characteristics using 4d GKLS Simple (a) and Hard (b) classes

5 Conclusion

In the present work, the generalized adaptive dimensionality reduction scheme


for the global optimization problems combining the use of Peano space-filling
curves and the nested (recursive) optimization scheme has been proposed. For
solving the reduced subproblems of less dimensionality, the global search algo-
rithm was applied. The computational scheme of the algorithm was given, main
issues related to the use of the adaptive dimensionality reduction scheme were
considered. The numerical experiments have been carried out using the series
of test problems in order to compare the efficiencies of different dimensionality
reduction schemes. The results of the experiments demonstrated that the use of the
block adaptive nested optimization scheme can essentially reduce the number of
trials required to solve a problem with a given accuracy. Further work on the
development of global optimization methods based on this dimensionality
reduction approach may involve the use of local estimates of the
Lipschitz constant (considered, for example, in [9,21]) in different subproblems.

References
1. Barkalov, K., Gergel, V., Lebedev, I.: Use of Xeon Phi coprocessor for solving
global optimization problems. Lecture Notes in Computer Science, vol. 9251, pp.
307–318 (2015)
2. Barkalov, K., Lebedev, I.: Solving multidimensional global optimization problems
using graphics accelerators. Commun. Comput. Inf. Sci. 687, 224–235 (2016)
3. Barkalov, K., Gergel, V.: Multilevel scheme of dimensionality reduction for parallel
global search algorithms. In: OPT-i 2014 Proceedings of 1st International Confer-
ence on Engineering and Applied Sciences Optimization, pp. 2111–2124 (2014)
4. Carr, C., Howe, C.: Quantitative Decision Procedures in Management and Eco-
nomic: Deterministic Theory and Applications. McGraw-Hill, New York (1964)
5. Evtushenko, Y., Posypkin, M.: A deterministic approach to global box-constrained
optimization. Optim. Lett. 7, 819–829 (2013)
6. Gablonsky, J.M., Kelley, C.T.: A locally-biased form of the direct algorithm. J.
Glob. Optim. 21(1), 27–37 (2001)
7. Gaviano, M., Kvasov, D.E., Lera, D., Sergeev, Ya.D.: Software for generation of
classes of test functions with known local and global minima for global optimiza-
tion. ACM Transact. Math. Softw. 29(4), 469–480 (2003)
8. Gergel, V., Grishagin, V., Gergel, A.: Adaptive nested optimization scheme for
multidimensional global search. J. Glob. Optim. 66(1), 35–51 (2016)
9. Gergel, V., Grishagin, V., Israfilov, R.: Local tuning in nested scheme of global
optimization. Proc. Comput. Sci. 51(1), 865–874 (2015)
10. Grishagin, V., Israfilov, R., Sergeyev, Y.: Comparative efficiency of dimensionality
reduction schemes in global optimization. AIP Conf. Proc. 1776, 060011 (2016)
11. Grishagin, V., Israfilov, R., Sergeyev, Y.: Convergence conditions and numeri-
cal comparison of global optimization methods based on dimensionality reduction
schemes. Appl. Math. Comput. 318, 270–280 (2018)
12. Jones, D., Perttunen, C., Stuckman, B.: Lipschitzian optimization without the
Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)
13. Jones, D.R.: The direct global optimization algorithm. In: The Encyclopedia of
Optimization, pp. 725–735. Springer, Heidelberg (2009)
14. Kvasov, D.E., Mukhametzhanov, M.S.: Metaheuristic vs. deterministic global opti-
mization algorithms: the univariate case. Appl. Math. Comput. 318, 245 – 259
(2018)
15. Paulavičius, R., Žilinskas, J.: Advantages of simplicial partitioning for Lipschitz
optimization problems with linear constraints. Optim. Lett. 10(2), 237–246 (2016)
16. Paulavičius, R., Žilinskas, J., Grothey, A.: Investigation of selection strategies in
branch and bound algorithm with simplicial partitions and combination of Lips-
chitz bounds. Optim. Lett. 4(2), 173–183 (2010)
17. Sergeyev, Y., Grishagin, V.: Sequential and parallel algorithms for global optimiza-
tion. Optim. Method. Softw. 3(1–3), 111–124 (1994)
18. Sergeyev, Y., Grishagin, V.: Parallel asynchronous global search and the nested
optimization scheme. J. Comput. Anal. Appl. 3(2), 123–145 (2001)
19. Sergeyev, Y., Kvasov, D.: A deterministic global optimization using smooth diag-
onal auxiliary functions. Commun. Nonlinear Sci. Numer. Simul. 21(1–3), 99–111
(2015)
20. Sergeyev, Y., Kvasov, D., Mukhametzhanov, M.: On the efficiency of nature-
inspired metaheuristics in expensive global optimization with limited budget. Sci.
Rep. 8(1), 435 (2018)
21. Sergeyev, Y., Mukhametzhanov, M., Kvasov, D., Lera, D.: Derivative-free local
tuning and local improvement techniques embedded in the univariate global opti-
mization. J. Optim. Theory Appl. 171(1), 186–208 (2016)
22. Sergeyev, Y.D., Strongin, R.G., Lera, D.: Introduction to Global Optimization
Exploiting Space-filling Curves. Springer Briefs in Optimization. Springer, New
York (2013)
23. Strongin, R.G., Sergeyev, Y.D.: Global Optimization with Non-convex Constraints.
Sequential and Parallel Algorithms. Kluwer Academic Publishers, Dordrecht (2000)
24. Yang, X.S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press, Frome
(2008)
25. Zhigljavsky, A., Žilinskas, A.: Stochastic Global Optimization. Springer, New York
(2008)
A B-Spline Global Optimization
Algorithm for Optimal Power Flow
Problem

Deepak D. Gawali1(B), Bhagyesh V. Patil2,3, Ahmed Zidna4, and Paluri S. V. Nataraj5

1 Vidyavardhini's College of Engineering and Technology, Palghar, India
deepak.gawali@vcet.edu.in
2 Cambridge Centre for Advanced Research and Education In Singapore, Singapore, Singapore
PatilBhagyesh@JohnDeere.com
3 John Deere Technology Centre, Magarpatta City, Pune, India
4 LGIPM, University of Lorraine, Metz, France
ahmed.zidna@univ-lorraine.fr
5 Systems and Control Engineering, Indian Institute of Technology Bombay, Mumbai, India
nataraj@sc.iitb.ac.in

Abstract. This paper addresses a nonconvex optimal power flow problem
(OPF). Specifically, a new B-spline approach in the context of the OPF
problem is introduced. The applicability of this new approach is shown
on a real-world 3-bus power system. The numerical results obtained with
this new approach for the 3-bus system reveal a satisfactory
improvement in terms of optimality when compared against the traditional
interior-point method based MATPOWER toolbox. Similarly, the results
are also found to be satisfactory with respect to global optimization
solvers like BARON and GloptiPoly.

Keywords: Polynomial B-spline · Global optimization · Polynomial optimization · Constrained optimization

1 Introduction

The optimal power flow (OPF) has a rich research history since it was first
introduced by Carpentier in 1962 [1]. In practice, the OPF problem aims at
minimizing the electric generator fuel cost to meet the desired load demand
for power system under various operating conditions, such as system thermal
dissipation, voltages and powers.
Briefly, the classical formulation for the OPF problem can be stated as fol-
lows:


$$
\begin{aligned}
\min_{x}\;\; & f(x) && \text{(fuel cost of generation)}\\
\text{s.t.}\;\; & g(x) \le 0, && \text{(branch flow limits)}\\
& h(x) = 0, && \text{(nonlinear power balance equations)}\\
& x_{\min} \le x \le x_{\max}. && \text{(bounds on the decision variables)}
\end{aligned} \qquad (1)
$$

Here $x = [\theta\; V\; P_g\; Q_g]^T$, where $\theta$ is the voltage angle, $V$ the voltage magnitude, $P_g$ the real power generation, and $Q_g$ the reactive power generation.
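For orientation, the objective f in (1) is typically a quadratic generator fuel cost. The snippet below is only an illustration with made-up coefficients (the paper does not list the cost data of its 3-bus case); the function name and all values are hypothetical.

```python
# Hypothetical quadratic fuel cost f = sum_i (a_i + b_i*Pg_i + c_i*Pg_i^2) in $/hr.
# Coefficients and generator outputs below are illustrative, not the 3-bus data.
def fuel_cost(Pg, a, b, c):
    return sum(ai + bi * p + ci * p * p for ai, bi, ci, p in zip(a, b, c, Pg))

print(fuel_cost(Pg=[90.0, 120.0], a=[100.0, 120.0], b=[20.0, 18.0], c=[0.04, 0.05]))
```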
In the literature, several solution approaches have been investigated to solve
the OPF problem (1). These include linear programming, Newton-Raphson,
quadratic programming, Lagrange relaxation, interior-point methods, genetic
algorithms, evolutionary programming and particle swarm optimization [2–7].
However, it may be noted that the OPF problem is generally nonconvex in nature
and multiple local optima may exist (cf. [10]). Hence, the above
mentioned solution approaches, which are based on a convexity assumption of
the optimization problem, may result in a locally optimal fuel cost value. As such,
it is of paramount interest to look for an alternative solution approach that can
guarantee global optimality of the fuel cost value of (1).
We note that the OPF problem possesses a polynomial formulation, i.e. the functions
f, g, and h in (1) are polynomial functions in x. Based on this fact, in the
present work, we investigate a new approach for polynomial global optimization
of the OPF problem. It is based on the polynomial B-spline form and uses several
attractive geometric properties associated with it. Specifically, we use the
polynomial B-spline form for a higher order approximation of the OPF problem.
It is noteworthy that the B-spline coefficients provide lower and upper bounds
for the range of a function. Generally, this bound is over-estimated, but it can be
improved either by raising the degree of the B-spline or by adding spline (or
control) points. In this work, we particularly increase the number of B-spline
control points to get sharper bounds on the range of the OPF problem. Further,
we incorporate the obtained B-spline approximation of the OPF problem
in a classical interval branch-and-bound framework. This enables us to locate the
correct global optimal fuel cost value for the OPF problem to a user-specified
accuracy.
The rest of the paper is organized as follows. In Sect. 2, we briefly give the
notation and definitions of the polynomial B-spline form, describe the univariate
and multivariate cases, and present the B-spline global optimization algorithm. In
Sect. 3, we report the numerical experiments performed with the B-spline global
optimization algorithm on a 3-bus power system and compare the quality of the
global minimum with those obtained using the well-known NLP solvers BARON [12],
GloptiPoly [13] and MATPOWER [11]. In Sect. 4, we give the conclusion of our
work.
2 Background: Polynomial B-Spline Approach for Global
Optimization

In this section, we briefly introduce polynomial B-spline form [17]. This polyno-
mial B-spline form is the basis of the main B-spline global optimization algorithm
reported in Sect. 2.3.
Let $s \in \mathbb{N}$ be the number of variables and $x = (x_1, x_2, \ldots, x_s) \in \mathbb{R}^s$. A
multi-index $I$ is defined as $I := (i_1, i_2, \ldots, i_s) \in (\mathbb{N} \cup \{0\})^s$ and the multi-power $x^I$
is defined as $x^I := (x_1^{i_1}, x_2^{i_2}, \ldots, x_s^{i_s})$. Given a multi-index $N := (n_1, n_2, \ldots, n_s)$
and an index $r$, we define $N_{r,-l} = (n_1, \ldots, n_{r-1}, n_r - l, n_{r+1}, \ldots, n_s)$, where
$0 \le n_r - l \le n_r$. Inequalities $I \le N$ for multi-indices are meant componentwise,
i.e. $i_l \le n_l$, $l = 1, 2, \ldots, s$. With $I = (i_1, \ldots, i_{r-1}, i_r, i_{r+1}, \ldots, i_s)$ we associate the
index $I_{r,l}$ given by $I_{r,l} = (i_1, \ldots, i_{r-1}, i_r + l, i_{r+1}, \ldots, i_s)$, where $0 \le i_r + l \le n_r$.
A real bounded and closed interval $\mathbf{x}_r$ is defined as $\mathbf{x}_r \equiv [\underline{x}_r, \overline{x}_r] := [\inf \mathbf{x}_r = \min \mathbf{x}_r, \sup \mathbf{x}_r = \max \mathbf{x}_r] \in \mathbb{IR}$, where $\mathbb{IR}$ denotes the set of compact intervals.
Let $\operatorname{wid} \mathbf{x}_r$ denote the width of $\mathbf{x}_r$, that is $\operatorname{wid} \mathbf{x}_r := \overline{x}_r - \underline{x}_r$.
Let $I = [a, b]$, $m$ (the degree of the polynomial B-spline) and $k$ (the number of B-spline
segments) be given positive integers, and let $u = \{x_i\}_{i=-m}^{k+m}$, with mesh length $h = (b - a)/k$,
be a uniform grid partition defined by
$$x_{-m} = x_{-m+1} = \cdots = x_0, \qquad x_i = a + ih \text{ for } i = 1, \ldots, k, \qquad x_{k+1} = x_{k+2} = \cdots = x_{k+m}.$$
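As a concrete illustration of this knot construction, the following minimal Python sketch (with a hypothetical helper name, not code from the paper) builds the clamped uniform knot sequence.

```python
# Clamped uniform knot sequence u = {x_i}, i = -m..k+m, for the spline space S_m(I, u):
# the end knots are repeated m+1 times and the interior knots are equally spaced.
def clamped_uniform_knots(a, b, m, k):
    h = (b - a) / k                                              # mesh length
    return [a] * m + [a + i * h for i in range(k + 1)] + [b] * m

print(clamped_uniform_knots(0.0, 1.0, m=2, k=4))
# [0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0]
```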

Then the associated polynomial spline space of degree $m$ is defined by
$$S_m(I, u) = \{s \in C^{m-1}(I) : s|_{[x_i, x_{i+1}]} \in P_m\},$$
where $P_m$ is the space of polynomials of degree at most $m$. It is well known
that the set of the classical normalized B-splines $\{N_i^m,\; i = -m, \ldots, k-1\}$ is a
basis for $S_m(I, u)$ that satisfies interesting properties; for example, each $N_i^m$ is
positive on its support and $\{N_i^m\}_{i=-m}^{k-1}$ form a partition of unity. On the other
hand, as $P_m \subset S_m(I, u)$, the power basis functions $\{x^t\}_{t=0}^{m}$ can be expressed in
terms of B-splines through the relations
$$x^t = \sum_{j=-m}^{k-1} \pi_j^{t}\, N_j^m(x), \qquad t = 0, \ldots, m, \qquad (2)$$
where $\pi_j^{t}$ are the symmetric polynomials given by
$$\pi_j^{t} = \frac{\mathrm{Sym}_t(j+1, \ldots, j+m)}{\binom{m}{t}} \qquad \text{for } t = 0, \ldots, m. \qquad (3)$$

The B-splines can be computed by the recurrence formula
$$N_i^m(x) = \gamma_{i,m}(x)\, N_i^{m-1}(x) + \bigl(1 - \gamma_{i+1,m}(x)\bigr)\, N_{i+1}^{m-1}(x), \qquad (4)$$
where
$$\gamma_{i,m}(x) = \begin{cases} \dfrac{x - x_i}{x_{i+m} - x_i}, & \text{if } x_i \le x_{i+m},\\[4pt] 0, & \text{otherwise}, \end{cases} \qquad (5)$$
and
$$N_i^0(x) := \begin{cases} 1, & \text{if } x \in [x_i, x_{i+1}),\\ 0, & \text{otherwise}. \end{cases} \qquad (6)$$
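A direct transcription of the recurrence (4)-(6) takes only a handful of lines. The sketch below is our own Python illustration (hypothetical helper, not the authors' MATLAB code); it maps the index range i = -m,...,k+m of the knots onto a plain list and checks the partition-of-unity property mentioned above. The zero-width knot case is treated as 0, the usual convention.

```python
# Cox-de Boor recurrence (4)-(6); knots[j + offset] stores x_j, so offset = m.
def bspline_basis(i, m, x, knots, offset):
    xi = lambda j: knots[j + offset]
    if m == 0:
        return 1.0 if xi(i) <= x < xi(i + 1) else 0.0            # definition (6)
    def gamma(j):                                                 # definition (5)
        denom = xi(j + m) - xi(j)
        return (x - xi(j)) / denom if denom > 0.0 else 0.0        # 0 for repeated knots
    return (gamma(i) * bspline_basis(i, m - 1, x, knots, offset)
            + (1.0 - gamma(i + 1)) * bspline_basis(i + 1, m - 1, x, knots, offset))

knots = [0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0]          # m = 2, k = 4
total = sum(bspline_basis(i, 2, 0.33, knots, offset=2) for i in range(-2, 4))
print(total)   # 1.0 -- the N_i^m form a partition of unity on [a, b]
```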
In order to easily compute bounds for the range of a multivariate polynomial
of degree $N$ over an $s$-dimensional box, one can derive its B-spline representation [14,15]:
$$p(x) = \sum_{I \le N} a_I\, x^I, \qquad x \in \mathbb{R}^s. \qquad (7)$$

2.1 Univariate Case

Firstly, we consider a univariate polynomial
$$p(x) := \sum_{t=0}^{n} a_t x^t, \qquad x \in [a, b], \qquad (8)$$
to be expressed in terms of the B-spline basis of the space of polynomial splines
of degree $m \ge n$ (i.e. order $m+1$). By substituting (2) into (8) we get
$$p(x) = \sum_{t=0}^{n} a_t \sum_{j=-m}^{k-1} \pi_j^{(t)} N_j^m(x) = \sum_{j=-m}^{k-1} \Bigl(\sum_{t=0}^{n} a_t \pi_j^{(t)}\Bigr) N_j^m(x) = \sum_{j=-m}^{k-1} d_j\, N_j^m(x), \qquad (9)$$
where
$$d_j := \sum_{t=0}^{n} a_t\, \pi_j^{(t)}. \qquad (10)$$
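The following sketch (hypothetical Python helpers, not the authors' implementation) carries out this power-to-B-spline conversion for a small example and then reads off the range bounds discussed in Sect. 2.3. As an assumption on our part, the symmetric polynomials of (3) are evaluated at the knot values x_{j+1},...,x_{j+m} (Marsden's identity), which reduces to the integer-argument form stated in (3) on a unit-spaced grid.

```python
# Power form -> B-spline coefficients d_j (eq. (10)) with pi_j^t built from the
# elementary symmetric polynomials of the knot values; the enclosure
# [min d, max d] then bounds the range of p on [a, b] (cf. eq. (15)).
from itertools import combinations
from math import comb

def sym(t, values):
    """Elementary symmetric polynomial Sym_t of the given values (Sym_0 = 1)."""
    if t == 0:
        return 1.0
    total = 0.0
    for combo in combinations(values, t):
        prod = 1.0
        for v in combo:
            prod *= v
        total += prod
    return total

def bspline_coefficients(a_coeffs, knots, m, k):
    """d_j = sum_t a_t * Sym_t(x_{j+1},...,x_{j+m}) / C(m,t), j = -m..k-1."""
    offset = m                                # knots[j + offset] stores x_j
    coeffs = []
    for j in range(-m, k):
        window = [knots[j + l + offset] for l in range(1, m + 1)]
        coeffs.append(sum(a_t * sym(t, window) / comb(m, t)
                          for t, a_t in enumerate(a_coeffs)))
    return coeffs

# p(x) = 1 - 2x + x^2 = (1 - x)^2 on [0, 1], degree m = 2, k = 4 segments.
knots = [0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0]
d = bspline_coefficients([1.0, -2.0, 1.0], knots, m=2, k=4)
print(d)                 # [1.0, 0.75, 0.375, 0.125, 0.0, 0.0]
print(min(d), max(d))    # enclosure of the range of p on [0, 1]: here exactly [0, 1]
```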

2.2 Multivariate Case

Now, we derive the B-spline representation of a given multivariate polynomial (7),
$$p(x_1, x_2, \ldots, x_s) = \sum_{i_1=0}^{n_1} \cdots \sum_{i_s=0}^{n_s} a_{i_1 \ldots i_s}\, x_1^{i_1} \cdots x_s^{i_s} = \sum_{I \le N} a_I x^I, \qquad (11)$$
where $I := (i_1, i_2, \ldots, i_s)$ and $N := (n_1, n_2, \ldots, n_s)$. By substituting (2) for each
$x_t$, (11) can be written as
$$
\begin{aligned}
p(x_1, \ldots, x_s) &= \sum_{i_1=0}^{n_1} \cdots \sum_{i_s=0}^{n_s} a_{i_1 \ldots i_s} \sum_{j_1=-m_1}^{k_1-1} \pi_{j_1}^{(i_1)} N_{j_1}^{m_1}(x_1) \cdots \sum_{j_s=-m_s}^{k_s-1} \pi_{j_s}^{(i_s)} N_{j_s}^{m_s}(x_s) \\
&= \sum_{j_1=-m_1}^{k_1-1} \cdots \sum_{j_s=-m_s}^{k_s-1} \Bigl( \sum_{i_1=0}^{n_1} \cdots \sum_{i_s=0}^{n_s} a_{i_1 \ldots i_s}\, \pi_{j_1}^{(i_1)} \cdots \pi_{j_s}^{(i_s)} \Bigr) N_{j_1}^{m_1}(x_1) \cdots N_{j_s}^{m_s}(x_s) \\
&= \sum_{j_1=-m_1}^{k_1-1} \cdots \sum_{j_s=-m_s}^{k_s-1} d_{j_1 \ldots j_s}\, N_{j_1}^{m_1}(x_1) \cdots N_{j_s}^{m_s}(x_s),
\end{aligned} \qquad (12)
$$
that is, we have expressed $p$ as
$$p(x) = \sum_{I \le N} d_I(\mathbf{x})\, N_I^N(x), \qquad (13)$$
with the coefficients $d_I(\mathbf{x})$ given by
$$d_{j_1, \ldots, j_s} = \sum_{i_1=0}^{n_1} \cdots \sum_{i_s=0}^{n_s} a_{i_1 \ldots i_s}\, \pi_{j_1}^{(i_1)} \cdots \pi_{j_s}^{(i_s)}. \qquad (14)$$

Global optimization of polynomials using the polynomial B-spline approach
needs a transformation of the given multivariate polynomial from its power form
into its polynomial B-spline form. The B-spline coefficients are then collected in an
array $D(\mathbf{x}) = (d_I(\mathbf{x}))_{I \in S}$, where $S = \{I : I \le N\}$. This array is called a patch.
Let $p$ be a polynomial of degree $N$ and let $\bar{p}(\mathbf{x})$ denote the range of $p$ on
the given domain $\mathbf{x}$. Then, for a patch $D(\mathbf{x})$ of B-spline coefficients it holds [8,9,14,15] that
$$\bar{p}(\mathbf{x}) \subseteq [\min D(\mathbf{x}), \max D(\mathbf{x})]. \qquad (15)$$
The B-spline range enclosure for the polynomial $p$ is then given as $\bar{p}(\mathbf{x}) = D(\mathbf{x})$.

Remark 1. Equation (15) says that the minimum and maximum B-spline
coefficients of a multivariate polynomial $p$ on $\mathbf{x}$, obtained by transforming it from
the power form to the B-spline form, provide an enclosure for the range of $p$.
We shall obtain such a B-spline transformation for the OPF problem (1), followed
by an interval branch-and-bound procedure to locate the correct global optimal solution
for (1).

2.3 Main B-Spline Global Optimization Algorithm


A classical algorithm for deterministic global optimization uses an exhaustive search
over the feasible region with an interval branch-and-bound procedure. Typically,
the interval branch-and-bound has means to compute upper and lower bounds
(based on the inclusion function), followed by a divide-and-conquer approach.
Similar to an interval branch-and-bound algorithm, our B-spline global optimization
algorithm uses the B-spline range enclosure property as an inclusion function.
This provides upper and lower bounds on an instance of an optimization problem.
If both bounds are within the user-specified accuracy, then an optimal
solution has been found. Otherwise, the feasible region is divided into two or
more sub-regions, and the same procedure is applied to each of the sub-regions
until the termination condition is satisfied.
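To indicate how such an enclosure-driven search is organized before giving the full pseudo-code, the following simplified Python sketch performs a plain bisection branch-and-bound on an unconstrained problem. It is our own stripped-down illustration with hypothetical helper names: the constraint handling and flag logic of Algorithm 2.1 below are omitted, and the range-enclosure oracle (for example the min/max of a B-spline patch) is assumed to be supplied by the caller.

```python
# Simplified, unconstrained branch-and-bound driven by a range-enclosure oracle:
# enclosure(box) must return (lower, upper) bounds of the objective on the box,
# where a box is a list of (lo, hi) pairs, one per variable.
def branch_and_bound(enclosure, box, tol=1e-3):
    incumbent = float("inf")              # best verified upper bound so far
    work, solutions = [box], []
    while work:
        b = work.pop()
        lo, up = enclosure(b)
        if lo > incumbent:                # cut-off test: box cannot contain the minimum
            continue
        incumbent = min(incumbent, up)    # box is non-empty, so up bounds the minimum
        widths = [hi - lo_ for lo_, hi in b]
        if max(widths) < tol or up - lo < tol:
            solutions.append((b, lo))
        else:                             # bisect along the longest edge
            i = widths.index(max(widths))
            mid = 0.5 * (b[i][0] + b[i][1])
            work.append(b[:i] + [(b[i][0], mid)] + b[i + 1:])
            work.append(b[:i] + [(mid, b[i][1])] + b[i + 1:])
    solutions = [(b, lo) for b, lo in solutions if lo <= incumbent]
    return incumbent, solutions
```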
Below we give a pseudo-code description of the polynomial B-spline global
optimization algorithm (henceforth referred to as B-spline Opt).
3 Case Study: 3-Bus Power System
In this section, we report the numerical results using the polynomial B-spline
global optimization algorithm (B-spline Opt) applied to a 3-bus power system
shown in Fig. 1. Specifically, we show the optimality achieved in the result,
i.e. quality of global minimum, f ∗ in (1) with respect to different solution
approaches. First, we compare against the interior-point method based MAT-
POWER toolbox [4] used to solve the OPF problem. Further, to validate our
Algorithm 2.1. B-spline Opt (Ac , Nc , Kc , x, ε, εzero )


Input : Here Ac is a cell structure containing the coefficients array of
objective and all the constraints polynomial, Nc is a cell struc-
ture, containing degree vector N for objective and all constraints.
Where elements of degree vector N defines the degree of each vari-
able occurring in objective and all constraints polynomial, Kc is a
cell structure containing vectors corresponding to objective poly-
nomial, Ko and all constraints, i.e. Kgi , Khj . Where elements of
this vector define the number of B-spline segments in each vari-
able direction, the initial box x, the tolerance limit ε, and the tolerance
parameter εzero to which the equality constraints are to be satisfied.
Output: Global minimum p̂ and all the global minimizers z(i) in the initial
search box x to the specified tolerance ε.
Begin Algorithm
1 {Compute the B-spline segment numbers}
For each entry of K in Kc , compute K = N + 2.
2 {Compute the B-spline coefficients}
Compute the B-spline coefficients array for objective and constraints
polynomial on initial search domain x i.e. Do (x), Dgi (x) and Dhj (x)
respectively.
3 {Initialize current minimum estimate}
Initialize the current minimum estimate p̃ = max Do (x).
4 {Set flag vector}
Set F = (F1 , . . . , Fp , Fp+1 , . . . , Fp+q ) := (0, . . . , 0).
5 {Initialize lists}
L ← {x, Do (x), Dgi (x), Dhj (x), F }, Lsol ← {}
6 {Sort the list L}
Sort the list L in descending order of (min Do (x)).
7 {Start iteration}
if L = ∅ then
go to 12
else
pick the last item from L, denote it as
{b, Do (b), Dgi (b), Dhj (b), F }, and delete this item entry from L.
end
8 {Perform cut-off test}
Discard the item {y, Do (y), Dgi (y), Dhj (y), F }
if min Do (y) > p̃ then
go to 7
end
9 {Subdivision decision}
if (wid b < ε) & (max Do (b) − min Do (b) < ε) then
enter the item {b, min Do (b)} to Lsol & go to 7
else
go to 10
end
10 {Generate two sub boxes}
Choose the subdivision direction along the longest direction of b and
the subdivision point as the midpoint. Subdivide b into two subboxes
b1 and b2 such that b = b1 ∪ b2 .
11 for r ← 1 to 2
(a) {Set flag vector}
Set F^r = (F_1^r , . . . , F_p^r , F_{p+1}^r , . . . , F_{p+q}^r ) := F
(b) {Compute B-spline coefficients and corresponding B-spline range
enclosure for br }
Compute the B-spline coefficient arrays of objective and
constraints polynomial on box br and compute corresponding
B-spline range enclosure Do (br ), Dgi (br ) and Dhj (br ) for objective
and constraints polynomial.
(c) {Set local current minimum estimate}
Set p̃local = min(Do (br ))
(d) if (p̃local < p̃) then
(I) for i ← 1 to p do
if (Fi = 0) & (Dgi (br ) ≤ 0) then
Fir = 1
end
end
(II) for j ← 1 to q do
if (F_{p+j} = 0) & (Dhj (br ) ⊆ [−εzero , εzero ]) then
F_{p+j}^r = 1
end
end
end
(e) if F r = (1, . . . , 1) then
set p̃ := min(p̃, max(Do (br )))
end
(f ) Enter {br , Do (br ), Dgi (br ), Dhj (br ), F r } into the list L.
end
12 {Compute the global minimum}
Set the global minimum to the current minimum estimate p̂ = p̃.
13 {Compute the global solution}
Find all those items in Lsol for which min Do (b) = p̂. The first entries
of these items are the global minimizer(s) z(i) .
14 return the global minimum p̂ and all the global minimizers z(i) found
above.
End Algorithm

numerical results we choose global optimization solvers like BARON and Glop-
tiPoly. For BARON the 3-bus system is modeled in GAMS and solved via the
NEOS server for optimization [16].
The B-spline Opt algorithm is implemented in MATLAB and the OPF
instance for the 3-bus system is also modeled in MATLAB. It is solved
on a PC with an Intel Core i3-370M CPU running at 2.40 GHz with
6 GB of RAM. The termination accuracy ε and the equality constraint feasibility
tolerance εzero are set to 0.001.
Table 1 shows the numerical results (global minimum, f ∗ ) with the different
solution approaches. We found that the B-spline Opt algorithm is able to find a better
optimal solution than MATPOWER. It is worth noting that, practically, this accounts
for around 3 $/hr of savings in the fuel cost required for electricity generation.
Our numerical results are further validated with respect to
BARON (cf. Table 1). We further note that, for GloptiPoly, the relaxation order
needs to be systematically increased to obtain convergence to the final result.
However, GloptiPoly exhausts the memory even with a small relaxation order (in
this case just 2).

Fig. 1. 3-bus power system.


Table 1. Comparison of the optimal fuel cost value (f∗) in (1) for a 3-bus system with
the different solution approaches.

Solver/Algorithm    f∗ ($/hr)
B-spline Opt        5703.52∗
BARON               5703.52∗
GloptiPoly          –∗∗
MATPOWER            5707.11∗∗∗

∗ Indicates the best obtained fuel cost value. ∗∗ Indicates that the solver did
not give the result even after one hour and was therefore terminated.
∗∗∗ Indicates a local optimal fuel cost value.

Remark 2. In practice, local optima exist for the OPF problem. Reference
[10] shows that for small bus power systems where the voltage is within practical limits,
standard fixed-point optimization packages such as MATPOWER converge to a
local optimum. Similarly, we observed that the global optimization software package
GloptiPoly successfully solved the OPF instance of the 3-bus power system. However,
it took a significantly large amount of computational time to report the final global
optimum.

4 Conclusions

This paper addressed an important planning problem in power systems, termed
optimal power flow (OPF). A new global optimization algorithm based on
the polynomial B-spline form was proposed for solving the OPF problem. The
applicability of the B-spline algorithm was demonstrated on the OPF instance
corresponding to a real-world 3-bus system. A notable saving in the fuel cost (3 $/hr)
was achieved using the B-spline algorithm with respect to the traditional MATPOWER
toolbox. Similarly, the results obtained using the proposed B-spline algorithm were
further validated against the generic global optimization solver BARON and
were found to be satisfactory.

References
1. Capitanescu, F.: Critical review of recent advances and further developments
needed in AC optimal power flow. Electr. Power Syst. Res. 136, 57–68 (2016)
2. Huneault, M., Galiana, F.: A survey of the optimal power flow literature. IEEE
Trans. Power Syst. 6(2), 762–770 (1991)
3. Torres, G., Quintana, V.: Optimal power flow by a nonlinear complementarity
method. In: Proceedings of the 21st IEEE International Conference Power Industry
Computer Applications (PICA’99), 211–216 (1999)
4. Wang, H., Murillo-Sanchez, C., Zimmerman, R., Thomas, R.: On computational
issues of market-based optimal power flow. IEEE Trans. Power Syst. 22(3), 1185–
1193 (2007)
5. Momoh, J.: Electric Power System Applications of Optimization. Markel Dekker,
New York (2001)
6. Momoh, J., El-Hawary, M., Adapa, R.: A review of selected optimal power flow
literature to 1993. Part I. Nonlinear and quadratic programming approaches. IEEE
Trans. Power Syst. 14(1), 96–104 (1999)
7. Momoh, J., El-Hawary, M., Adapa, R.: A review of selected optimal power flow
literature to 1993. Part II. Newton, linear programming and interior point methods.
IEEE Trans. Power Syst. 14(1), 105–111 (1999)
8. Gawali, D., Zidna, A., Nataraj, P.: Algorithms for unconstrained global opti-
mization of nonlinear (polynomial) programming problems: the single and multi-
segment polynomial B-spline approach. Comput. Oper. Res. Elsevier 87, 201–220
(2017)
9. Gawali, D., Zidna, A., Nataraj, P.: Solving nonconvex optimization problems in
systems and control: a polynomial B-spline approach. In: Modelling, Computation
and Optimization in Information Systems and Management Sciences, Springer,
467–478 (2015)
10. Bukhsh, W., Grothey, A., McKinnon, K., Trodden, P.: Local solutions of the opti-
mal power flow problem. IEEE Trans. Power Syst. 28(4), 4780–4788 (2013)
11. Zimmerman, R., Murillo-Sanchez, C., Thomas, R.: MATPOWER: Steady-state
operations, planning, and analysis tools for power systems research and education.
IEEE Trans. Power Syst. 26(1), 12–19 (2011)
12. Tawarmalani, M., Sahinidis, N.: A polyhedral branch-and-cut approach to global
optimization. Math. Program. 103(2), 225–249 (2005)
13. Henrion, D., Lasserre, J.: GloptiPoly: global optimization over polynomials with
Matlab and SeDuMi. ACM Trans. Math. Softw. (TOMS) 29(2), 165–194 (2003)
14. Lin, Q., Rokne, J.: Methods for bounding the range of a polynomial. J. Comput.
Appl. Math. 58(2), 193–199 (1995)
15. Lin, Q., Rokne, J.: Interval approximation of higher order to the ranges of functions.
Comput. Math. Appl. 31(7), 101–109 (1996)
16. NEOS Server for optimization. http://www.neos-server.org/neos/solvers/
17. De Boor, C.: A Practical Guide to Splines. Applied Mathematical Sciences.
Springer, Berlin (2001)
Concurrent Topological Optimization
of a Multi-component Arm for a Tube
Bending Machine

Federico Ballo(B) , Massimiliano Gobbi , and Giorgio Previati

Politecnico di Milano, 20156 Milan, Italy


{federicomaria.ballo,giorgio.previati}@polimi.it

Abstract. In this paper the problem of the concurrent topological optimization
of two different bodies sharing a region of the design space is
dealt with. This design problem focuses on the simultaneous optimiza-
tion of two bodies (components) where not only the material distribution
of each body has to be optimized but also the design space has to be
divided among the two bodies. This novel optimization formulation rep-
resents a design problem in which more than one component has to be
located inside a limited allowable room. Each component has its function
and load carrying requirements.
The paper presents a novel development in the solution algorithm. The
algorithm has been already presented referring to the concurrent opti-
mization of two bodies where the same mesh is used for both bodies in
the shared portion of the domain. In this version of the algorithm, this
requirement has been removed and each of the two bodies can be meshed
with an arbitrary mesh. This development allows the application of the
method to any real geometry. The algorithm is applied to the design of
a multi-component arm for a tube bending machine.

Keywords: Structural optimization · Topology optimization · Multi-component system optimization · SIMP

1 Introduction
In the literature many different engineering problems referring to the topological
optimization of structural components can be found (see for instance [3–5,9,
13,15]). In most cases, only one component is considered. Recently, however,
problems involving multi-domain optimization, such as the design of multi-phase
or multi-material structures [12], have been considered.
Referring to the particular problem of the optimization of systems com-
posed by two bodies sharing the same design domain, some applications can
be found. In [7,11], the level set method is used for the optimization of the

interface between two, or more, phases (components) in order to obtain some


prescribed interaction force. In [14], a completely different approach, the Mov-
ing Morphable Components method, which seems to be adaptable to solve such
problems, is discussed.
In [2,8], the authors have presented a simple algorithm based on the SIMP
approach [3,9]. The method is able to optimize the material distribution of two
different bodies while, at the same time, allocate in an efficient way the space
between the bodies. The algorithm has also proved to be easily implementable
and can be used along with available optimization algorithms. The applicability
of the algorithm is limited to bodies presenting the same mesh in the shared
portion of the domain. Such limitation prevents the utilization of the algorithm
on arbitrary shaped optimization domains. The present paper aims to solve this
limitation, extending the applicability of the algorithm to any design domain and
any kind of finite element (Sect. 2). The algorithm will be tested by considering
a simple two dimensional example (Sect. 3). Then, in Sect. 4, the application of
the presented algorithm to the optimization of a multi-component arm for a tube
bending machine will be shown.

2 Problem Formulation

The formulation of the concurrent structural optimization of two bodies sharing


the same design space has been given and discussed in [8]. In this section, it is
briefly summarized.
Figure 1 depicts the two bodies sharing (a portion of) the design space. The
following nomenclature is considered.

– Ω1 and Ω2 : bodies to be optimized.
– Ω1−2 : shared portion of the design space that can be occupied by either of the
two bodies (or left void). Any given point of this region can be occupied by
only one of the two bodies.
– Ω1∗ and Ω2∗ : the two unshared parts of the domains Ω1 and Ω2 respectively
(s.t. Ω1 = Ω1∗ + Ω1−2 and Ω2 = Ω2∗ + Ω1−2 ).
– Ω1−2^(1) and Ω1−2^(2) : portions of the shared design space assigned to
Ω1 or Ω2 respectively.
– Ω1 and Ω2 : actual design spaces of the two bodies, given by Ω1 = Ω1∗ ∪ Ω1−2^(1)
and Ω2 = Ω2∗ ∪ Ω1−2^(2) .
– f1 and f2 : applied loads on Ω1 and Ω2 respectively.
– Γ1 and Γ2 : boundary conditions on Ω1 and Ω2 respectively.

Obviously, for a physically meaningful solution of the problem, Ω1 and Ω2 must
be connected.
It must be emphasized that the portion of the shared domain Ω1−2 to be
allocated to each body (i.e. Ω1−2^(1) and Ω1−2^(2)) is not given a priori, but must
be allocated during the optimization process and can change according to the
evolution of the shapes of the two bodies themselves.
Fig. 1. Generalized geometries of the design domains (Ω1 and Ω2 ) of two bodies sharing
a portion of the design space. Each body has its own system of applied forces (f1 and
f2 ) and boundary constraints (Γ1 and Γ2 ). Ω1−2 represents the shared portion of the
design space. Left: initial definition of the domains. Right: division of the domains with
assignment of the shared portion of the design space to each body.

Under the hypotheses of


– linear elastic bodies with small deformations;
– the two bodies do not interact in the shared portion of the domain (i.e. no
contact is considered between the two bodies in the shared portion of the
domain; interactions between the two bodies outside this region are possible);
– each body is made of only one isotropic material;
– both materials have the same reference density, but they can have different
elastic moduli;
– loads and boundary conditions can be applied at any location of the domains,
even in the shared portion;
– the structural problem is formulated by the finite element theory;
the problem can be stated in the framework of finite elements as
 
$$
\begin{aligned}
\text{Find} \quad & \min_{u_1, u_2, \rho}\; \bigl(f_1^T u_1 + f_2^T u_2\bigr)\\
\text{s.t.:} \quad & K(E_e^1)\, u_1 = f_1 \;\text{ and }\; K(E_e^2)\, u_2 = f_2\\
& E_e^1 = \rho(x_e)^p\, E_e^{*1}, \; x_e \in \Omega_1, \qquad E_e^2 = \rho(x_e)^p\, E_e^{*2}, \; x_e \in \Omega_2\\
& \Omega_{1\text{-}2}^{(1)} \cup \Omega_{1\text{-}2}^{(2)} = \Omega_{1\text{-}2} \;\text{ and }\; \Omega_{1\text{-}2}^{(1)} \cap \Omega_{1\text{-}2}^{(2)} = \emptyset\\
& \sum_{e=1}^{N_1} \rho(x_e) + \sum_{e=1}^{N_2} \rho(x_e) \le V, \qquad 0 < \rho_{\min} \le \rho \le 1\\
\text{where:} \quad & K(E_e^1) = \sum_{e=1}^{N_1} K_e(E_e^1) \;\text{ and }\; K(E_e^2) = \sum_{e=1}^{N_2} K_e(E_e^2)\\
& \Omega_1 \text{ and } \Omega_2 \text{ connected}
\end{aligned} \qquad (1)
$$

where u1 and u2 are the displacement fields of the two bodies, xe are the coor-
dinates of the centre of the considered element, K is stiffness matrix of each
body with Ke element stiffness matrix, Ee1 and Ee2 are the elastic moduli of each
element of the two bodies, with Ee∗1 and Ee∗2 reference elastic moduli, N1 and

Fig. 2. Diagram of the algorithm for the concurrent topological optimization. Symbols
refer to Fig. 1.

N2 are the number of elements of each body and ρ is the pseudodensity, with p
the penalty term.
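To make the role of the penalization explicit, the following small sketch (illustrative only, with made-up numbers; not the authors' code) shows the SIMP interpolation of the element elastic modulus used in Eq. 1 and why a penalty p > 1 makes intermediate pseudodensities structurally inefficient.

```python
# SIMP interpolation E_e = rho^p * E_ref: with p > 1 an intermediate density
# contributes much less stiffness per unit of material than rho itself,
# which drives the optimizer towards (almost) 0/1 designs.
E_ref, p = 210e3, 3.0        # illustrative reference modulus [MPa] and penalty term
for rho in (0.1, 0.3, 0.5, 0.7, 1.0):
    E = rho ** p * E_ref
    print(f"rho = {rho:.1f} -> E = {E:10.1f} MPa  (stiffness per unit material = {rho ** (p - 1):.3f})")
```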
In [2,8], the authors have presented a solution algorithm able to solve the
problem stated in Eq. 1 under the condition that the two bodies present the
same mesh in the shared part of the domain, i.e. in this region there is a one
to one correspondence of each element of the two bodies. In the following, this
condition is removed.
The modified solution algorithm is reported in the diagram of Fig. 2. The
algorithm is basically divided in two parts, i.e. the dashed rectangles labeled A
and B in Fig. 2.
The sub-algorithm A represents a standard topology optimization algorithm
(see [3]). This sub-algorithm follows the solution of the finite element model
of the two bodies (block 1 in Fig. 2). The sub-algorithm B is the part of the
algorithm devoted to the allocation of the shared part of the domain (Ω1−2 ) to
each of the two bodies.
The sub-algorithm B implements the following steps.

– Interpolation of the sensitivity fields on Ω1−2 computed in Block A1. Interpo-


lation is required as there is no correspondence between the elements of the
two bodies. To interpolate the sensitivities, the value of sensitivity computed
on each element is normalized with respect to the volume of the element. The
interpolation is performed on local patches.
– Comparison of the two sensitivities and allocation of the shared design space.
(1)
The shared design is allocated to Ω1−2 where the sensitivity computed on Ω1
(2)
is greater than the sensitivity computed on Ω2 or to Ω1−2 otherwise.
– Construction of Ω1 = Ω1∗ ∪ Ω1−2 and Ω2 = Ω2∗ ∪ Ω1−2 .
(1) (2)

– Imposition of the connectivity on Ω1 and Ω2 . Each non-connected region of


the two sets, in case present, is found and assigned to the other set. In this
way, the two sets result connected.
– The elements of Ω1 and Ω2 belonging to Ω1 and Ω2 respectively are found
and marked as active. The remaining elements are marked as non active.
The information on the active or non-active elements is transferred to Block
A3 and the new pseudodensity of each element is computed by the standard
method described in [10]. The target volume fraction is enforced at system
level and each body can have a different value of volume fraction.
– Penalization of the non-active elements of the shared part of the domain, i.e.
their pseudensity is set equal to the minimum value of pseudodensity.
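The allocation step referenced above can be sketched as follows. This is a schematic Python illustration under our own simplifying assumptions (the element-wise sensitivities are assumed to be already interpolated onto a common set of points of Ω1−2, the names are hypothetical, and it is not the authors' Python/Matlab implementation).

```python
# Allocate each point of the shared domain to the body whose (volume-normalized,
# interpolated) compliance sensitivity is larger there; ties go to body 1.
def allocate_shared_domain(sens_body1, sens_body2):
    """sens_body* : dict mapping a shared-domain point id to its sensitivity."""
    omega_1, omega_2 = set(), set()
    for point in sens_body1:
        if sens_body1[point] >= sens_body2[point]:
            omega_1.add(point)       # point assigned to Omega_1-2^(1)
        else:
            omega_2.add(point)       # point assigned to Omega_1-2^(2)
    return omega_1, omega_2

s1 = {"e1": 0.8, "e2": 0.1, "e3": 0.5}      # made-up sensitivities of body 1
s2 = {"e1": 0.2, "e2": 0.6, "e3": 0.4}      # made-up sensitivities of body 2
print(allocate_shared_domain(s1, s2))        # body 1 gets e1 and e3, body 2 gets e2
```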

After the sub-algorithm B is completed, the new finite element model is con-
structed and the solution algorithm is repeated until convergence or the maxi-
mum number of iterations is reached.

3 Two Dimensional Problem

In this section a simple two dimensional problem is reported. The problem is


depicted in Fig. 3 and consists of two cantilevers with an end load sharing part
of the domain. The two cantilevers have the same geometry, material, load and
boundary conditions. However, the two domains are meshed with very different
meshes. Domain Ω1 is meshed by three node triangular elements with a non
structured distribution and mean element size of 1 mm. Domain Ω2 is meshed
by six node triangular elements with a structured distribution and mean element
size of 1 mm. At the first step of the analysis, the same pseudodensity has been
assigned to all the elements of the two bodies, including the elements in the
shared part of the domain.
Fig. 3. Two dimensional example definition.

In Fig. 4 the results of the concurrent topological optimization of the two
cantilevers are shown. The figure shows both the division of the common part of
the design space and the two optimized structures. The two optimized structures
are very similar, as one may expect since the two sub-problems are the same. In
fact, the mass difference is 0.17% and the difference in compliance is 0.42%. The
obtained topology is also similar to the theoretical solution of the problem of
the cantilever with an end load ([3] p. 49).

4 Concurrent Topological Optimization of a Tool Support Swing Arm

The described algorithm has been employed for the optimization of a tool sup-
port swing arm for a tube bending machine. This research activity has been
completed in collaboration with BLM Group [1]. Due to the very high produc-
tion rate of the machine, the swing arm is subjected to high accelerations and,
as a consequence, an inertial torque arises. The optimization of the system is
thus important to reduce energy consumption and increase the production rate.
In Fig. 5, the tool support swing arm is depicted. The figure shows only half
of the model due to the symmetry in the geometry of the system. The arm is
composed of two parts, namely the support arm, which rotates around a vertical
rotation axis, and the sledge, which slides on a guide rail on the support arm. The tool
load is applied to the sledge by a multi-point constraint. A contact interaction is
imposed between the sledge and the support arm in correspondence of the guide
rail. It is worth noting that, by including this contact condition, a non linear
finite element analysis has to be run for each optimization step. A screw drive
is moved by a motor in order to position the sledge with respect to the support
arm. The screw is actuated by a motor connected to a gear.
Fig. 4. Solution of the two dimensional problem in Fig. 3.

Fig. 5. Tool support swing arm model [1].

The system is subjected to an angular acceleration around the rotation axis.


Also, three different loads are considered. The loads are applied one at a time
and for different positions of the sledge along the rail. Forces F2 and F3 act
only in the y direction, while force F1 has x and y components. Due to the
angular acceleration and force F1 , the system undergoes a non symmetric defor-
mation. To maintain the geometrical symmetry of the components, a symmetry
constraint is applied in the optimization algorithm. The symmetry constraint
is enforced as explained in [6]. The components are manufactured by additive


manufacturing, thus no manufacturability constraint is considered.
The optimization domains of the support arm and of the sledge are also
reported in Fig. 5. In the figure, a relatively small region is highlighted in green.
This region is quite critical for the stiffness of both components and cannot be
used by both, as they would interfere when the sledge is in its innermost
position. This region is considered as a shared domain that can be assigned to either
component during the optimization process. With respect to the algorithm of
Fig. 2, in the optimization of the tool support swing arm, not only the connectedness
of the two design spaces is enforced, but an additional condition that the
shape of the domains must allow the sliding of the sledge is considered. The two
components to be optimized are meshed by four-node tetrahedra with a mean
size of 1.5 mm. A total of about 1.65 · 10^6 elements is present in the model.
The optimization problem is solved by using AbaqusTM 2016 for the solu-
tion of the non linear finite element model and a handwritten code to read the
results, implement the optimization algorithm and write the new input file. The
handwritten code is realized by using Python and MatlabTM . As in the previ-
ous example, at the beginning of the analysis, the same pseudodensity has been
assigned to all the elements.
The results of the optimization process, with an overall target volume fraction
of 0.55, are shown in Fig. 6. On the left, an overall view of the optimized system is
reported. On the right, the detail of the optimization of the two components in
the shared part of the domain can be seen. The shared domain has been divided
in a quite complex way, allowing an internal reinforcement for the sledge and a
lower reinforcement for the support arm.

Fig. 6. Optimization results - surfaces with pseudodensity greater than 0.3. Left: com-
plete system. Right: detail of the shared part of the domain.

5 Conclusion
In the present paper, an improved algorithm has been presented for the concur-
rent optimization of two bodies sharing part of the design domain. The improved
algorithm allows for the utilization of any arbitrary mesh on the bodies. Also,
the algorithm has been used with commercial finite element software by considering
a SIMP approach and a symmetry constraint in the solution, proving
the applicability of the method with existing optimization algorithms. In this
way, the algorithm can be applied to real-world optimization problems.
The new algorithm has been tested on a simple two dimensional problem and
then applied to the optimization of the arm of a tube bending machine designed
in collaboration with BLM Group. The application has shown the ability of
the algorithm to solve real problems and find non-trivial efficient solutions for
the assignment of the shared domain. Further developments of the method that
consider the possibility to include contact interactions in the shared part of the
domain will be investigated.

References
1. BLM Group. http://www.blmgroup.com. Accessed 24 Jan 2019
2. Ballo, F., Gobbi, M., Previati, G.: Concurrent topological optimisation: optimisa-
tion of two components sharing the design space. In: EngOpt 2018 Proceedings
of the 6th International Conference on Engineering Optimization, pp. 725–738.
Springer International Publishing, Cham (2019)
3. Bendsøe, M.P., Sigmund, O.: Topology Optimization. Theory, Methods, and Appli-
cations, 2nd edn. Springer Berlin (2004)
4. Eschenauer, H.A., Olhoff, N.: Topology optimization of continuum structures: a
review. Appl. Mech. Rev. 54(4), 331 (2001)
5. Guo, X., Cheng, G.D.: Recent development in structural design and optimization.
Acta Mech. Sin. 26(6), 807–823 (2010)
6. Kosaka, I., Swan, C.C.: A symmetry reduction method for continuum structural
topology optimization. Comput. Struct. 70(1), 47–61 (1999)
7. Lawry, M., Maute, K.: Level set shape and topology optimization of finite strain
bilateral contact problems. Int. J. Numer. Methods Eng. 113(8), 1340–1369 (2018)
8. Previati, G., Ballo, F., Gobbi, M.: Concurrent topological optimization of two
bodies sharing design space: problem formulation and numerical solution. Struct.
Multidiscip. Optim. (2018)
9. Rozvany, G.I.N.: A critical review of established methods of structural topology
optimization. Struct. Multidiscip. Optim. 37(3), 217–237 (2009)
10. Sigmund, O.: A 99 line topology optimization code written in matlab. Struct.
Multidiscip. Optim. 21(2), 120–127 (2001)
11. Strömberg, N.: Topology optimization of orthotropic elastic design domains with
mortar contact conditions. In: Schumacher, A., Vietor, T., Fiebig, S., Bletzinger,
K.-U., Maute, K. (eds.) Advances in Structural and Multidisciplinary Optimiza-
tion: Proceedings of the 12th World Congress of Structural and Multidisciplinary
Optimization, pp. 1427–1438. Springer International Publishing, Braunschweig,
Germany (2018)
12. Tavakoli, R., Mohseni, S.M.: Alternating active-phase algorithm for multimate-
rial topology optimization problems: a 115-line MATLAB implementation. Struct.
Multidiscip. Optim. 49(4), 621–642 (2014)
13. Zhang, W., Zhu, J., Gao, T.: Topology Optimization in Engineering Structure
Design. Elsevier, Oxford (2016)
14. Zhang, W., Yuan, J., Zhang, J., Guo, X.: A new topology optimization approach
based on Moving Morphable Components (MMC) and the ersatz material model.
Struct. Multidiscip. Optim. 53(6), 1243–1260 (2016)
15. Zhu, J.H., Zhang, W.H., Xia, L.: Topology optimization in aircraft and aerospace
structures design. Arch. Comput. Methods Eng. 23(4), 595–622 (2016)
Discrete Interval Adjoints in
Unconstrained Global Optimization

Jens Deussen(B) and Uwe Naumann

Software and Tools for Computational Engineering, RWTH Aachen University, Aachen, Germany
{deussen,naumann}@stce.rwth-aachen.de

Abstract. We describe how to deploy interval derivatives up to second
order in the context of unconstrained global optimization with a branch
and bound method. For computing these derivatives we combine the
Boost interval library and the algorithmic differentiation tool dco/c++.
The differentiation tool also computes the required floating-point deriva-
tives for a local search algorithm that is embedded in our branch and
bound implementation. First results are promising in terms of utility of
interval adjoints in global optimization.

Keywords: Discrete adjoints · Algorithmic differentiation · Interval arithmetic · Branch and bound · Global optimization

1 Introduction
The preferred numerical method to compute derivatives of a given computer
code at a specified point by exploiting the chain rule and elemental symbolic
differentiation rules is algorithmic differentiation (AD) [1,2]. The tangent mode
of AD computes the Jacobian at a cost proportional to the number of argu-
ments. In case of a high-dimensional domain and a low-dimensional codomain
the adjoint mode is advantageous for the derivative computation as the costs are
proportional to the number of outputs. AD methods are successfully applied in
e.g. machine learning [3], computational finance [4], and fluid dynamics [5].
Interval arithmetic (IA) has the property that all values that can be evaluated
are reliably contained in the output of the corresponding interval evaluation on a
given domain. This has the advantage that instead of evaluating the function at
several points a single function evaluation in IA is required to obtain semi-local
information on the function value. Among others, IA can be used to estimate
errors with floating-point computations [6,7], and in optimization to find global
optima [8–10]. Branch and bound algorithms are often applied in this context.
Combining the discrete differentiation techniques of AD and the inclusion
property of IA yields semi-local derivative information. This information can
e.g. be used to compute worst-case approximations of the error that can occur
in a neighborhood of an evaluation point. Another application field for interval
adjoints is approximate and unreliable computing [11].
In this paper we compute discrete interval adjoints up to second order to


improve a simple branch and bound algorithm for unconstrained global opti-
mization. The interval derivative information is used to eliminate branches and
to converge faster. Another important task for branch and bound algorithms is
updating the bound of the optimum. We apply different methods for this task,
starting with interval information through to local optimization techniques.
The paper is organized as follows: Section 2 presents the well-established
basic concepts of interval arithmetic and algorithmic differentiation in a nutshell
as well as the combination of both techniques. In Sect. 3 the implemented branch
and bound algorithm is described. This algorithm is used in Sect. 4 to investigate
the utility of interval derivatives for global optimization. The last section gives
a conclusion and an outlook.

2 Methodology
2.1 Interval Arithmetic
IA is a concept that enables computing bounds of a function evaluation on a
given interval. Since this chapter only gives a brief introduction to IA, the
reader is referred to [6–8] for more information on the topic.
We will use the following notation for an interval of a variable $x$ with lower
bound $\underline{x}$ and upper bound $\overline{x}$:
$$[x] = [\underline{x}, \overline{x}] = \{x \in \mathbb{R} \mid \underline{x} \le x \le \overline{x}\}.$$
The real numbers of the midpoint $m[x]$ and the width $w[x]$ are defined as
$$m[x] = 0.5\,(\underline{x} + \overline{x}), \qquad w[x] = \overline{x} - \underline{x}.$$
Evaluating a univariate scalar function $y = g(x)$ in IA on $[x]$ results in
$$[y] = g([x]) \supseteq \{g(x) \in \mathbb{R} \mid x \in [x]\}.$$

The superset relation states that the interval [y] can be an overestimation of all
possible values on [x], but it guarantees that these values are contained. To ensure
this inclusion property arithmetic operations and elementary functions must be
redefined. More complex functions can be composed of these basic functions.
One reason for the already mentioned overestimation is that the underlying
data format (e.g. floating-point numbers) cannot represent the exact bounds. In
case of a lower bound IA rounds to negative infinity and in case of an upper
bound to positive infinity. Overestimation can also be caused by the dependency
problem. If a function evaluation uses a variable multiple times, IA does not take
into account that actual values taken from these intervals are equal. The larger
the intervals are the more significant the overestimation is.
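The following minimal Python sketch is our own illustration (the paper itself uses the Boost interval library in C++, and directed outward rounding is omitted here for brevity). It shows the redefined basic operations and the dependency problem: evaluating x·(1−x) with x and (1−x) treated as independent intervals encloses, but overestimates, the true range.

```python
# Minimal interval type for illustration only (no directed rounding, unlike a
# production library such as Boost.Interval); enough to show the inclusion
# property and the dependency problem.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        ps = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(ps), max(ps))
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(0.0, 1.0)
one = Interval(1.0, 1.0)
print(x * (one - x))   # [0.0, 1.0]: encloses, but overestimates, the true range [0, 0.25]
```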
Another problem of applying IA occurs if the implementation contains con-
ditional branches that depend on interval arguments. Comparisons of intervals
are only well-defined if these intervals do not intersect. By accepting further
overestimation a comparison of two intervals can be reduced to a comparison


of the difference with a scalar. But if the scalar is included in the interval the
comparison is still ambiguous.
Approaches to address this problem are either to compute both branches
of a condition or to evaluate the function on subdomains that cover the orig-
inal domain. The second approach is referred to as splitting or bisection [8].
Recursively refining the argument intervals results in some subdomains for which
the comparison is well-defined and some for which it is still ambiguous. Since
the dependency problem is dependent on the interval width, the overestimation
becomes smaller by evaluating the function on smaller domains.

2.2 Algorithmic Differentiation

AD techniques use the chain rule to compute additionally to the function value
of a primal implementation the derivative of the function values with respect
to arguments and intermediate variables at a specified point. Differentiability of
the underlying function is required for the application of AD.
In the following we will only consider multivariate scalar functions with n
arguments x and a single output y

f : Rn → R, y = f (x) . (1)

The first derivative of these functions is the gradient ∇f (x) ∈ Rn , and the second
derivative is the Hessian matrix ∇2 f (x) ∈ Rn×n . The next subsections will
briefly introduce the basic modes of AD. More detailed and general derivations
of these models can e.g. be found in [1,2].

Tangent Mode The tangent model can be derived by differentiating the func-
tion dependence. Thus, the model consists of the function evaluation in (1) and

$$y^{(1)} = \frac{\partial y}{\partial x_j}\, x_j^{(1)}. \qquad (2)$$

Einstein notation implies summation over all values of j = 0, . . . , n − 1. Following [2],
the superscript (1) denotes the tangent of the variable. Equation (2) can
be interpreted as an inner product of the gradient and the tangent $x^{(1)}$ as
$$y^{(1)} = \nabla f(x) \cdot x^{(1)}.$$

For each evaluation with x(1) set to the i-th Cartesian basis vector ei in Rn
(also called seeding), an entry of the gradient can be extracted from y (1) (also
called harvesting). Using this model to get all entries of the gradient requires
n evaluations which is proportional to the number of arguments. The costs of
this method are similar to the costs of a finite difference approximation but AD
methods are accurate up to machine precision.
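For readers unfamiliar with the mechanics, the tangent mode can be emulated with a tiny dual-number type. The sketch below is a generic Python illustration of the same operator-overloading idea that dco/c++ uses in C++; it is not the dco/c++ API, and the names are hypothetical.

```python
# Forward (tangent) mode via operator overloading: each value carries its
# directional derivative; seeding the tangent with e_i and harvesting y.t
# yields df/dx_i, so the whole gradient costs n forward sweeps.
class Tangent:
    def __init__(self, v, t=0.0):
        self.v, self.t = v, t
    def __add__(self, o):
        o = o if isinstance(o, Tangent) else Tangent(o)
        return Tangent(self.v + o.v, self.t + o.t)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Tangent) else Tangent(o)
        return Tangent(self.v * o.v, self.t * o.v + self.v * o.t)
    __rmul__ = __mul__

def f(x):                        # example: f(x) = x0*x1 + x1*x1
    return x[0] * x[1] + x[1] * x[1]

for i in range(2):               # one sweep per gradient entry
    x = [Tangent(3.0, 1.0 if i == 0 else 0.0), Tangent(2.0, 1.0 if i == 1 else 0.0)]
    print(f"df/dx{i} =", f(x).t)  # df/dx0 = 2.0, df/dx1 = 7.0
```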
Adjoint Mode The adjoint mode is also called reverse mode, due to the reverse
computation of the adjoints compared to the computation of the values. There-
fore, a data-flow reversal of the program is required, to store additional informa-
tion on the computation (e.g. partial derivatives) [12], which potentially leads
to high memory requirements. The data structure to store this additional infor-
mation is often called tape.
Again following [2], first-order adjoints are denoted with a subscript (1) .
$$x_{(1),j} = y_{(1)}\, \frac{\partial y}{\partial x_j}. \qquad (3)$$
This equation is computed for each j = 0, . . . , n − 1. Note that the evaluation
of the primal (1) is also part of the adjoint models. The reverse mode yields a
product of the gradient with the adjoint y(1)
x(1) = y(1) · ∇f (x) . (4)
By seeding y(1) = 1 the resulting x(1) contains all entries of the gradient. A
single adjoint computation is required.
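A stripped-down reverse-mode recording, again purely illustrative and unrelated to the dco/c++ internals, can be written as follows; after seeding the output adjoint with 1, one reverse sweep over the tape returns the whole gradient.

```python
# Minimal reverse (adjoint) mode: a global tape records, for every operation,
# the parent entries and the local partial derivatives; the reverse sweep
# accumulates adjoints from the single output back to all inputs.
tape = []   # entries: (index_of_result, [(parent_index, partial), ...])

class Adjoint:
    def __init__(self, v, parents=()):
        self.v, self.idx = v, len(tape)
        tape.append((self.idx, list(parents)))
    def __add__(self, o):
        return Adjoint(self.v + o.v, [(self.idx, 1.0), (o.idx, 1.0)])
    def __mul__(self, o):
        return Adjoint(self.v * o.v, [(self.idx, o.v), (o.idx, self.v)])

def gradient(y, inputs):
    adj = [0.0] * len(tape)
    adj[y.idx] = 1.0                      # seed the output adjoint y_(1) = 1
    for idx, parents in reversed(tape):   # single reverse sweep over the tape
        for p_idx, partial in parents:
            adj[p_idx] += partial * adj[idx]
    return [adj[x.idx] for x in inputs]

x0, x1 = Adjoint(3.0), Adjoint(2.0)
y = x0 * x1 + x1 * x1                     # same example function as above
print(gradient(y, [x0, x1]))              # [2.0, 7.0] from one adjoint sweep
```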

Higher Derivatives. Higher derivatives of the primal implementation can be
derived by combining the two basic modes. The pure second-order tangent model
can be obtained by applying differentiation rules to the first-order tangent model
in (2), which yields
$$y^{(2)} = \frac{\partial y}{\partial x_j}\, x_j^{(2)}, \qquad
y^{(1,2)} = \frac{\partial y}{\partial x_j}\, x_j^{(1,2)} + \frac{\partial^2 y}{\partial x_j\, \partial x_k}\, x_j^{(1)} x_k^{(2)}. \qquad (5)$$
The complete model also contains the evaluation of (1) and (2). The superscript (2)
indicates that the component belongs to the second application of an AD mode. Seeding the
Cartesian basis of $\mathbb{R}^n$ for the tangents $x^{(1)}$ and $x^{(2)}$ independently and setting
all other components to zero, the entries of the Hessian can be obtained with $n^2$
evaluations of (5).
The other three combinations apply the adjoint mode at least once, such that
the cost of the Hessian computation differs from that of the pure tangent model.
The Hessian can be computed with n second-order adjoint model evaluations.
The second-order adjoint model can be obtained by evaluating (1) and (3) and
applying the tangent mode to (3), which yields
$$y^{(2)} = \frac{\partial y}{\partial x_j}\, x_j^{(2)}, \qquad
x_{(1),k}^{(2)} = y_{(1)}^{(2)}\, \frac{\partial y}{\partial x_k} + y_{(1)}\, \frac{\partial^2 y}{\partial x_k\, \partial x_j}\, x_j^{(2)}. \qquad (6)$$
Again, seeding the adjoint $y_{(1)} = 1$, the tangent $x^{(2)}$ with the Cartesian basis of
$\mathbb{R}^n$ and everything else with zero results in a row of the Hessian in $x_{(1)}^{(2)}$.
2.3 Interval Adjoint Algorithmic Differentiation


Combining the two concepts of interval computations and adjoint AD yields,
in addition to the interval value of the function, its interval-valued gradient
with a single evaluation. Compared to the traditional approach of AD, in which
the derivatives are only computed at specified points, we now get semi-local
derivative values. The interval gradient contains all possible values of the gradient
on the specified domain. These derivatives can be an overestimation of the real
derivative values, as was already stated for the interval values. Moreover, it is
possible to compute higher derivatives in IA, e.g. the Hessian. Since the interval
version of the second-order adjoint model looks the same as (1), (3) and (6)
except for the interval notation, we omit it here.
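To give a flavour of how the two concepts compose, the sketch below propagates a dual-number tangent through a toy interval type. This is our own simplified illustration and it uses the tangent mode for brevity, whereas the paper combines Boost intervals with the adjoint (and second-order adjoint) types of dco/c++; all names here are hypothetical.

```python
# Interval-valued forward mode: values AND tangents are intervals, so one sweep
# per direction yields an enclosure of that gradient component on the whole box.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        ps = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(ps), max(ps))
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

class IntervalTangent:
    def __init__(self, v, t):
        self.v, self.t = v, t                     # both components are Intervals
    def __add__(self, o):
        return IntervalTangent(self.v + o.v, self.t + o.t)
    def __mul__(self, o):
        return IntervalTangent(self.v * o.v, self.t * o.v + self.v * o.t)

def f(x0, x1):                                    # f = x0*x1 + x1*x1
    return x0 * x1 + x1 * x1

zero, one = Interval(0.0, 0.0), Interval(1.0, 1.0)
x0 = IntervalTangent(Interval(1.0, 2.0), one)     # seed the direction d/dx0
x1 = IntervalTangent(Interval(-1.0, 1.0), zero)
y = f(x0, x1)
print(y.v, y.t)   # enclosure of f and of df/dx0 = x1 on the box: t = [-1, 1]
```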

3 Branch and Bound with Interval Adjoints


Branch and bound algorithms [8] can be used to solve the unconstrained global
optimization problem on a previously defined domain D,
$$y^* = \min_{x \in D \subseteq \mathbb{R}^n} f(x). \qquad (7)$$

The general idea of these algorithms is to partition the domain and try to remove
those parts that cannot contain a global minimum. Furthermore, the algorithms
find a bound for the global minimum y ∗ on domain D and return a subdomain
with desired precision that contains the minimum.
The algorithm that is referred to in this paper is described in Algorithm 1.
It uses a task queue Q to manage all (sub)domains that need to be analyzed. At
the beginning there will only be a single task in the queue (domain [x] = D).
The algorithm terminates if the queue is empty. In line 7 the tangent component
of [x] and the adjoint component of [y] are seeded. The adjoint model itself is
called in line 8. After that, three conditions that need to be fulfilled at the global
minimum are verified:
1. The value must be less than any other value in the domain.
2. The first-order optimality condition requires that the gradient is zero.
3. The second-order optimality condition needs a positive-definite Hessian.
To eliminate those parts of the domain that cannot contain a global minimum,
these conditions are reformulated in IA: domains with a lower bound of the
function value $[y]$ that is larger than the upper bound of the global minimum $y^*$,
and domains that do not contain zero in the gradient intervals $\nabla_{[x]} f$, are
removed. The third check removes a domain if the Hessian is not positive-definite, i.e. if
$$v^\top \cdot \nabla^2_{[x]} f \cdot v < 0 \quad \text{for some } v \in \mathbb{R}^n.$$
The product of the interval Hessian with a random vector $[x]^{(2)}$ (line 7) can be
harvested from $[x]_{(1)}^{(2)}$. This product is multiplied by the random vector again. If
the resulting interval is negative, the Hessian is not positive-definite.
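These three elimination tests can be condensed into a few lines. The sketch below assumes that interval enclosures of the value, of the gradient components and of v^T·(∇²f)·v on the current box are already available (in the paper they are obtained with first- and second-order interval adjoints via dco/c++ and Boost); the helper name and data layout are our own.

```python
# Elimination tests of the branch-and-bound method: a box can be discarded if
# (value check)    its lower value bound exceeds the incumbent upper bound y*,
# (gradient check) some gradient component enclosure does not contain zero, or
# (Hessian check)  the enclosure of v^T * H * v is entirely negative.
def can_eliminate(value_bounds, grad_bounds, vHv_bounds, y_star_upper):
    lo_f, _ = value_bounds
    if lo_f > y_star_upper:                 # cannot improve on the incumbent
        return True
    for lo_g, hi_g in grad_bounds:          # first-order optimality condition
        if lo_g > 0.0 or hi_g < 0.0:        # 0 not contained: f is monotone here
            return True
    _, hi_q = vHv_bounds                    # second-order optimality condition
    return hi_q < 0.0                       # Hessian surely not positive-definite

print(can_eliminate((1.2, 3.0), [(-0.5, 0.4)], (-2.0, 1.0), y_star_upper=1.0))  # True
```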
Algorithm 1 Branch and Bound with interval adjoints


1: procedure BranchAndBound([x0 ] , εX , εN , L, y ∗ , LN )
2: y∗ ← ∞
3: Q ← [x0 ]
4: while Q = ∅ do
5: [x] ← Pop(Q)
6: try
7: [x]^(2) ← RandomVector( ), [y]_(1) ← 1.0, [y]^(2)_(1) ← 0.0
8: [y] , [y]^(2) , [x]_(1) , [x]^(2)_(1) ← f^(2)_(1) ([x] , [x]^(2) , [y]_(1) , [y]^(2)_(1) )
9: if ValueCheckFails([y] , y ∗ ) or GradientCheckFails([x]_(1) ) or
HessianCheckFails([x]^(2) , [x]^(2)_(1) ) then
10: eliminate([x])
11: else
12: y ∗ ←UpdateBoundGlMin([x], [y] , y ∗ )
13: if width([x]) > X then
14: Branch(Q, [x])
15: else
16: Push(L, [x])
17: end if
18: end if
19: catch exception
20: if width([x]) > N then
21: Branch(Q, [x])
22: else
23: Push(LN , [x])
24: end if
25: end try
26: end while
27: return L, y ∗ , LN
28: end procedure

In line 12 the bound for the global minimum y ∗ is updated. The implemented
branch and bound algorithm provides three methods to update this upper bound:
1. Store the minimal upper bound of the interval value y.
2. Store the smallest value evaluated at the midpoint of the domains f (m [x]).
3. Investigate a few gradient descent steps on the domain to advance towards a
(local) minimum and store the smallest function value.
While the first method only needs the already computed interval function value,
the other two methods require further function evaluations and the third even
requires derivatives in floating-point arithmetic.
If none of the previous checks failed and if the domain is still larger than the desired precision εX, the domain is refined by splitting (line 14) in every direction. This procedure appends 2^n new tasks to the task queue.
Whenever an undefined comparison occurs the IA implementation is assumed
to throw an exception which is handled in lines 6 and 19. The algorithm only
Fig. 1. Isolines of the composed objective function (left) with non-smooth regions in
red and global optima in green. Results of the algorithm (right). Red domains can be
non-smooth, green domains can contain global optima, blue, orange and purple indicate
if the value, gradient or Hessian check failed, respectively.

splits these domains until a previously defined width εN is reached. This procedure has the advantage of preventing the algorithm from generating huge numbers of tasks. But it also results in a neighborhood around each potential non-smoothness that needs to be investigated further by techniques other than the applied IA.
The implementation of the branch and bound algorithm uses the AD tool dco/c++ [13] to compute all required derivatives in interval as well as in floating-point arithmetic. The Boost library [14] is used as the implementation of IA. Both template libraries make use of the concept of operator overloading. Choosing the interval datatype as the base type of the adjoint datatype yields the desired first-order interval derivatives. Nesting the interval adjoint type into a tangent type yields an interval type carrying second-order derivative information. The adjoint models are only evaluated if the value check passes. Furthermore, OpenMP was used for its implementation of a task queue, which also takes care of the parallelization on a shared-memory architecture.
The user of the branch and bound implementation can decide which of the
conditions described in the previous section should be verified. The software
only uses second-order adjoint types if the second-order optimality condition is
checked. Moreover, the user can select which method should be used to update
the bound of the global minimum. If the third method is chosen the user needs
to decide how many gradient descent steps should be performed for every task.

4 Case Studies
4.1 Ambiguous Control-Flow Branches

As a first test case the global minima of the six-hump camel function [15] are
computed. To show how the algorithm treats control-flow branches the imple-

1. https://www.nag.co.uk/content/adjoint-algorithmic-differentiation.
(Series: GD(2 log(w([x])/εX)), GD(log(w([x])/εX)), GD(16), GD(4), f(m[x]), ȳ; panels: (a) run time in seconds, (b) tasks in millions.)

Fig. 2. Convergence of the bound for the minimum y ∗ for the different update
approaches over time (left) and over number of tasks (right). For the gradient descent
methods (GD) the number of performed steps is given in brackets.

mentation computes a quadratic function in both directions if the function value of the six-hump camel function is greater than 10.
Figure 1 (left) shows isolines of the objective function and the highlighted non-smooth area at y = 10 (red) as well as the two global optima located at x = (0.0898, −0.7126) and x = (−0.0898, 0.7126) with value y* = −1.0316 (green).
The initial domain is set to [x0] = [−3, 3] in both directions. All three conditions are checked and the function value at the midpoint of a domain is used to update the bound of the minimum. For purposes of visualization the algorithm stops branching if the width of a subdomain is smaller than εX = εN = 0.1.
The results of the branch and bound algorithm are visualized in Fig. 1 (right).
Green domains can contain global minima. Red denotes a potential non-smooth
region. Blue domains failed the value check, while orange domains do not contain
a zero in the gradient interval. Purple is assigned to non-convex domains.
The two global and two of the local minima are still contained in the green
subdomains. The upper bound for these minima was computed to be y ∗ =
−1.0241 after 1545 tasks. The area around the control-flow branch is larger than
required due to the overestimation of IA.

4.2 Minimum Bound Update

To figure out which method is best for updating y*, we performed tests for the Griewank function [16] with n = 8 on the domain [x0] = [−200, 220] in each direction. The target interval width was set to εX = 10^{−13}. In Fig. 2
we compare the convergence of the minimum bound for the proposed update
methods. Evaluating the function at the midpoint of the domain improves the
branch and bound compared to using the upper bounds of the interval values.
The (incomplete) local search of a minimum implemented by a gradient descent
can decrease y ∗ even faster although it has higher computational costs due to
the computation of the gradient for every task. We observe that the gradient descent method with 16 steps converges faster in the beginning, but it loses this advantage after some time due to the high computational effort. Thus, choosing
the number of descent steps to be dependent on the width of the domain (brown) requires an unchanged number of tasks while the run time decreases. Computing even more gradient descent steps (green) reduces the number of computed tasks and with that the run time to less than half.

4.3 Interval Derivative Conditions


A third set of tests is performed to evaluate how useful interval derivatives are in the context of branch and bound algorithms. To this end, we compute the global optimum of the Griewank, the Rosenbrock [17] and the Styblinski-Tang [18] functions with the specification of the algorithm as given in Sect. 4.2. For updating the bound of the minimum, 16 gradient descent steps are computed. Since the adjoint models are only evaluated if the value check passes, the average time per task strongly depends on the problem. If the value check fails for almost every task, as is the case for the Griewank function, the additional average cost per task is very low, as stated in Table 1. For the other two objectives the additional average cost per task is still less than 50%. Since there are only a few cases left in which the interval Hessian can be computed, the average cost per task increases only slightly compared to the interval gradient computation. The tests without interval derivatives were aborted after 30 min, so that it is impossible to quantify the savings achieved by using the interval gradient. Nevertheless, all tests using interval derivative information found the global minima in less than two minutes. More than 30% of the tasks were eliminated by the gradient check for the Styblinski-Tang function, so the impact of the derivative information is larger on that function than on the other two. The Hessian condition was violated in only a few cases.

Table 1. Additional average costs per task for computing interval derivative information if required (left) and relative amount of tasks failing the particular conditions (right) for the Griewank (GW), Rosenbrock (RB) and Styblinski-Tang (ST) functions.

             Additional average cost per task      Failing condition
             GW      RB      ST                    GW       RB        ST
  Value      -       -       -                     99.1%    89.7%     68.9%
  Gradient   4.0%    29.5%   42.0%                 0.5%     10.0%     30.7%
  Hessian    3.6%    3.0%    11.8%                 0.0%     <0.01%    <0.01%

5 Conclusion and Outlook


We are aware that the proposed branch and bound algorithm can be advanced in several directions. Nevertheless, we can state that it is important to find a good bound of the minimum early in order to eliminate subdomains. The local gradient
descent procedure improved the convergence of the algorithm. Making the number of descent steps dependent on the domain size also decreases the run time. As future work, the local search could be upgraded to a (quasi-)Newton method. Moreover, finding a local optimum as a first estimate of the global minimum in a preprocessing phase should be useful.
The computation of the interval gradient is highly recommended since this information enables the global search in reasonable time. Plenty of subdomains violate the necessary condition and can be eliminated. Whether the validation of the sufficient condition is useful depends on the objective. Since most tasks are already eliminated due to the other conditions, the overhead is not too large. In our tests we only evaluated the Hessian condition in a single random direction; computing the whole interval Hessian could increase the benefit.
Changing the splitting approach by splitting only in some directions might also improve convergence. Good approaches for deciding which directions should be split and which tasks should be computed next are required to handle objective functions with more than ten arguments in acceptable time. Increasing the number of arguments will not have an impact on the adjoint computations. Further future work is to compute adjoints of convex relaxations, which might yield tighter interval bounds.

References
1. Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques of
Algorithmic Differentiation. 2nd edn. SIAM, Philadelphia, PA (2008)
2. Naumann, U.: The Art of Differentiating Computer Programs: An Introduction to
Algorithmic Differentiation. SIAM, Philadelphia (2012)
3. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differ-
entiation in machine learning: a survey. J. Mach. Learn. Res. 18(1), 5595–5637
(2017)
4. Giles, M., Glasserman, P.: Smoking adjoints: fast Monte Carlo Greeks. Risk 19(1), 88–92 (2006)
5. Towara, M., Naumann, U.: SIMPLE adjoint message passing. Optim. Methods
Softw. 33(4–6), 1232–1249 (2018)
6. Moore, R.E.: Methods and Applications of Interval Analysis, 2nd edn. SIAM,
Philadelphia (1979)
7. Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. SIAM,
Philadelphia (2009)
8. Hansen, E., Walster, G.W.: Global Optimization using Interval Analysis. Marcel
Dekker, New York (2004)
9. Neumaier, A.: Complete search in continuous global optimization and constraint
satisfaction. Acta Numer. 13, 271–369 (2004)
10. Floudas, C.A., Pardalos, P.M.: Encyclopedia of Optimization, 2nd edn. Springer,
New York (2009)
11. Vassiliadis, V., Riehme, J., Deussen, J., Parasyris, K., Antonopoulos, C.D., Bellas,
N., Lalis, S., Naumann, U.: Towards automatic significance analysis for approx-
imate computing. In: Proceedings of CGO 2016, pp. 182–193. ACM, New York,
(2016)
12. Hascoët, L., Naumann, U., Pascual, V.: “To be recorded” analysis in reverse-mode
automatic differentiation. FGCS 21(8), 1401–1417 (2005)
13. Naumann, U., Lotz, J., Leppkes, K., Towara, M.: Algorithmic differentiation of
numerical methods: tangent and adjoint solvers for parameterized systems of non-
linear equations. ACM Trans. Math. Softw. 41(4), 26:1–26:21 (2015)
14. Brönnimann, H., Melquiond, G., Pion, S.: The design of the Boost interval arith-
metic library. Theor. Comput. Sci. 351(1), 111–118 (2006)
15. Dixon, L.C.W., Szegö, G.P.: The global optimization problem: an introduction. In:
Towards Global Optimization, vol. 2, pp. 1–15. North-Holland, Amsterdam (1978)
16. Griewank, A.: Generalized descent for global optimization. J. Optim. Theory Appl. 34(1), 11–39 (1981)
17. Rosenbrock, H.H.: An automatic method for finding the greatest or least value of
a function. Comput. J. 3(3), 175–184 (1960)
18. Styblinski, M.A., Tang, T.S.: Experiments in nonconvex optimization: stochastic
approximation with function smoothing and simulated annealing. Neural Netw.
3(4), 467–483 (1990)
Diving for Sparse Partially-Reflexive
Generalized Inverses

Victor K. Fuentes1 , Marcia Fampa2 , and Jon Lee1(B)


1
University of Michigan, Ann Arbor, MI, USA
{vicfuen,jonxlee}@umich.edu
2
Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brasil
fampa@cos.ufrj.br

Abstract. Generalized inverses form a set of key tools in matrix algebra.


For large-scale applications, sparsity is highly desirable, and so sparse
generalized inverses have been studied. One such family is based on relax-
ing the well-known Moore-Penrose properties. One of those properties is
non-linear, and so we develop a convex-programming relaxation and an
associated “diving” heuristic to achieve a good trade-off between sparsity
and satisfaction of the non-linear Moore-Penrose property.

Keywords: Generalized inverse · Reflexive generalized inverse ·


Moore-Penrose pseudoinverse · Sparse optimization

1 Introduction

For a real matrix A ∈ Rm×n , we consider generalized inverses of A (see [15]) to


solve fundamental problems such as the least-squares problem. The M-P (Moore-
Penrose) pseudoinverse, independently discovered by E.H. Moore and R. Penrose, is the most well-known generalized inverse. If A = UΣV^⊤ is the real singular value decomposition of A (see [13], for example), then we define the M-P pseudoinverse of A as A^+ := VΣ^+U^⊤, where Σ^+ has the shape of the transpose of the diagonal matrix Σ and is derived from Σ by taking reciprocals of the non-zero (diagonal) elements of Σ (i.e., the non-zero singular values of A).
Various sparse generalized inverses have been defined, based on the following fundamental characterization of the M-P pseudoinverse.

Theorem 1. ([14]). For A ∈ Rm×n , the M-P pseudoinverse A+ is the unique


H ∈ Rn×m satisfying:

AHA = A (P1)

HAH = H (P2)

M. Fampa was supported in part by CNPq grant 303898/2016-0. J. Lee was supported
in part by ONR grant N00014-17-1-2296
(AH)^⊤ = AH   (P3)

(HA)^⊤ = HA   (P4)

A generalized inverse is defined as any H satisfying P1. We seek sparse generalized inverses, for use in efficiently finding approximate solutions to key problems. Note that without P1, the all-zero matrix satisfies the other M-P properties, and so we are only interested in genuine generalized inverses.
Not all of the M-P properties are needed for a generalized inverse to exactly solve some key problems:

Proposition 2 (see [11]). If H satisfies P1 and P3, then x := Hb (and of course A^+ b) solves min{‖Ax − b‖_2 : x ∈ R^n}.

Proposition 3 (see [11]). If H satisfies P1 and P4, and b is in the column space of A, then Hb (and of course A^+ b) solves min{‖x‖_2 : Ax = b, x ∈ R^n}.

Note that with regard to how a generalized inverse H is used, we are moti-
vated by the situation in which A is very large (and hence so is H), and we have
many right-hand sides b for which we wish to form Hb. Clearly, a sparse H has
computational advantages for this use case.
Except for P2, the other M-P properties are linear (in H). Therefore, minimizing ‖H‖_1 over any subset of the M-P properties including P1 and excluding P2 yields four different sparse generalized inverses which can be computed by linear optimization; see [11].
Following [16], we call any generalized inverse satisfying P2 a reflexive gen-
eralized inverse. P2 is strongly connected with the rank of a generalized inverse:

Theorem 4. ([16], Theorem 3.14). If H is a generalized inverse of A, then


rank(H) ≥ rank(A). Moreover, a generalized inverse H of A is a reflexive gen-
eralized inverse if and only if rank(H) = rank(A).

Therefore, we can use approximate satisfaction of P2 (on top of P1) as a


proxy for seeking a low-rank generalized inverse. Low rank is a very desirable
property—for example, for the least squares problem, it corresponds to “explain-
able” models.
Outline: In Sect. 2, we present our main convex relaxation model for reflexive
generalized inverses. In Sect. 3, we present our diving heuristic, which seeks to
trace a path from a sparse generalized inverse to the M-P pseudoinverse. The
goal is to find good solutions on such a path. In Sect. 4, we briefly present some
“proof-of-concept” computational results.
Notation: We use vector-norm notation on matrices; so we write ‖H‖_1 to mean ‖vec(H)‖_1. We use I for an identity matrix and J for a square all-ones matrix, sometimes indicating the order with a subscript. Matrix dot product is indicated by ⟨·, ·⟩.
Literature review: [5] introduced the idea of sparse left and right pseudoin-
verses, further developing the ideas in [6,7]. Sparse generalized inverses based
on the M-P properties were introduced in [11] and developed further in [10,17].
The modeling ideas presented in Sect. 2 are extended to general bilinear forms in
[9]. The work in this paper will also appear as part of the forthcoming doctoral
dissertation of V.K. Fuentes.

2 Relaxing P2

It is useful to have an explicit formulation of min {‖H‖_1 : P1} as a linear-optimization problem:

min ⟨J, T⟩   (P)
s.t.  T − H ≥ 0,
      T + H ≥ 0,
      AHA = A.

On top of this formulation, we can easily impose any of P3 and P4, which are
obviously linear in H. The challenge is to find a way to work with P2. Really, we
do not consider fully imposing P2—in fact, if we had already imposed P3 and
P4, we would simply end up with the M-P pseudoinverse, which is likely to be
fully dense. Rather, we want to impose a relaxation of P2, so as to give ourselves
enough room to find a sparse solution.
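As an illustration of the linear-optimization formulation above, the following minimal sketch assembles and solves it with NumPy and SciPy's linprog; vec denotes column-major vectorization, the example matrix is random, and the optional properties P3/P4 are omitted.

```python
import numpy as np
from scipy.optimize import linprog

def sparse_generalized_inverse(A):
    """Minimal sketch: min ||H||_1 s.t. AHA = A, via the LP reformulation (P)."""
    m, n = A.shape
    nm = n * m
    # decision variables z = [vec(H), vec(T)], both of length n*m (column-major vec)
    c = np.concatenate([np.zeros(nm), np.ones(nm)])     # objective <J, T>
    I = np.eye(nm)
    A_ub = np.block([[ I, -I],      #  H - T <= 0   (i.e. T - H >= 0)
                     [-I, -I]])     # -H - T <= 0   (i.e. T + H >= 0)
    b_ub = np.zeros(2 * nm)
    # vec(A H A) = (A^T kron A) vec(H) = vec(A)  encodes P1
    A_eq = np.hstack([np.kron(A.T, A), np.zeros((m * n, nm))])
    b_eq = A.flatten(order='F')
    bounds = [(None, None)] * nm + [(0, None)] * nm
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method='highs')
    return res.x[:nm].reshape((n, m), order='F')

# Example: a random 4x3 matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
H = sparse_generalized_inverse(A)
print(np.allclose(A @ H @ A, A, atol=1e-6))   # P1 holds (up to LP tolerance)
```

The optional constraints P3 and P4 are linear in H as well and could be appended to the equality block in the same vectorized fashion.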
First, we write P2 as the following nm non-symmetric quadratic equations

h_{i·}^⊤ A h_{·j} = h_{ij},

which we can also see as

⟨A, h_{i·} h_{·j}^⊤⟩ = h_{ij},

for all ij ∈ n × m, where h_{i·} ∈ R^m denotes the i-th row of H and h_{·j} ∈ R^n its j-th column. Next, we lift to non-symmetric matrix space, defining the matrix variables

K_{ij} := h_{i·} h_{·j}^⊤ ∈ R^{m×n},

for all ij ∈ n × m. We note that lifting to non-symmetric matrix space is not a standard device.
Property P2 can then be modeled by the linear equations

⟨A, K_{ij}⟩ = h_{ij},   (1)

together with the non-convex equations

K_{ij} − h_{i·} h_{·j}^⊤ = 0_{m×n},   (2)

for all ij ∈ n × m.
Let u^{ij} ∈ R^m and v^{ij} ∈ R^n be (for now) arbitrary vectors, for all ij ∈ n × m. Considering this notation, we have the valid non-linear equations

u^{ij⊤} (K_{ij} − h_{i·} h_{·j}^⊤) v^{ij} = 0.   (3)

Next, we induce separability by letting

t_{1ij} := (u^{ij⊤} h_{i·} + v^{ij⊤} h_{·j}) / 2,
t_{2ij} := (u^{ij⊤} h_{i·} − v^{ij⊤} h_{·j}) / 2,

and we arrive at the relaxation

⟨K_{ij}, u^{ij} v^{ij⊤}⟩ + w_{1ij} + t_{2ij}^2 ≤ 0,
−⟨K_{ij}, u^{ij} v^{ij⊤}⟩ + t_{1ij}^2 + w_{2ij} ≤ 0,

where the concave terms −t_{pij}^2 have been replaced with the linear terms +w_{pij}, for p = 1, 2. Assuming lower and upper bounds on t_{pij} (α_{pij} ≤ t_{pij} ≤ β_{pij}), the new variables w_{pij} are then constrained to satisfy the secant inequalities

−((β_{pij}^2 − α_{pij}^2)/(β_{pij} − α_{pij})) (t_{pij} − α_{pij}) − α_{pij}^2 ≤ w_{pij}.
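As a quick sanity check of the secant construction, the snippet below (with illustrative interval endpoints) verifies numerically that the secant lies below the concave term −t² on [α, β] and coincides with it at the endpoints.

```python
import numpy as np

# Numerical check that the secant underestimates -t^2 on [alpha, beta].
alpha, beta = -1.0, 3.0                     # illustrative bounds
t = np.linspace(alpha, beta, 201)
secant = -(beta**2 - alpha**2) / (beta - alpha) * (t - alpha) - alpha**2
assert np.all(secant <= -t**2 + 1e-12)      # secant lies below the concave function
print(secant[0], -alpha**2, secant[-1], -beta**2)   # endpoint values coincide
```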

We assume that we can impose reasonable interval bounds on the h_{ij}, say

λ_{ij} ≤ h_{ij} ≤ μ_{ij},   (4)

for ij ∈ n × m. Then interval bounds [α_{pij}, β_{pij}] on t_{pij} can be directly derived:

α_{1ij} = (1/2) ( Σ_{ℓ=1}^{m} min{u_ℓ^{ij} λ_{iℓ}, u_ℓ^{ij} μ_{iℓ}} + Σ_{ℓ=1}^{n} min{v_ℓ^{ij} λ_{ℓj}, v_ℓ^{ij} μ_{ℓj}} ),
β_{1ij} = (1/2) ( Σ_{ℓ=1}^{m} max{u_ℓ^{ij} λ_{iℓ}, u_ℓ^{ij} μ_{iℓ}} + Σ_{ℓ=1}^{n} max{v_ℓ^{ij} λ_{ℓj}, v_ℓ^{ij} μ_{ℓj}} ),
α_{2ij} = (1/2) ( Σ_{ℓ=1}^{m} min{u_ℓ^{ij} λ_{iℓ}, u_ℓ^{ij} μ_{iℓ}} − Σ_{ℓ=1}^{n} max{v_ℓ^{ij} λ_{ℓj}, v_ℓ^{ij} μ_{ℓj}} ),
β_{2ij} = (1/2) ( Σ_{ℓ=1}^{m} max{u_ℓ^{ij} λ_{iℓ}, u_ℓ^{ij} μ_{iℓ}} − Σ_{ℓ=1}^{n} min{v_ℓ^{ij} λ_{ℓj}, v_ℓ^{ij} μ_{ℓj}} ).

We could, of course, also seek to tighten these bounds by casting and solving appropriate optimization problems.
The quadratic model derived for our problem follows.

Quadratic Model (P):

min ⟨J, T⟩
Linear constraints:
  T − H ≥ 0,
  T + H ≥ 0,
  AHA = A   (P1), or the lighter version (ΣV^⊤) H (UΣ) = Σ,
  (AH)^⊤ = AH   (P3, optional),
  (HA)^⊤ = HA   (P4, optional),
  ⟨A, K_{ij}⟩ = h_{ij},   ∀ ij ∈ n × m.
Quadratic lifting inequalities, for various choices of u^{ij} and v^{ij}
(note that the α_{pij} and β_{pij} depend on u^{ij} and v^{ij}):
  t_{1ij} := (u^{ij⊤} h_{i·} + v^{ij⊤} h_{·j}) / 2   [substitute below],   ∀ ij ∈ n × m,
  t_{2ij} := (u^{ij⊤} h_{i·} − v^{ij⊤} h_{·j}) / 2   [substitute below],   ∀ ij ∈ n × m,
  ⟨K_{ij}, u^{ij} v^{ij⊤}⟩ + w_{1ij} + t_{2ij}^2 ≤ 0   [convex quadratic],   ∀ ij ∈ n × m,
  −⟨K_{ij}, u^{ij} v^{ij⊤}⟩ + t_{1ij}^2 + w_{2ij} ≤ 0   [convex quadratic],   ∀ ij ∈ n × m,
  −((β_{pij}^2 − α_{pij}^2)/(β_{pij} − α_{pij})) (t_{pij} − α_{pij}) − α_{pij}^2 ≤ w_{pij}   [secant],   for p = 1, 2, ∀ ij ∈ n × m,
  α_{pij} ≤ t_{pij} ≤ β_{pij},   for p = 1, 2, ∀ ij ∈ n × m.

Additionally, we could replace the convex quadratic terms +t_{pij}^2 with lower-bounding linearizations. That is, we can replace +t_{pij}^2 with

η_{pij}^2 + 2η_{pij}(t_{pij} − η_{pij}),

at one or more values η_{pij} ∈ [α_{pij}, β_{pij}] in the interval domain of t_{pij}. More specifically, we substitute as follows:

+t_{1ij}^2 ← η_{1ij}^2 + 2η_{1ij}((u^{ij⊤} h_{i·} + v^{ij⊤} h_{·j})/2 − η_{1ij}),

and

+t_{2ij}^2 ← η_{2ij}^2 + 2η_{2ij}((u^{ij⊤} h_{i·} − v^{ij⊤} h_{·j})/2 − η_{2ij}).

In this manner, we could choose to work with a linear rather than quadratic model.
Let (Ĥ, K̂) denote the solution of P. Appropriate vectors u^{ij} and v^{ij} can be obtained from the columns of the matrices U^{ij} and V^{ij} in the SVD

U^{ij⊤} (K̂_{ij} − ĥ_{i·} ĥ_{·j}^⊤) V^{ij} = Σ^{ij}.

It might be beneficial to pre-compute some of these vectors before finding cuts iteratively via SVD.
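A sketch of this cut-generation step, assuming NumPy: for a given ij, the lifting direction is taken from the leading singular pair of the residual K̂_ij − ĥ_i· ĥ_·j^⊤. The function name and calling convention are illustrative.

```python
import numpy as np

def cut_direction(K_hat_ij, h_row_i, h_col_j):
    """Return (u, v, sigma_max) for the residual K_hat_ij - h_row_i h_col_j^T.
    h_row_i is the i-th row of H as a vector in R^m, h_col_j the j-th column
    as a vector in R^n, matching the lifting K_ij above (sketch only)."""
    R = K_hat_ij - np.outer(h_row_i, h_col_j)
    U, S, Vt = np.linalg.svd(R)
    return U[:, 0], Vt[0, :], S[0]   # leading left/right singular vectors and value
```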

3 Diving
The inequalities that we have been considering for relaxing P2 are rather heavy, and it is not practical to include a large number of them. Moreover, it may not even be desirable to implicitly include all of them. The inequalities relax P2, but we do not want to fully enforce P2. Instead, we understand that there is a trade-off to be made between sparsity, as measured by ‖H‖_1, and satisfaction of P2, as measured say by ‖H − HAH‖_F. In this section, we propose a "diving" procedure for progressively enforcing P2 while heuristically narrowing the domain of our feasible region.
Diving is well known as a key primal heuristic for mixed-integer linear opti-
mization, in the context of branch-and-bound; see, for example, [1–4,8]. Part of
its popularity stems from the fact that it is easy to implement within a mixed-
integer linear-optimization solver that already has the infrastructure to carry out
branch-and-bound. Iteratively, via a sequence of continuous relaxations, vari-
ables that are required to be integer in feasible solutions are heuristically fixed
to integer values. This is a bit akin to “reduced-cost fixing” (for mixed-integer
linear-optimization), where variables are fixed to bounds in a provably correct
manner. Diving heuristics employ special (heuristic) branching rules, with the
aim of tending towards (primal) feasibility and not towards a balanced sub-
division of the problem (as many branching rules seek to do). These heuristics
“quickly go down” the branch-and-bound tree (in the sense of depth-first search),
giving us the term diving. The heuristic is so important in the context of mixed-
integer linear-optimization solvers that most of them, as a default, do a sequence
of dives at the beginning of the solution process, so as to quickly obtain a good
feasible solution (which is very important for limiting the branching exploration).
Applying this type of idea in continuous non-convex global optimization appears
to be a fairly recent idea; see [12].
Our diving heuristic is closely related to this idea, but there is an important difference. Diving in the context of global optimization aims at getting lucky and branching directly toward what will turn out to be a globally optimal solution. Our context is different: the "target" that we aim toward is the M-P pseudoinverse A^+. But, importantly, our goal is not to get there; rather, our goal is to find good solutions along the way that trade off sparsity against satisfaction of P2.
We consider a diving procedure that iteratively increases the enforcement of property P2 while heuristically localizing our search, which inevitably affects the sparsity (approximately measured by ‖H‖_1) of the computed generalized inverse H.
– The procedure is initialized with the solution of problem P, but without the quadratic lifting inequalities.
– We define bounds for h_{ij} (λ_{ij} ≤ h_{ij} ≤ μ_{ij}) such that [λ_{ij}, μ_{ij}] is the smallest interval that contains both ĥ_{ij} and A^+_{ij}. By including the current Ĥ in the box, we hope to remain localized to a region where there is a somewhat-sparse solution. By including the M-P pseudoinverse A^+ in the box, we guarantee that at every step we have a feasible solution to our domain-restricted relaxation.
– Next, for a fixed number of iterations, we consider the last solution (Ĥ, K̂) of P, and we append to P the following inequalities, for all ij such that K̂_{ij} − ĥ_{i·} ĥ_{·j}^⊤ ≠ 0_{m×n}:

⟨K_{ij}, u^{ij} v^{ij⊤}⟩ + w_{1ij} + t_{2ij}^2 ≤ 0,
−((β_{1ij}^2 − α_{1ij}^2)/(β_{1ij} − α_{1ij})) (t_{1ij} − α_{1ij}) − α_{1ij}^2 ≤ w_{1ij},
α_{1ij} ≤ t_{1ij} ≤ β_{1ij},

where u^{ij} ∈ R^m and v^{ij} ∈ R^n are, respectively, left- and right-singular vectors of K̂_{ij} − ĥ_{i·} ĥ_{·j}^⊤ corresponding to its largest singular value. This amounts to iteratively tightening violated non-convex quadratic equations via secant inequalities.
– Finally, we execute our "diving procedure", where at each iteration we select ij ∈ n × m and cut the interval [α_{1ij}, β_{1ij}], over which the variable t_{1ij} varies, into two parts. Of the two, the new interval is selected to be the one containing A^+_{ij}. The branching point can be, for example, the midpoint of the interval, the current value of ĥ_{ij}, or a weighted combination of both.
– At each iteration we select the ij corresponding to the non-convex inequality

u^{ij⊤} (K̂_{ij} − ĥ_{i·} ĥ_{·j}^⊤) v^{ij} ≤ 0

that is most violated by the last solution computed for P.


– We note that by reducing the size of the interval [α_{1ij}, β_{1ij}], we reduce the size of the interval on which the secant of −t_{1ij}^2 is defined, leading to a new secant that better approximates the concave function and therefore strengthening the relaxation of P2 on the new interval.
– The stopping criterion for the diving procedure is a given maximum violation ε for P2, i.e., the algorithm stops when ‖ĤAĤ − Ĥ‖_F ≤ ε.
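The following sketch isolates one diving step on a single bound interval; for simplicity it branches directly on the interval of h_ij rather than on the interval of the lifted variable t_1ij, and the function names, the weight value and the stopping measure are illustrative rather than taken from any particular implementation.

```python
import numpy as np

def p2_violation(A, H):
    """||HAH - H||_F, the measure of P2 violation used as stopping criterion."""
    return np.linalg.norm(H @ A @ H - H, ord='fro')

def branch_interval(lo, hi, a_plus_ij, h_hat_ij, weight=0.25):
    """One diving step on a single bound interval [lo, hi] of h_ij.

    The branching point is a weighted combination of the M-P entry A+_ij and
    the current value h^_ij (weight = 0.25 corresponds to the 25/75 rule used
    in the experiments below); the half containing A+_ij is kept so that the
    M-P pseudoinverse remains feasible for the restricted relaxation.
    Simplification for this sketch: the paper branches on the interval of the
    lifted variable t_1ij rather than directly on h_ij."""
    split = weight * a_plus_ij + (1.0 - weight) * h_hat_ij
    split = min(max(split, lo), hi)            # keep the split point inside the interval
    return (lo, split) if a_plus_ij <= split else (split, hi)
```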

4 Preliminary Experiments
We present results on an example, as a proof of concept. More detailed computational results will appear in a subsequent publication. We imposed all of P1, P3 and P4. Figure 1 contains two plots. The first shows the increase in ‖H‖_1 as we dive, by iteration. The second shows the decrease in P2 violation as we dive.
Fig. 1. Illustrative results for a randomly generated 10 × 5 matrix A.

Fig. 2. Tradeoff between ‖H‖_1 and violation of P2


Figure 2 indicates the solutions achieved, trading off ‖H‖_1 and P2 violation. The first plot in the figure experiments with different "branching" points, based on weightings of A^+_{ij} and the current value of ĥ_{ij}. We can see that the weighting 25/75 gives the best solutions. The second plot of the figure uses the interval midpoint instead of A^+_{ij}, and we see similar results.

Fig. 3. Tradeoff

Figure 3 shows the results, as a scatter plot, for the best weighting, 25/75. The only difference between the two series in the plot is the selection of the "branching" point, and we can see that there is no significant difference.
Overall, we see that our diving heuristic appears to be an effective means for trading off sparsity against P2 satisfaction.

References
1. Achterberg, T.: Constraint integer programming. Ph.D. thesis, Berlin Institute of
Technology (2007). http://opus.kobv.de/tuberlin/volltexte/2007/1611/
2. Berthold, T.: Primal Heuristics for Mixed Integer Programming. Master’s thesis,
Technische Universität Berlin (2006)
3. Berthold, T.: Heuristics of the branch-cut-and-price-framework SCIP. In: Kalcsics,
J., Nickel, S. (eds.) Operations Research Proceedings 2007, pp. 31–36. Springer,
Berlin (2008)
4. Danna, E., Rothberg, E., Le Pape, C.: Exploring relaxation induced neighborhoods
to improve MIP solutions. Math. Program. Ser. A 102, 71–90 (2005). https://doi.org/
10.1007/s10107-004-0518-7
5. Dokmanić, I., Kolundžija, M., Vetterli, M.: Beyond Moore-Penrose: Sparse pseu-
doinverse. In: ICASSP, vol. 2013, pp. 6526–6530 (2013)
6. Dokmanić, I., Gribonval, R.: Beyond Moore-Penrose Part I: Generalized Inverses
that Minimize Matrix Norms (2017). https://hal.inria.fr/hal-01547283
7. Dokmanić, I., Gribonval, R.: Beyond Moore-Penrose Part II: The Sparse Pseudoin-
verse (2017). https://hal.inria.fr/hal-01547283
8. Eckstein, J., Nediak, M.: Pivot, cut, and dive: a heuristic for 0–1 mixed integer
programming. J. Heuristics 13, 471–503 (2007)
9. Fampa, M., Lee, J.: Efficient treatment of bilinear forms in global optimization
(2018). arXiv:1803.07625
10. Fampa, M., Lee, J.: On sparse reflexive generalized inverse. Oper. Res. Lett. 46(6),
605–610 (2018)
11. Fuentes, V., Fampa, M., Lee, J.: Sparse pseudoinverses via LP and SDP relaxations
of Moore-Penrose. CLAIO 2016, 343–350 (2016)
12. Gerard, D., Köppe, M., Louveaux, Q.: Guided dive for the spatial branch-and-
bound. J. Glob. Optim. 68(4), 685–711 (2017)
13. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. Johns Hopkins University
Press, Baltimore (1996)
14. Penrose, R.: A generalized inverse for matrices. Proc. Camb. Philos. Soc. 51, 406–
413 (1955)
15. Rao, C., Mitra, S.: Generalized Inverse of Matrices and Its Applications. Probabil-
ity and Statistics Series. Wiley (1971)
16. Rohde, C.: Contributions to the theory, computation and application of
generalized inverses. Ph.D. thesis, University of North Carolina, Raleigh,
N.C. (May 1964). https://www.stat.ncsu.edu/information/library/mimeo.archive/
ISMS 1964 392.pdf
17. Xu, L., Fampa, M., Lee, J.: Aspects of symmetry for sparse reflexive generalized
inverses (2019)
Filtering Domains of Factorable Functions
Using Interval Contractors

Laurent Granvilliers(B)

LS2N, Université de Nantes, Nantes, France


laurent.granvilliers@univ-nantes.fr

Abstract. Many theorems in mathematics require a real function to be continuous over the domain under consideration. In particular the Brouwer fixed point theorem and the mean value theorem underlie many interval methods like Newton operators for solving numerical constraint satisfaction problems or global optimization problems. Since the continuity property collapses when the function is not defined at some point, it is important to check whether the function is defined everywhere in a given domain. We introduce here an interval branch-and-contract algorithm that rigorously approximates the domain of definition of a factorable function within a box. The proposed approach mainly relies on interval contractors applied to the domain constraints, and their negations, stemming from the expression of the function.

Keywords: Interval methods · Branch-and-contract algorithm ·


Interval contractor · Constraint satisfaction problem · Paving

1 Introduction

A real function f : D → R^m with D ⊆ R^n is factorable if it can be defined as a finite recursive composition of arithmetic operations and elementary functions, simply called operations thereafter. Given a box Ω ⊆ R^n we study the problem of approximating the intersection Ω ∩ D with interval computations. Our goal is to calculate a paving (X^i, X^o) where X^i is a union of inner boxes and X^o is a union of outer boxes such that

X^i ⊆ Ω ∩ D ⊆ X^i ∪ X^o ⊆ Ω.

Figure 1 shows such a paving computed at a given precision ε > 0. We see that the outer boxes accumulate on the frontier of the set Ω ∩ D, the width of each one, defined componentwise, being smaller than ε.
The problem described above can be defined as a numerical constraint satisfaction problem (CSP) ⟨C, Ω⟩ where C is a set of constraints such that a point x ∈ Ω belongs to D if and only if it satisfies all the constraints from C. It turns out that every operation of f having a restricted domain entails a constraint

Fig. 1. Let f(x_1, x_2) = (√(x_1 x_2 + 1) + 1/(x_1^2 − x_2^2), log(16 − x_1^2 − x_2^2)) be a real function. Figure (a) shows a paving of its domain of definition at precision ε = 0.1 given Ω = R^2, composed of outer boxes depicted in Fig. (b) and inner boxes depicted in Fig. (c).

that must be inserted in C. For example, the function whose paving is depicted in Fig. 1 leads to the set

C = {x_1 x_2 + 1 ≥ 0, x_1^2 − x_2^2 ≠ 0, 16 − x_1^2 − x_2^2 > 0}.

We see that the frontier of its domain of definition is delimited by a hyperbola, a circle and lines, represented by the domain constraints.
A paving of the solution set of a numerical CSP can be computed by an
interval branch-and-contract algorithm that recursively divides and contracts
the initial box until reaching the desired precision, following a contractor pro-
gramming approach [5]. In this framework, an interval contractor is associated
with one or many constraints to tighten a box by removing inconsistent val-
ues from its bounds, using different techniques such as consistency techniques
adapted to numerical computations with intervals [11] or interval Newton oper-
ators [10]. Several contractors can be applied in a fixed-point loop known as
constraint propagation [12]. Finding inner boxes can be done by considering the
negations of the constraints [2] or by means of inflation techniques [4,6].
In the following, we introduce an interval branch-and-contract algorithm that
calculates a paving of the domain of definition of a real function within a given
box. A set of rules is proposed to derive the system of domain constraints entailed
by the expression of the function. We adapt the HC4Revise contractor [3] in
order to process these specific constraints. Finally, a new heuristic for generating
maximal inner boxes is devised. A set of experiments permits evaluating the quality of the computed pavings.
The rest of this paper is organized as follows. Interval arithmetic and the
notion of interval contractor will be introduced in Sect. 2. The new algorithms
will be described in Sect. 3. Section 4 is devoted to the experimental results,
followed by a conclusion.
2 Interval Computations
2.1 Interval Arithmetic
An interval is a closed and connected set of real numbers. The set of intervals
is denoted by I. The empty interval represents an empty set of real numbers.
The width of an interval [a, b] is equal to (b − a). The interval hull of a set of
real numbers S is the interval [inf S, sup S] denoted by hull S. Given an integer
n ≥ 1, an n-dimensional box X is a Cartesian product of intervals X1 × · · · × Xn .
A box is empty if one of its components is empty. The width of a box X is the
maximum width taken componentwise denoted by wid X.
Interval arithmetic is a set extension of real arithmetic [13]. Let g : D → R
be a real function with D ⊆ Rn . An interval extension of g is an interval function
G : In → I such that

(∀X ∈ In ) (∀x ∈ X ∩ D) g(x) ∈ G(X).

This property called the fundamental theorem of interval arithmetic implies that
the interval G(X) encloses the range of g over X. When g corresponds to a
basic operation, it is possible to implement the interval operation in a way to
calculate the hull of the range by exploiting monotonicity properties, limits and
extrema. More complex functions can be extended in several ways. In particular,
the natural interval extension of a factorable function consists of evaluating the
function with interval operations given interval arguments.

2.2 Interval Contractors


Given a vector of unknowns x ∈ Rn , an interval contractor associated with a
constraint c(x) is an operator Γ : In → In verifying the following properties:

Γ (X) ⊇ {x ∈ X : c(x)} (consistency)
(∀X ∈ In )
Γ (X) ⊆ X (contractance)

An interval contractor aims at removing inconsistent values at the bounds of the variable domains. There are many kinds of contractors; we present here the forward-backward contraction algorithm called HC4Revise [3]. Given an equation g(x) = 0 or an inequality constraint g(x) ≤ 0 and a box X, the first phase is an evaluation of the natural extension of g from the leaves to the root. We then consider the interval I associated with the relation symbol, namely [0, 0] for an equation and [−∞, 0] for an inequality. There are three cases: if the intersection G(X) ∩ I is empty then the constraint is inconsistent; if we have the inclusion G(X) ⊆ I then the constraint is consistent and X is an inner box for this constraint, which is said to be inactive; otherwise the second phase propagates projections from the root of g, whose interval is set to G(X) ∩ I, down to the leaves, possibly contracting the variable domains. An example is presented in Fig. 2.
As previously shown, an HC4Revise contractor is able to detect that a box is
an inner box after the first phase. Now it is possible to apply it to the negation
Fig. 2. Let g(x) ≤ 0 be an inequality constraint with g(x1 , x2 , x3 ) = 2x1 + x22 − x3 and
let X be the box [0, 10] × [−5, 5] × [−1, 4]. The interval on the right at each node of g is
the result of the interval evaluation phase of the HC4Revise contractor. The interval at
the root node is intersected with the interval I = [−∞, 0] associated with the relation
symbol. The interval on the left at each node is the result of the projection phase
from the root to the leaves. For example, let u ∈ [−4, 0], v ∈ [0, 20] and w ∈ [−4, 26]
be three variables respectively labelling the + node, the × node and the − node. We
have to project the equation v + w = u over v and w, which propagates the new
domain at the root node to its children nodes. To this end the equation is inverted
and it is equivalently rewritten as v = u − w. The new domain for v is calculated as
[0, 20] ∩ ([−4, 0] − [−4, 26]), which leads to the new domain [0, 4] at the × node. The
new domain for w is derived similarly. At the end of this backward phase it comes the
new box [0, 2] × [−2, 2] × [0, 4].

of a constraint in order to generate inner boxes inside a box, as follows. Given an inequality constraint g(x) ≤ 0, let Γ be an HC4Revise contractor associated with its negation g(x) > 0. Given a box X, it follows from the consistency property of Γ that every element of the region X \ Γ(X) violates the constraint negation, hence satisfies the constraint itself. When this region is non-empty, it is possible to generate inner boxes for the constraint, as shown in Fig. 3. Since the rounding errors of machine computations generally prevent dealing with open intervals, the constraint negation is safely relaxed as g(x) ≥ 0.
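To illustrate the two phases, here is a hand-coded sketch of the forward-backward contraction on the constraint of Fig. 2, with intervals as plain (lo, hi) pairs and no outward rounding; it reproduces the contracted box [0, 2] × [−2, 2] × [0, 4] of the example but is not the Realpaver implementation.

```python
# Hand-coded sketch of HC4Revise on the constraint 2*x1 + (x2^2 - x3) <= 0 from Fig. 2.

def iadd(a, b): return (a[0] + b[0], a[1] + b[1])
def isub(a, b): return (a[0] - b[1], a[1] - b[0])
def imul(a, b):
    p = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(p), max(p))
def isqr(a):
    lo, hi = a
    if lo >= 0: return (lo*lo, hi*hi)
    if hi <= 0: return (hi*hi, lo*lo)
    return (0.0, max(lo*lo, hi*hi))
def inter(a, b): return (max(a[0], b[0]), min(a[1], b[1]))
def isqrt_sym(a):   # hull of {t : t^2 in a}, assuming a has a non-negative upper bound
    hi = a[1] ** 0.5
    return (-hi, hi)

def hc4revise(X1, X2, X3):
    # forward phase: evaluate the natural extension node by node
    v = imul((2.0, 2.0), X1)                 # x (times) node: v = 2*x1
    s = isqr(X2)                             # sqr node: s = x2^2
    w = isub(s, X3)                          # - node: w = s - x3
    u = iadd(v, w)                           # + node (root): u = v + w
    u = inter(u, (float('-inf'), 0.0))       # intersect with the relation interval
    if u[0] > u[1]:
        return None                          # empty: the constraint is inconsistent on X
    # backward phase: project the tightened root back onto the leaves
    v = inter(v, isub(u, w)); w = inter(w, isub(u, v))      # from u = v + w
    s = inter(s, iadd(w, X3)); X3 = inter(X3, isub(s, w))   # from w = s - x3
    X1 = inter(X1, imul((0.5, 0.5), v))                     # from v = 2*x1
    X2 = inter(X2, isqrt_sym(s))                            # from s = x2^2
    return X1, X2, X3

print(hc4revise((0.0, 10.0), (-5.0, 5.0), (-1.0, 4.0)))
# contracts towards ([0, 2], [-2, 2], [0, 4]) as in Fig. 2
```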

3 Filtering Domains of Functions


3.1 Domain Constraints
Several operations have restricted domains such as the division x → x−1 defined
in R \ {0}, the square root defined in R+ , the logarithm defined in (0, +∞),
the arccosine and arcsine functions defined in [−1, 1] and the tangent function
defined at every point that is not a multiple of π/2. A factorable function whose
definition involves one of these operations may not be defined everywhere in
a box, and, a fortiori, it may not be continuous. It naturally yields domain
constraints that must be verified, as illustrated by Fig. 4.
Fig. 3. Let c be the inequality constraint x21 + x22 ≤ 4 that defines a disk in the
Euclidean plane and let X be the box [0, 4] × [−1, 1]. The hatched surface is returned
by an HC4Revise contractor Γ associated with the negation of c. The gray region
X \ Γ (X) is thus an inner region for c (every point satisfies c) and it is a box here.

Every term op(u_1, . . . , u_k) occurring in a factorable real function f : D ⊆ R^n → R^m such that the domain of op is a strict subset of R^k entails a constraint. A constraint system C can then be generated from f using the following (non-exhaustive) set of rules. There are different kinds of constraints, such as (strict and non-strict) inequality constraints and disequations. The algorithms introduced thereafter will also consider their negations.

√u       |=  u ≥ 0
1/u      |=  u ≠ 0
log u    |=  u > 0
acos u   |=  −1 ≤ u ≤ 1
asin u   |=  −1 ≤ u ≤ 1
tan u    |=  u ≠ π/2 + kπ   (k ∈ Z)
Given a box Ω ⊆ R^n, every x ∈ Ω satisfying all the constraints from C must belong to D. Finding the set Ω ∩ D is then equivalent to solving the numerical CSP ⟨C, Ω⟩. It is worth noting that the set C may be separable. For example, the function f(x_1, x_2) = log(x_1) + x_2^{-1} entails two constraints x_1 > 0 and x_2 ≠ 0 sharing no variable, which can thus be handled separately.
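A small sketch of this constraint-generation step, under the assumption that expressions are encoded as nested tuples ('op', child, ...); the encoding and the rule table are illustrative, not the representation used by the solver.

```python
# Illustrative sketch: walk an expression tree given as nested tuples
# ('op', child, ...) and emit the domain constraints entailed by the
# restricted operations listed above.

RULES = {
    'sqrt': lambda u: [('>=', u, 0)],
    'log':  lambda u: [('>', u, 0)],
    'inv':  lambda u: [('!=', u, 0)],
    'acos': lambda u: [('>=', u, -1), ('<=', u, 1)],
    'asin': lambda u: [('>=', u, -1), ('<=', u, 1)],
}

def domain_constraints(expr, out=None):
    """Collect the domain constraints of a factorable expression."""
    if out is None:
        out = []
    if isinstance(expr, tuple):
        op, *children = expr
        if op in RULES:
            out.extend(RULES[op](children[0]))
        for child in children:
            domain_constraints(child, out)
    return out

# f(x1, x2) = log(x1) + 1/x2  entails  x1 > 0  and  x2 != 0
f = ('+', ('log', 'x1'), ('inv', 'x2'))
print(domain_constraints(f))   # [('>', 'x1', 0), ('!=', 'x2', 0)]
```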


Fig. 4. The function f(x) = √(x^2 − x) is undefined in the open interval (0, 1) since the square root is defined in R_+ and g(x) = x^2 − x is negative for all x such that 0 < x < 1. The restricted domain of the square root entails the constraint x^2 − x ≥ 0.
3.2 Branch-and-Contract Algorithm
Algorithm 1 implements a classical interval branch-and-contract algorithm that calculates a paving of the domain of definition of a function f within a given box Ω. It maintains a list L from which a CSP ⟨C, X⟩ is extracted at each iteration. This CSP is reduced and divided by two algorithms, contract and branch, that are specific to our problem. If the set C becomes empty then X is inserted in the set of inner boxes X^i. A tolerance ε > 0 permits us to stop processing too-small boxes, which are inserted in the set of outer boxes X^o.

Algorithm 1. Branch-and-contract algorithm.

Input:  – a function f : D → R^m with D ⊆ R^n
        – a box Ω ⊆ R^n
        – a tolerance ε > 0
Output: – a paving (X^i, X^o) of Ω ∩ D at tolerance ε
Algorithm:
    generate the set of domain constraints C from f
    initialize L with the CSP ⟨C, Ω⟩
    assign (X^i, X^o) to the empty paving
    while L is not empty do
        extract an element ⟨C, X⟩ from L
        contract ⟨C, X⟩
        if X ≠ ∅ then
            if C = ∅ then insert X in X^i
            elif wid X ≤ ε then insert X in X^o
            else branch ⟨C, X⟩
            endif
        endif
    endwhile

Given a CSP ⟨C, X⟩, a contractor is associated with each constraint from the set C. The contract component classically implements a constraint propagation algorithm that applies the contractors to reduce X until reaching a fixed point. Moreover, every constraint detected as inactive is removed from C. The HC4Revise contractor has been designed to handle non-strict inequality constraints and equations, since it is not possible to manage open intervals in general due to rounding errors. The more specific domain constraints are handled as follows. Let g(x) be a real function, let G be the natural interval extension of g and let X be a box.
– A strict inequality constraint g(x) > 0 is safely relaxed as g(x) ≥ 0, since every point that violates the relaxation also violates the constraint.
– A disequation g(x) ≠ 0 is violated if the interval G(X) is reduced to 0; it is inactive if we have max G(X) < 0 or min G(X) > 0, and nothing happens in the backward phase otherwise.
– A double inequality constraint a ≤ g(x) ≤ b is simply handled by setting the interval G(X) ∩ [a, b] at the root node after the first phase.
– The periodic domain constraint g(x) ≠ π/2 + kπ for some integer k does not permit contracting X. The constraint is simply detected as inactive if G(X) does not contain π/2 + kπ for any k, and nothing happens otherwise.
The branch algorithm divides a CSP ⟨C, X⟩ into sub-problems. Let C be the set of constraints {c_1, . . . , c_p}. A contractor Γ_i is associated with the negation of c_i for i = 1, . . . , p. Each contractor is applied to X and it follows that the region

X \ ∪_{i=1}^{p} Γ_i(X)

is an inner region for the CSP, which means that every point of this region satisfies all the constraints from C, as illustrated in Fig. 5.

Fig. 5. A box X is contracted by three contractors Γ1 , Γ2 , Γ3 associated with constraint


negations, leading to the hatched boxes in Fig. (a). The complementary gray region is
an inner region for the original constraints. Figure (b) shows that X can be split as
two boxes X 1 ∪ X 2 where X 1 is the largest inner slice at one bound of X.

We then define the branching heuristic as follows. Let the box

H = hull ∪_{i=1}^{p} Γ_i(X)

be the interval hull of the contracted boxes with respect to the constraint negations. If H is empty then X is an inner box and it is inserted in X^i. Now suppose that H is not empty. Let d_i^− = min H_i − min X_i and d_i^+ = max X_i − max H_i be the inter-bound distances between X and H for i = 1, . . . , n. Let

d = max{d_1^−, . . . , d_n^−, d_1^+, . . . , d_n^+}

be the maximum inter-bound distance. If d is greater than the tolerance ε then there exists an inner box at one bound of X that is large enough. Assuming for instance that d = d_j^− for some j, X is split into two sub-boxes X^i ∪ X^o at x_j = min H_j. The maximal inner box X^i is directly inserted in the set of inner boxes X^i and the CSP ⟨C, X^o⟩ is inserted in L. Otherwise, a bisection of the largest component of X generates two sub-boxes X′ ∪ X′′ and the CSPs ⟨C, X′⟩ and ⟨C, X′′⟩ are added to L, which ensures the convergence of the branch-and-contract algorithm.
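A sketch of this branching heuristic, assuming NumPy and representing boxes as (n, 2) arrays of [lo, hi] rows; the function name, the handling of empty contracted boxes and the return convention are illustrative simplifications.

```python
import numpy as np

def branch_with_inner_slice(X, contracted, eps):
    """X: (n, 2) box; contracted: non-empty boxes Gamma_i(X) from the negations."""
    if not contracted:
        return 'inner', X, None                   # no negation survives: X is an inner box
    # interval hull of the contracted boxes, componentwise
    H = np.column_stack([np.min([B[:, 0] for B in contracted], axis=0),
                         np.max([B[:, 1] for B in contracted], axis=0)])
    d_lo = H[:, 0] - X[:, 0]                      # inner slice widths at the lower bounds
    d_hi = X[:, 1] - H[:, 1]                      # inner slice widths at the upper bounds
    dists = np.concatenate([d_lo, d_hi])
    k = int(np.argmax(dists))
    if dists[k] <= eps:                           # no large enough inner slice: bisect
        j = int(np.argmax(X[:, 1] - X[:, 0]))
        mid = 0.5 * (X[j, 0] + X[j, 1])
        Xa, Xb = X.copy(), X.copy()
        Xa[j, 1], Xb[j, 0] = mid, mid
        return 'bisect', Xa, Xb
    j, at_lower = k % len(X), k < len(X)
    Xi, Xo = X.copy(), X.copy()                   # Xi: maximal inner slice, Xo: remainder
    if at_lower:
        Xi[j, 1] = H[j, 0]; Xo[j, 0] = H[j, 0]
    else:
        Xi[j, 0] = H[j, 1]; Xo[j, 1] = H[j, 1]
    return 'slice', Xi, Xo
```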

4 Experimental Results

The interval branch-and-contract algorithm has been developed in the interval solver Realpaver [9]. The interval operations are safely implemented with an outward rounding mode, the MPFR library [7] providing the elementary functions with correct rounding. As a consequence, the interval computations in Realpaver are rigorous. All experiments were conducted on a 64-bit Intel Core i7-4910MQ 2.90 GHz processor.
Three strategies will be compared in the following: S3 corresponds to Algo-
rithm 1, S2 mimics S3 but the split component always bisects the largest com-
ponent (no inner box is computed) and S1 corresponds to S2 except that the
backward phase of the HC4Revise contractors is disabled in the contract compo-
nent (only a satisfaction test is done). The quality of a paving can be measured
by its cardinality (#X i , #X o ) and the volume of X i .
The introductory problem has been processed by S3, S2 and S1 given ε = 0.1, and we respectively obtain pavings with cardinalities (330, 646), (738, 736) and (570, 696). There is about the same number of outer boxes, which are required to enclose the frontier of the domain of definition at tolerance ε. However, the sets of inner boxes depicted in Fig. 6 differ considerably. S3 is able to calculate a small number of maximal inner boxes compared to S2 and S1. S1 generates a regular paving (a quadtree of boxes here). S2 is able to contract every box, which here leads to an increased number of inner boxes.

Fig. 6. The sets of inner boxes X^i computed by the three strategies for the introductory problem at tolerance ε = 0.1: 330 boxes for S3, 738 for S2 and 570 for S1. Their total areas are respectively equal to 30.38, 29.95 and 29.74.
Another function involving an arccosine, a square root and a division has been handled, and the pavings computed by the three strategies at precision ε = 0.01 are depicted in Fig. 7. Their cardinalities are respectively equal to (1147, 3374), (3187, 3558) and (1896, 2871). The areas of the sets of inner boxes are respectively equal to 6.962, 6.933 and 6.918. Once again, S3 generates the best paving, with only 1147 inner boxes covering a total area equal to 6.962. S2 derives a paving with too many boxes compared to the other strategies, but the area covered by its inner boxes, 6.933, is slightly larger than the one obtained from S1, equal to 6.918.

Fig. 7. Given the real function f(x_1, x_2) = acos(x_2 − x_1^2) + 1/√(x_1 + x_2) and the box Ω = [−5, 5]^2, the figures above depict the pavings obtained from S3, S2 and S1 using the interval branch-and-contract algorithm applied to the CSP ⟨C, Ω⟩ given the set of domain constraints C = {−1 ≤ x_2 − x_1^2 ≤ 1, x_1 + x_2 > 0} generated from f.

These experiments suggest that combining the detection of maximal inner boxes with branching is efficient. On the one hand, this strategy tends to maximize the volume of the set of inner boxes. On the other hand, no more than two sub-boxes are generated at each branching step, which tends to minimize the number of boxes explored during the search.

5 Discussion and Perspectives


We have presented an interval branch-and-contract algorithm that rigorously
calculates a paving of the domain of definition of a factorable real function. An
inner box is a guarantee for interval tests and interval operators that require
the continuity property, as motivated in [8] in the context of bound-constrained
global optimization. For inclusion in other methods, it could be interesting to extract from our work a domain contractor that returns the union of an inner box included in the domain of definition of the function and an outer box.
The problem studied in this paper has been taken into account by the recent
IEEE 1788 standard for interval arithmetic [1]. This standard proposes to dec-
orate the intervals with different flags including a dac flag ensuring that an
operation is defined and continuous on the given domain. Implementing a solver


on top of an IEEE 1788 compliant interval arithmetic library could then be useful
to assert that the result of an interval evaluation has the required property.
In the future, we plan to experiment with several inflation techniques [4,6] and to compare them with the currently implemented method based on the constraint negations. It could be interesting to investigate other branching heuristics and to
associate suitable interval contractors with the domain constraints, for instance
contractors enforcing strong consistency techniques when those constraints are
complex with many occurrences of variables.

Acknowledgment. The author would like to thank Christophe Jermann for interest-
ing discussions about these topics and his careful reading of a preliminary version of
this paper.

References
1. IEEE Std 1788-2015: IEEE Standard for Interval Arithmetic (2015)
2. Benhamou, F., Goualard, F.: Universally quantified interval constraints. In: Pro-
ceedings of International Conference on Principles and Practice of Constraint Pro-
gramming (CP), pp. 67–82 (2000)
3. Benhamou, F., Goualard, F., Granvilliers, L., Puget, J.F.: Revising hull and box
consistency. In: Proceedings of International Conference on Logic Programming
(ICLP), pp. 230–244 (1999)
4. Chabert, G., Beldiceanu, N.: Sweeping with continuous domains. In: Proceedings
of International Conference on Principles and Practice of Constraint Programming
(CP), pp. 137–151 (2010)
5. Chabert, G., Jaulin, L.: Contractor programming. Artif. Intell. 173(11), 1079–1100
(2009)
6. Collavizza, H., Delobel, F., Rueher, M.: Extending consistent domains of numeric
CSP. In: Proceedings of International Joint Conference on Artificial Intelligence
(IJCAI), pp. 406–413 (1999)
7. Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., Zimmermann, P.: MPFR: a
multiple-precision binary floating-point library with correct rounding. ACM Trans.
Math. Softw. 33(2) (2007)
8. Granvilliers, L.: A new interval contractor based on optimality conditions for bound
constrained global optimization. In: Proceedings of International Conference on
Tools with Artificial Intelligence (ICTAI), pp. 90–97 (2018)
9. Granvilliers, L., Benhamou, F.: Algorithm 852: realpaver: an interval solver using
constraint satisfaction techniques. ACM Trans. Math. Softw. 32(1), 138–156 (2006)
10. Hentenryck, P.V., McAllester, D., Kapur, D.: Solving polynomial systems using a
branch and prune approach. SIAM J. Numer. Anal. 34(2), 797–827 (1997)
11. Lhomme, O.: Consistency techniques for numeric CSPs. In: Proceedings of Inter-
national Joint Conference on Artificial Intelligence (IJCAI), pp. 232–238 (1993)
12. Mackworth, A.K.: Consistency in networks of relations. Artif. Intell. 8, 99–118
(1977)
13. Moore, R.E.: Interval Analysis. Prentice-Hall (1966)
Leveraging Local Optima Network
Properties for Memetic Differential
Evolution

Viktor Homolya and Tamás Vinkó(B)

Department of Computational Optimization, University of Szeged, Szeged, Hungary


{homolyav,tvinko}@inf.u-szeged.hu

Abstract. Population-based global optimization methods can be extended by properly defined networks in order to explore the structure of the search space, to describe how the method performed on a given problem, and to inform the optimization algorithm so that it can be more efficient. The memetic differential evolution (MDE) algorithm using a local optima network (LON) is investigated with respect to these aspects. Firstly, we report the performance of the classical variants of differential evolution applied to MDE, including the structural properties of the resulting LONs. Secondly, a new restarting rule is proposed which aims at avoiding early convergence and uses the LON built up during the evolutionary search of MDE. Finally, we show the promising results of this new rule, which contributes to the efforts of combining optimization methods with network science.

Keywords: Global optimization · Memetic differential evolution ·


Local optima network · Network science

1 Introduction
Consider the global optimization problem

min_{x ∈ D ⊂ R^n} f(x),   (1)

where f is a continuous function, and which we aim to solve using memetic differential evolution (MDE) [10]. Recent benchmarking results [1,5] show the promising efficiency of MDE on challenging optimization problems. Differential evolution (DE) is a well-known iterative, population-based algorithm [12] using only the function value of f as information. Memetic approaches use a local optimization method in each and every iteration; hence the population members are always local optima of the objective function [7,9]. MDE is a simple extension of DE; the formal description of the algorithm is the following.

1. Start with a random population {p1 , . . . , pm } (pi ∈ Rn ).


2. For each pi iterate until the stopping conditions hold:
(a) Select three pairwise different elements from the population: pj , pk , pl ,
all different from pi .
(b) Let c = pj + F · (pk − pl ) be a candidate solution.
(c) Modify vector c applying a CR-crossover using vector pi .
(d) Execute a local search from vector c.
(e) Replace vector pi with vector c if f (c) ≤ f (pi ) holds.

As can be seen, MDE has some parameters: m is the population size, F ∈ (0, 2) is the differential weight and CR ∈ (0, 1) is the crossover probability. In Step 2(c) the CR-crossover for the candidate solution c ∈ R^n means that for each dimension of c a number r is generated uniformly at random in (0, 1). If r > CR then that dimension of c is made equal to the same dimension of p_i. To guarantee obtaining a new vector c, the CR-crossover is skipped for one randomly selected dimension, so the linear combination of the three other vectors is kept in that dimension.
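As an illustration, one generation of MDE following the description above (random base vector, one difference vector, the CR-crossover of Step 2(c)) might look as follows; the local search is a stand-in (SciPy's Nelder-Mead), since the local solver is not specified here, and recording parent-child edges for the LON follows one possible convention.

```python
import numpy as np
from scipy.optimize import minimize

def mde_step(pop, f, bounds, F=0.7, CR=0.9, rng=None):
    """One MDE generation. pop: (m, n) array of local optima; bounds: (n, 2) box.
    F and CR are example values from the admissible ranges given above."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = pop.shape
    edges = []
    for i in range(m):
        j, k, l = rng.choice([t for t in range(m) if t != i], size=3, replace=False)
        c = pop[j] + F * (pop[k] - pop[l])                  # mutation (Step 2(b))
        keep = rng.random(n) > CR                           # CR-crossover (Step 2(c))
        keep[rng.integers(n)] = False                       # always keep one mutated dim
        c[keep] = pop[i, keep]
        c = np.clip(c, bounds[:, 0], bounds[:, 1])
        res = minimize(f, c, method='Nelder-Mead')          # local search (Step 2(d))
        c = np.clip(res.x, bounds[:, 0], bounds[:, 1])
        if f(c) <= f(pop[i]):                               # greedy replacement (Step 2(e))
            # LON edges from target/base/difference parents to the new child
            # (recorded here only for accepted children, as one possible convention)
            edges += [(tuple(pop[p]), tuple(c)) for p in (i, j, k, l)]
            pop[i] = c
    return pop, edges
```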
Our contributions can be summarized as follows. First, we numerically inves-
tigate the classical x /y/z variants in the context of MDE. Then the MDE algo-
rithm gets extended by the concept of local optima network (LON). In general,
LONs are graphs, in which the nodes correspond to local optima of the opti-
mization problem and the edges represent useful information either related to
the problem (e.g. critical points of f ) or to the optimization method in use. Simi-
larly to our earlier work [4], the directed edges of MDE LONs are formed in such
a way that they represent parent-child relation. Thus at the end of the MDE run,
we obtain a graph representation of how the method discovered the landscape
of the optimization problem. Apart from the standard performance metrics, we
also report and compare certain characteristics of the resulting LONs using some
global metrics. One of the detailed analyses shows the relation between the function values of nodes and the function values of their out-neighbors. Based on this and some graph properties we propose an extension to MDE which can lead to better performance on the test functions used in this paper.

2 Definitions

2.1 Strategies

The most popular DE variants which apply different strategies are distinguished
by the notation DE/x/y/z, where

– x specifies the solution to be perturbed, and it can be either rand or best, i.e.,
a random one or the current best solution.
In the above algorithm description it defines the way to choose pj in Step
2(a).

– y specifies the number of difference vectors (i.e. the difference between two
randomly selected and distinct population members) to be used in the per-
turbation done in Step 2(b), and its typical values are either 1 or 2.
The choice y = 1 is considered the default, and hence Steps 2(a) and 2(b) are as already given in the description. In case y = 2, besides pk and pl, two further vectors, pm and pn, are also selected in order to create another difference vector.
– z identifies which probability distribution function is used by the crossover operator: either bin (binomial) or exp (exponential).
In bin a dimension index d is chosen at random. In Step 2(c) the vector c is modified so that for every index e ≠ d we let c_e := p_{i,e} with probability 1 − CR, as described for Step 2(c) above; a sketch of the exponential variant is given after this list.
In exp a dimension index d is chosen at random. Starting from d, the dimensions e are stepped over one after another and c_e is set to p_{i,e}; at every step the modification stops with probability 1 − CR.
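The binomial rule corresponds to the inline crossover in the sketch given after the algorithm description; for contrast, the exponential variant could look as follows. Again, this is only a hedged illustration of the description above, not the authors' code.

import numpy as np

def crossover_exp(c, p_i, CR, rng):
    # Exponential crossover as described above: starting from a random index d,
    # consecutive components of c are overwritten by p_i; after each step the
    # modification stops with probability 1 - CR.
    n = len(c)
    d = rng.integers(n)
    for step in range(n):
        e = (d + step) % n
        c[e] = p_i[e]
        if rng.random() > CR:
            break
    return c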

2.2 Local Optima Network


As it was already briefly described in the Introduction, given problem (1) a
local optima network (LON) is a graph in which the vertices are local optima
of function f and the edges are defined between vertices separated by a critical
point [13]. It is important to note that other kinds of LONs can also be introduced, which then depend specifically on the optimization method in use as well. In our
work two vertices (local optimizers) are connected if they are in parent-child
relation, i.e., the parent vertex is the target vector, the base vector or a member
from the differential vector(s) and the child vertex is the result of the MDE
iteration with the mentioned vectors. The edges are directed to the children.
Loops are allowed, and the LON can be weighted to represent multi-edges.
Another possibility has been developed and analyzed in [11] for DE in which
the nodes are the population members and the weighted edges also represent
parent-child relation. However, the resulting network captures the evolution of
the population members, rather than the detection of the local optima.
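A LON of this kind can be accumulated during the run with a few lines of NetworkX; the sketch below shows one possible bookkeeping scheme. Rounding the coordinates to identify coinciding local optima is our assumption for illustration, not something prescribed by the paper, and parent nodes are assumed to have been recorded (with their function values) when they themselves entered the LON.

import networkx as nx

def record_step(lon, parents, child, f_child, ndigits=6):
    # Add one parent-child step of MDE to the weighted, directed LON.
    child_key = tuple(round(v, ndigits) for v in child)
    lon.add_node(child_key, fval=f_child)
    for parent in parents:            # target, base and difference-vector members
        parent_key = tuple(round(v, ndigits) for v in parent)
        if lon.has_edge(parent_key, child_key):
            lon[parent_key][child_key]["weight"] += 1
        else:
            lon.add_edge(parent_key, child_key, weight=1)

lon = nx.DiGraph()   # loops are allowed; multi-edges are encoded as weights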

2.3 Network Measures


It is expected that different MDE variants lead to different LONs at the end of their runs. In order to characterize these differences we use global measures describing the entire graph, which are the following.
– The number of nodes (N ) and edges (M );
– the diameter (D) is the length of the longest of all directed shortest paths;
– and the average degree (d) (the average in-degree is equal to the average
out-degree).
A larger N value means more local optima found. The diameter corresponds to the maximal number of times Step 2(e) gets fulfilled for a given population member. Finally, regarding the average degree, d < 3.5 for the y = 1 variants and d < 5 for the y = 2 variants is an indication of early convergence.
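With the LON stored as a NetworkX digraph, these four global measures can be computed as in the following sketch; the diameter is taken over the finite directed shortest paths, since the LON is typically not strongly connected.

import networkx as nx

def lon_measures(lon):
    N = lon.number_of_nodes()
    M = lon.number_of_edges()
    # length of the longest of all (finite) directed shortest paths
    D = max(max(lengths.values())
            for _, lengths in nx.shortest_path_length(lon))
    d = M / N                    # average in-degree equals average out-degree
    return N, M, D, d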

3 Benchmarking the Classic Variants


The MDE and the LON creator and analyzer were implemented in Python with the Pyomo [3] and NetworkX [2] packages. The local solver in MDE was MINOS [8].

3.1 Test Functions


Following the numerical experiments done in [1,5] we tested the MDE variants
on two test functions:
– Rastrigin:
      f_R(x) = 10n + Σ_{i=1}^{n} (x_i^2 − 10 cos(2πx_i)),   x ∈ [−5.12, 5.12]^n,
  which is a single-funnel function with 10^n local minimizers, and its global minimum value is 0.
– Schwefel:
      f_S(x) = Σ_{i=1}^{n} −x_i sin(√|x_i|),   x ∈ [−500, 500]^n,
  which is a highly multi-funnel function with 2^n funnel bottoms, and its global minimum value is −418.98129n.
In fact, we used modified versions of these functions, namely we applied shifting and rotation to Rastrigin: fR(W(x − x̄)), and rotation to Schwefel: fS(W(x)), where W is an n-dimensional orthogonal matrix and x̄ is an n-dimensional shift vector. These transformations result in even more challenging test functions, as they are non-separable and their global minimizer points do not lie in the center of the search space (as in the original versions).
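For reference, the two test functions and the shift/rotation wrapper can be written down directly; how the orthogonal matrix W and the shift vector are generated below is our assumption for illustration only, since the paper does not specify it.

import numpy as np

def rastrigin(x):
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def schwefel(x):
    return np.sum(-x * np.sin(np.sqrt(np.abs(x))))

def transformed(f, W, shift=None):
    # f(W(x - shift)); with shift=None only the rotation is applied
    def g(x):
        y = x if shift is None else x - shift
        return f(W @ y)
    return g

n = 20
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random orthogonal matrix
x_shift = rng.uniform(-2.0, 2.0, n)                # a random shift vector
fR = transformed(rastrigin, W, x_shift)            # shifted and rotated Rastrigin
fS = transformed(schwefel, W)                      # rotated Schwefel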

3.2 Performance Metrics


After fixing the shift vectors and the rotation matrices for the test functions we
executed K = 50 independent runs for all MDE variations. The performance
metrics used to compare their efficiency are the following.
– S is the percentage of success, i.e. how many times we reached the global
minimizer;
– ‘Best’ is the best function value found out of K runs;
– ‘Avg’ is the average of function values;
– ‘Adf’ is the average distance between the found function value and the global
optimum value in those runs where a failure occurred [5];
– ‘LS’ is the average number of local searches per successful run;
– ‘SP’ is the success performance [1], which is calculated as
      mean(# local searches over successful runs) × K / (# successful runs).
Note that for all metrics except S, a lower number indicates better performance.

3.3 Stop Conditions


The following stopping conditions were used:
– the sum of pairwise differences of the current population members’ values is less than 10^{-4};
– the population members have not been replaced over the last 100 iterations;
– the best value found did not change during the last 20,000 local searches (# iterations × m, where m is the population size).

3.4 Results
As it was already mentioned, we executed K = 50 independent runs for every
variant. Both the dimension and the population size were fixed to 20. The MDE
parameters were set up as F = 0.5 and CR = 0.1 in all experiments.
For the Rastrigin function the tested strategies resulted in different performance metrics, as can be seen in Table 1. The most successful is the rand/2/bin variant, as it was able to find the global optimum in all cases. Overall the rand/y/z strategies did quite well, except for rand/1/exp, which resulted in the highest SP value. Among the best/y/z ones, best/2/bin got the highest success rate and the lowest SP value, whereas best/1/exp did not succeed at all. Regarding the LONs we can notice that the x/2/z strategies led to larger graphs, as expected. This is a clear indication that these versions discover wider regions during the optimization runs. Note that larger LONs, such as those of rand/2/z, have not resulted in larger diameters. The small LONs of the best/1/z strategies and their low average degree are evidence of early convergence to local optima.

Table 1. Performance and graph metrics for rotated and shifted Rastrigin-20

Rule S Best Avg Adf LS SP N M D d


best/1/bin 4 0 6.10 6.35 490 12250 358.5 1345.7 10.8 3.74
best/1/exp 0 1.98 10.51 10.5 ∞ ∞ 224.3 837.3 9.24 3.72
best/2/bin 44 0 0.73 1.31 1462.7 3324 1370.5 7809.7 12.52 5.69
best/2/exp 14 0 1.48 1.72 1042.8 7449 879.6 5002 12.02 5.68
rand/1/bin 54 0 0.69 1.51 1938.5 3590 1721.8 6954.0 14.38 4.03
rand/1/exp 12 0 2.54 2.88 1393 11611 1106.1 4504.6 14.22 4.07
rand/2/bin 100 0 0 0 6325 6325 6203.7 37212.3 14 5.99
rand/2/exp 92 0 0.09 1.24 3964.3 4309 3817.6 22901.3 13.38 5.99

As it was expected the Schwefel problem turned out to be much more chal-
lenging for the MDE versions, see Table 2. Only three out of eight strategies were
able to find the global optimum at least once. For this function rand/2/bin has
the largest success rate and the lowest Adf and SP values, being essentially better than any other variant. However, the relatively good performance of rand/2/bin
is related to the highest number of nodes and edges in its LONs, hence it spends
considerably more computational time than the others. An overall observation

Table 2. Performance and graph metrics for rotated Schwefel-20

rule S Best Avg Adf LS SP N M D d


best/1/bin 0 −7905.9 −7371.6 1007.9 ∞ ∞ 176.1 633.3 7.1 3.57
best/1/exp 0 −8142.7 −7204.9 1174.6 ∞ ∞ 100 341.7 5.9 3.39
best/2/bin 0 −8261.2 −7886.8 492.8 ∞ ∞ 1857.5 10573.4 7.5 5.64
best/2/exp 0 −8024.3 −7629.7 749.9 ∞ ∞ 821.7 4658.6 7.1 5.62
rand/1/bin 2 −8379.6 −7875.2 514.6 3520 176000 2186.8 8676.4 10.8 3.97
rand/1/exp 0 −8024.3 −7639.5 740.1 ∞ ∞ 931.7 3754.9 10.1 4.03
rand/2/bin 20 −8379.6 -8202.2 221.7 15408 77040 14946.1 88927.2 9.8 5.94
rand/2/exp 4 −8379.6 −8114.2 276.4 5530 138250 8763.3 52254.4 9.6 5.96

is that the diameters are certainly lower for the Schwefel problem than for the
Rastrigin. On the other hand, the average degree values are very similar for the
two problems.

4 MDE Supported by Network Analysis


Apart from reporting the LONs and analyzing their basic characteristics, we aim at extending the MDE algorithm with rules exploiting network properties, which provide a rich amount of information about how the optimization method executed. There are many possibilities to do so; here we report on one of them, which turns out to be useful for guiding MDE towards better performance. Based on the analysis reported below we can propose a modified version of MDE.

Fig. 1. Function values of out-neighbors for fS with n = 20; the most successful runs
for: best/1/bin (left) and rand/1/bin (right)

During the MDE run the corresponding LON gets built-up and it is possible
to store the function values of the nodes. We can investigate the out-neighbors
of node u and compare their function values against u. Figure 1 contains two
plots of this kind, showing two different runs of two MDE variants. The x-axis
contains the function values of LON nodes with positive out-degree. Each dot

shows the function values of the out-neighbors. The straight line helps us to notice the number of neighbors with higher and lower function values for each node. Having more dots above the line indicates that the MDE variant created more children with worse function values from a given node. A side effect of this behavior is a wider discovery of the search space, which can be quite beneficial, especially on multi-funnel functions such as Schwefel.
Although the rand/1/bin variant resulted in much larger LONs than the
best/1/bin ones, Fig. 1 clearly shows that rand/1/bin has relatively many more dots above the line than below. For the other rand/y/z variants we obtained
similar figures, and we know from Table 2 that some of these variants were able
to find the global minimizer. On the other hand, best/1/bin got stuck in a local
minimizer point and from the plot we can see the sign of greedy behavior.
The fact that more successful variants can show similar behavior for the single-funnel Rastrigin function is shown in Fig. 2. Greedy behavior for this function could lead to better performance; nevertheless, even the most successful run (in terms of the best function value reached) of best/1/exp converged to a local minimizer (left-hand side of Fig. 2).

Fig. 2. Function values of out-neighbors for fR with n = 20; the most successful runs
for: best/1/exp (left) and rand/1/exp (right)

Based on these observations we are ready to propose an extension to MDE


using LON.

4.1 Above-Below Rule


To avoid the early converge we propose a restart-rule to be applied some mem-
bers of the population. Only the ones which generated the convergence are the
problems while the MDE did not explore enough parts of the search space, so
the nodes which have more neighbors below the line. Remove these members
from the population and add new random ones. Permanent restarting would
prevent the convergence, so the restart is applied only in every α-th iteration
of the MDE. We noticed that when the diameter of the LON is high enough,

the population has visited a fairly large part of the space, so it has a good chance to converge to the global optimum if we use the MDE without this modification.
We propose to extend the MDE algorithm in its Step 2 with the following
rule, which has three integer parameters, δ > 0, α > 0 and θ ≤ 0. If the diameter
of the current LON is lower than δ then in every α-th iteration for all pi do the
following:
– collect the out-neighbors of pi into the set Ni ,
– calculate the function values of the elements of Ni ,
– let Nia := {q ∈ Ni : f(q) > f(pi)} and Nib := {q ∈ Ni : f(q) < f(pi)};
– if |Nia | − |Nib | < θ then replace pi by a newly generated random vector.
Note that function values of the nodes are stored directly in the LON, so prac-
tically they need to be calculated only once.
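A sketch of the rule, assuming the kind of LON bookkeeping shown in Sect. 2 (every node carries its function value in the attribute fval, keys[i] is the LON node of population member i) and a box-shaped search domain, could look like this; names and the box bounds are our illustrative assumptions.

import numpy as np

def above_below_restart(pop, lon, keys, diameter, iteration, low, high,
                        delta=7, alpha=3, theta=-1, rng=None):
    # Above-below rule: if the LON diameter is still below delta, then in every
    # alpha-th iteration replace members whose out-neighbors contain too few
    # worse children, i.e. |N_a| - |N_b| < theta, by random points.
    rng = rng or np.random.default_rng()
    if diameter >= delta or iteration % alpha != 0:
        return pop
    for i, key in enumerate(keys):
        f_i = lon.nodes[key]["fval"]
        out_vals = [lon.nodes[q]["fval"] for q in lon.successors(key)]
        n_above = sum(v > f_i for v in out_vals)     # worse children
        n_below = sum(v < f_i for v in out_vals)     # better children
        if n_above - n_below < theta:
            pop[i] = rng.uniform(low, high, size=pop.shape[1])
    return pop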

4.2 Numerical Experiment


Using the above introduced rule we have done extensive benchmarking in order
to see the performance indicators. Our aim was to find a combination of the three
parameters which leads to improved efficiency. Hence we did a parameter sweep:
δ ∈ [6, 9], α ∈ [3, 6], and θ ∈ [−2, 0]. The choice for the interval from which
the values of δ are taken is motivated by the fact that, according to Tables 1
and 2 the diameters of the LONs for a given MDE variant are much larger for
Rastrigin than for Schwefel, and it never goes beyond 10 for fS . On the other
hand, when population members for fR are already having function values close
to 0, then it is unwise to make MDE exploring the search space.
We report the results of the experiments for n = 20 only. According to our
findings, the combination δ = 7, α = 3, θ = −1 led to the best performance
improvements for the tested functions. The indicators are reported in Tables 3
and 4, where improved metrics are highlighted by underline.

Table 3. Performance and graph metrics for rotated and shifted Rastrigin-20 using
the new rule

Rule S Best Avg Adf LS SP N M D d


best/1/bin 0 0.99 5.56 5.56 ∞ ∞ 378.1 1424.3 11.52 3.76
best/1/exp 0 1.99 17.38 17.38 ∞ ∞ 226.4 843.5 9.52 3.71
best/2/bin 60 0 0.61 1.54 1449.3 2415 1349.8 7677.1 12.88 5.68
best/2/exp 18 0 2.39 2.92 993.3 5519 879.8 4989.7 12.16 5.66
rand/1/bin 60 0 0.55 1.39 1996.0 3327 1794.4 7244.1 14.96 4.03
rand/1/exp 14 0 2.21 2.57 1411.4 10082 1110.1 4521.1 14.1 4.07
rand/2/bin 100 0 0 0 6341.6 6342 6225.1 37318.3 14.64 5.99
rand/2/exp 98 0 0.03 1.78 3897.1 3977 3776.1 22641.9 14.04 5.99

We can see that our rule improved the percentage of success (S) for the single-funnel Rastrigin function by up to 16%, and resulted in lower average

Table 4. Performance and graph metrics for rotated and shifted Schwefel-20 using the
new rule

Rule S Best Avg Adf LS SP N M D d


best/1/bin 0 −8142.7 −7685.6 693.9 ∞ ∞ 278.1 1014.3 7.3 3.63
best/1/exp 0 −8024.3 −7464.8 914.8 ∞ ∞ 149.8 532.8 6.3 3.51
best/2/bin 0 −8261.2 −7993.3 386.2 ∞ ∞ 1991.5 11323.6 7.4 5.66
best/2/exp 0 −8261.2 −7834.0 545.6 ∞ ∞ 1731.6 9869.2 7.5 5.67
rand/1/bin 0 −8142.7 −7899.7 479.8 ∞ ∞ 2370.9 9408.0 11.2 3.97
rand/1/exp 0 −8261.2 −7639.2 740.4 ∞ ∞ 1065.6 4270.5 10.2 4.01
rand/2/bin 26 −8379.6 −8188.6 258.1 13524 52017 16304.1 96986.8 9.9 5.94
rand/2/exp 4 −8379.6 −8111.9 278.8 5850 146250 8384.2 50048.2 9.8 5.96

function values for six out of eight variants. We obtained a 27% improvement in success performance with best/2/bin.
For the multi-funnel Schwefel function the new rule does not help the variants which were unsuccessful in the original versions to find the global optimum. However, it made them find local optima with lower function values on average and hence decreased their ‘average difference failure’ (Adf) measure. The most efficient rand/2/bin variant got better in its SP measure by 32%.

5 Conclusions
To the best of our knowledge our paper is the first one reporting benchmark-
ing results on MDE variants. According to the numerical experiments, the
rand/2/bin strategy provides overall the best percentage-of-success metric, especially when it is applied to a multi-funnel problem. This is somewhat in line with
the results reported in [6] for DE. For a single-funnel function the best/2/bin
variant can be advantageous if one needs good success performance, i.e. lower
computational time.
We have shown that incorporating certain knowledge of the local optima network of MDE into the evolutionary procedure leads us to formalize restarting rules that enhance the diversification of the population. Our numerical tests indicate that the proposed restarting rule is beneficial on average for most of the MDE variants.
In this work we have developed a computational tool in Python using the Pyomo and NetworkX packages, which provides us with a general framework to discover further possibilities in the field of (evolutionary) global optimization and network science. We plan to extend our codebase with further MDE rules, in particular with those involving network centrality measures for selection [4].

Acknowledgment. This research has been partially supported by the project “Inte-
grated program for training new generation of scientists in the fields of computer sci-
ence”, no EFOP-3.6.3-VEKOP-16-2017-0002. The project has been supported by the
European Union and co-funded by the European Social Fund. Ministry of Human
Capacities, Hungary grant 20391-3/2018/FEKUSTRAT is acknowledged.

References
1. Cabassi, F., Locatelli, M.: Computational investigation of simple memetic
approaches for continuous global optimization. Comput. Oper. Res. 72, 50 – 70
(2016)
2. Hagberg, A., Swart, P., Schult, D.: Exploring network structure, dynamics, and
function using NetworkX. Technical report, Los Alamos National Lab. (LANL),
Los Alamos, NM (United States) (2008)
3. Hart, W.E., Laird, C.D., Watson, J.P., Woodruff, D.L., Hackebeil, G.A., Nicholson,
B.L., Siirola, J.D.: Pyomo-Optimization Modeling in Python, vol. 67. Springer,
Heidelberg (2012)
4. Homolya, V., Vinkó, T.: Memetic differential evolution using network centrality
measures. In: AIP Conference Proceedings 2070, 020023 (2019)
5. Locatelli, M., Maischberger, M., Schoen, F.: Differential evolution methods based
on local searches. Comput. Oper. Res. 43, 169–180 (2014)
6. Mezura-Montes, E., Velázquez-Reyes, J., Coello Coello, C.A.: A comparative study
of differential evolution variants for global optimization. In: Proceedings of the 8th
Annual Conference on Genetic and Evolutionary Computation, pp. 485–492. ACM
(2006)
7. Moscato, P.: On evolution, search, optimization, genetic algorithms and martial
arts: towards memetic algorithms. Caltech concurrent computation program. C3P
Rep. 826 (1989)
8. Murtagh, B.A., Saunders, M.A.: MINOS 5.5.1 user’s guide. Technical Report SOL
83-20R (2003)
9. Neri, F., Cotta, C.: Memetic algorithms and memetic computing optimization: a
literature review. Swarm Evol. Comput. 2, 1–14 (2012)
10. Piotrowski, A.P.: Adaptive memetic differential evolution with global and local
neighborhood-based mutation operators. Inf. Sci. 241, 164–194 (2013)
11. Skanderova, L., Fabian, T.: Differential evolution dynamics analysis by complex
networks. Soft Comput. 21(7), 1817–1831 (2017)
12. Storn, R., Price, K.: Differential evolution - a simple and efficient heuristic for
global optimization over continuous spaces. J. Global Optim. 11, 341–359 (1997)
13. Vinkó, T., Gelle, K.: Basin hopping networks of continuous global optimization
problems. Cent. Eur. J. Oper. Res. 25, 985–1006 (2017)
Maximization of a Convex Quadratic
Form on a Polytope: Factorization and
the Chebyshev Norm Bounds

Milan Hladı́k1(B) and David Hartman2,3


1 Faculty of Mathematics and Physics, Department of Applied Mathematics, Charles University, Malostranské nám. 25, 11800 Prague, Czech Republic
hladik@kam.mff.cuni.cz
https://kam.mff.cuni.cz/~hladik
2 Computer Science Institute, Charles University, Malostranské nám. 25, 11800 Prague, Czech Republic
hartman@cs.cas.cz
3 Institute of Computer Science of the Czech Academy of Sciences, Prague, Czech Republic

Abstract. Maximization of a convex quadratic form on a convex polyhedral set is an NP-hard problem. We focus on computing an upper
bound based on a factorization of the quadratic form matrix and employ-
ment of the maximum vector norm. Effectivity of this approach depends
on the factorization used. We discuss several choices as well as iterative
methods to improve performance of a particular factorization. We carried
out numerical experiments to compare various alternatives and to com-
pare our approach with other standard approaches, including McCormick
envelopes.

Keywords: Convex quadratic form · Relaxation · NP-hardness · Interval computation

1 Introduction
We consider one of the basic global optimization problems [6,9,15,16], maxi-
mization of a convex quadratic form on a convex polyhedral set

f ∗ = max xT Ax subject to x ∈ M. (1)

Herein, A ∈ Rn×n is symmetric positive semidefinite and M is a convex polyhe-


dral set described by a system of linear inequalities.
If M is bounded, the global optimum is attained at a vertex of M [9]. This
makes the problem computationally intractable. It is NP-hard even when M
is a hypercube [11,17] and for other special cases [4]. There are also identified
polynomially solvable sub-classes [1].
Supported by the Czech Science Foundation Grant P403-18-04735S.

There are various methods developed for solving (1). This includes cutting
plane methods [10], reformulation-linearization/convexification and branch &
bound methods [3,15], among others. Polynomial time approximation methods
also exist [18]. There are many works on quadratic programming [7] and concave function minimization [14] that give a more detailed state of the art.
In this paper, we focus on computation of a cheap upper bound on f ∗ . Tight
upper bounds are important, for instance, when quadratic functions in a non-
linear model are relaxed. Maybe more importantly, tight bounds are crucial for
effectivity of a branch & bound approach when solving nonlinear optimization
problems.
Notation. For a matrix A, we use A_{i,∗} to denote its ith row. Inequalities and absolute values are applied entry-wise to vectors and matrices. The vector of ones is denoted by e = (1, . . . , 1)^T and the identity matrix of size n by I_n. We use two vector norms, the Euclidean norm ‖x‖_2 = √(x^T x) and the maximum (Chebyshev) norm ‖x‖_∞ = max_i |x_i|. For a matrix M ∈ R^{n×n}, we use the induced maximum norm ‖M‖_∞ = max_i Σ_j |M_{ij}|.
Factorization. Matrix A can be factorized as A = G^T G. Then x^T A x = x^T G^T G x = ‖Gx‖_2^2 and we can formulate the problem as maximization of the squared Euclidean norm

    max ‖Gx‖_2^2 subject to x ∈ M.    (2)

Upper bound. Replacing the Euclidean norm by another norm, we obtain an approximation of the optimal value. Equivalence of vector norms gives us guaranteed bounds. In particular, we utilize the maximum norm

    f^* = max_{x∈M} ‖Gx‖_2^2 ≤ n · max_{x∈M} ‖Gx‖_∞^2 ≡ g^*(G).    (3)

The upper bound g^*(G) is effectively computable by means of linear programming (LP). Write

    g^*(G) = n · max_{x∈M} ‖Gx‖_∞^2 = n · max_i max_{x∈M} (G_{i,∗} x)^2.

The inner optimization problem max_{x∈M} G_{i,∗} x has the form of an LP problem, and we have to solve max_{x∈M} ±G_{i,∗} x for each i = 1, . . . , n. So in order to calculate g^*(G), it is sufficient to solve 2n LP problems in total.
Quality of the upper bound g ∗ (G) depends on the factorization A = GT G.
Our problem thus reads:
Find the factorization A = GT G such that the upper bound (3) is as tight
as possible.
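For a polytope given explicitly as M = {x : Bx ≤ b}, the bound g^*(G) can be computed with any LP solver; the following is a minimal sketch with scipy.optimize.linprog, assuming M is bounded and nonempty (an illustration, not the authors' code).

import numpy as np
from scipy.optimize import linprog

def g_star(G, B, b):
    # g*(G) = n * max_{x in M} ||Gx||_inf^2, obtained from 2n LPs over
    # M = {x : Bx <= b}; each LP maximizes +/- G[i,:] x.
    n = G.shape[0]
    best = 0.0
    for i in range(n):
        for sign in (1.0, -1.0):
            res = linprog(c=-sign * G[i], A_ub=B, b_ub=b, bounds=(None, None))
            # res.fun is minus the maximum of sign * G[i] x; squaring removes the sign
            best = max(best, res.fun**2)
    return n * best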

2 Methods
There are two natural choices for the factorization A = GT G:

– Cholesky decomposition A = G^T G, where G is upper triangular with nonnegative diagonal.
– Square root A = G^2, where G is symmetric positive semidefinite.
Denote by H the set of orthogonal matrices of size n. Let H ∈ H and denote
R := HG. Then RT R = (HG)T HG = GT G = A is another factorization of A.
This motivates us to seek for suitable H ∈ H such that g ∗ (HG) gives a tight
upper bound.
An important sub-class of orthogonal matrices are Householder matrices. Let
u ∈ R^n \ {0}; then the corresponding Householder matrix is defined as
    H(u) = I_n − (2 / (u^T u)) u u^T.
Each orthogonal matrix can be factored into a product of at most n Householder
matrices, so there is no loss of generality to restrict to Householder matrices only.
The upper bound (3) need not be tight because n times the squared maximum norm overestimates the squared Euclidean norm. The overestimation vanishes for vectors whose entries are all the same in absolute value, that is, ‖y‖_2^2 = n‖y‖_∞^2 for each y ∈ {±1}^n and its multiples.
This brings us to the following heuristic: find H ∈ H such that HG has row absolute sums as constant as possible. To this end, denote y := |G|e. Let H ∈ H be a Householder matrix transforming y to α · e, where α := (1/√n) ‖y‖_2. Thus we have Hy = α · e. The matrix H can be constructed simply as the Householder matrix H(u) with u := α · e − y.
In general, there is no guarantee that the resulting matrix HG has con-
stant row absolute sums and gives tighter bounds. We can, however, iterate this
procedure to obtain more promising candidates. Thus we suggest the following
iterative method.

Algorithm 1. (Factorization A = R^T R)
Input: Let A = G^T G be an initial factorization.
1: Put R := G.
2: Put y := |R|e.
3: Put α := (1/√n) ‖y‖_2.
4: Put H := H(α · e − y).
5: If ‖HR‖_∞ < ‖R‖_∞, put R := HR and go to step 2.
Output: factorization A = R^T R.
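A direct NumPy transcription of Algorithm 1 is sketched below; the small iteration cap is our own safeguard and is not part of the algorithm as stated.

import numpy as np

def householder(u):
    # H(u) = I_n - (2 / u^T u) * u u^T
    u = u.reshape(-1, 1)
    return np.eye(u.shape[0]) - 2.0 * (u @ u.T) / float(u.T @ u)

def improve_factorization(G, max_iter=100):
    # Algorithm 1: repeatedly try a Householder matrix pushing the row absolute
    # sums of R towards a constant vector, accepted while ||R||_inf decreases.
    R = G.copy()
    n = R.shape[0]
    for _ in range(max_iter):
        y = np.abs(R) @ np.ones(n)
        alpha = np.linalg.norm(y) / np.sqrt(n)
        u = alpha * np.ones(n) - y
        if np.allclose(u, 0.0):
            break                      # row absolute sums already constant
        H = householder(u)
        if np.abs(H @ R).sum(axis=1).max() < np.abs(R).sum(axis=1).max():
            R = H @ R
        else:
            break
    return R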

Alternative Approaches
In order to carry out a comparison, we consider three alternative methods:
Exact method by enumeration. The optimal value f^* is attained at a vertex of
the feasible set M. Thus, to compute f ∗ , we enumerate all vertices of M and
take the maximum. Due to high computational complexity of this method, we
use it in small dimensions only.

Trivial upper bound. Let x̲, x̄ ∈ R^n be lower and upper bounds on M, respectively. That is, for each x ∈ M we have x̲ ≤ x ≤ x̄. Then an upper bound is simply calculated by interval arithmetic [8,13]. Let x := [x̲, x̄] be the corresponding interval vector and evaluate f = [f̲, f̄] = x^T A x. Then f^* ≤ f̄.
In order that the upper bound is tight, we use the interval hull of M in our experiments. That is, x is the smallest interval vector enclosing M. This can be computed by solving 2n LP problems, each of them calculating the minimum or maximum in a particular coordinate. The computational effort can be further reduced by a suitable order of the LP problems to solve [2].
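Given the interval hull [lo, hi], one naive way to evaluate the interval expression x^T A x is sketched below (each term a_ij x_i x_j is bounded independently; sharper interval evaluations are possible).

import numpy as np

def trivial_upper_bound(A, lo, hi):
    # Upper endpoint of the naive interval evaluation of x^T A x over [lo, hi].
    f_up = 0.0
    for i in range(len(lo)):
        for j in range(len(lo)):
            f_up += max(A[i, j] * xi * xj
                        for xi in (lo[i], hi[i]) for xj in (lo[j], hi[j]))
    return f_up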
McCormick envelopes. We relax the quadratic term x^T A x using the standard McCormick envelopes [5,12]. As above, let x̲, x̄ ∈ R^n be lower and upper bounds on M, respectively, and write x_c := (x̲ + x̄)/2 for the center of the box. Split A into its positive and negative parts A = A^+ − A^−, A^+, A^− ≥ 0. Then

    x^T A^+ x ≤ x̲^T A^+ x + x^T A^+ x̄ − x̲^T A^+ x̄
              = (x̲ + x̄)^T A^+ x − x̲^T A^+ x̄ = 2 x_c^T A^+ x − x̲^T A^+ x̄,

and

    x^T A^− x ≥ x̲^T A^− x + x^T A^− x̲ − x̲^T A^− x̲ = 2 x̲^T A^− x − x̲^T A^− x̲,
    x^T A^− x ≥ x̄^T A^− x + x^T A^− x̄ − x̄^T A^− x̄ = 2 x̄^T A^− x − x̄^T A^− x̄.

Now, an upper bound on f^* can be computed by the LP problem

    max z subject to  z ≤ 2 x_c^T A^+ x − x̲^T A^+ x̄ − 2 x̲^T A^− x + x̲^T A^− x̲,
                      z ≤ 2 x_c^T A^+ x − x̲^T A^+ x̄ − 2 x̄^T A^− x + x̄^T A^− x̄,
                      x ∈ M,

or, in the standard form,

    max z subject to  2(x̲^T A^− − x_c^T A^+) x + z ≤ −x̲^T A^+ x̄ + x̲^T A^− x̲,
                      2(x̄^T A^− − x_c^T A^+) x + z ≤ −x̲^T A^+ x̄ + x̄^T A^− x̄,
                      x ∈ M.
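Put together, the McCormick bound amounts to one LP in the variables (x, z); the following sketch (scipy.optimize.linprog, M = {x : Bx ≤ b}, box [lo, hi]) follows the relaxation as reconstructed above and is only an illustration.

import numpy as np
from scipy.optimize import linprog

def mccormick_upper_bound(A, B, b, lo, hi):
    # LP relaxation of max x^T A x over M = {x : Bx <= b} via McCormick
    # envelopes on the box [lo, hi]; decision variables are (x, z).
    n = A.shape[0]
    Ap, Am = np.maximum(A, 0.0), np.maximum(-A, 0.0)   # A = Ap - Am
    xc = 0.5 * (lo + hi)
    z_rows, z_rhs = [], []
    for xb in (lo, hi):                                # the two z-constraints
        z_rows.append(np.concatenate([2.0 * (xb @ Am - xc @ Ap), [1.0]]))
        z_rhs.append(-lo @ Ap @ hi + xb @ Am @ xb)
    A_ub = np.vstack([np.hstack([B, np.zeros((B.shape[0], 1))]), np.array(z_rows)])
    b_ub = np.concatenate([b, z_rhs])
    c = np.zeros(n + 1)
    c[-1] = -1.0                                       # maximize z
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return -res.fun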

3 Comparison and Numerical Experiments


We carried out a series of numerical experiments to compare the methods presented. For a given dimension n, we randomly constructed a matrix A ∈ R^{n×n} as A := G^T G, where the entries of G ∈ R^{n×n} were generated uniformly at random in [−1, 1]. The feasible set M is described by n^2 inequalities. An inequality a^T x ≤ b is generated such that the entries a_i are chosen uniformly at random in [−1, 1] and b is chosen uniformly at random in [0, e^T |a|]. For larger dimensions (n ≥ 70), we set to zero a randomly selected 80% of the entries of the constraint matrix and run the computations in sparse mode.
For small dimensions, effectivity of a method is evaluated relatively to the
exact method. That is, we record the ratio bm /f ∗ , where bm is the upper bound
by the given method and f ∗ is the optimal value. For higher dimensions, the
exact method is too time consuming, so effectivity of a method is evaluated
relatively to the trivial method. That is, we record the ratio bm /btriv , where bm
is the upper bound by the given method and btriv is the upper bound by the
trivial method.
The computations were carried out in MATLAB R2017b on an eight-processor
machine AMD Ryzen 7 1800X, with 32187 MB RAM. The symbols used in the
tables have the following meaning:

– runs: the number of runs, for which the mean values in each row are com-
puted;
– triv: the trivial upper bound using the interval hull of M;
– McCormick: the upper bound using McCormick relaxation and the interval
hull of M;
– sqrtm: our upper bound using G as the square root of A;
– sqrtm+it: our upper bound using G as the square root of A and iterative
modification of G by means of Algorithm 1;
– chol: our upper bound using G from the Cholesky decomposition of A;
– chol+it: our upper bound using G from the Cholesky decomposition of A
and iterative modification of G by means of Algorithm 1;
– chol+rand: our upper bound using G from the Cholesky decomposition of A
and iterative improvement of G by trying 10 random Householder matrices.

Small dimension. Table 1 compares the effectivities for small dimensions, and
Table 2 displays the corresponding running times. By definition, effectivity of
the exact method is 1. From the results we see that for a very small dimension,
the best strategy is to compute the exact optimal value – it is the tightest and
fastest method. As the dimension increases, computation of the exact optimal
value becomes more time consuming. The running times of the upper bound
methods are more-or-less the same. Of course, chol+rand is about ten times
slower since it runs ten instances.
Our approach is more effective with respect to tightness provided a suitable
factorization is used. The square root of A behaves better than the Cholesky
decomposition on average. Algorithm 1 can improve the performance of the
Cholesky approach, but not that of the square root one. The random generation
of Householder matrices has the best performance, indicating that there is a high
potential of using a suitable factorization. On average, the random Householder
matrix generation performs similarly when applied on sqrtm or on chol, so we
numerically tested only the latter.

As the dimension increases, all the bounds (the trivial ones, the McCormick
ones and our bounds) tend to improve. This is rather surprising, and we have
no complete explanation for this behaviour. It seems to be affected by the geometry of the convex polyhedron in connection with the way the bounds are constructed.

Table 1. Efficiency of the methods – small dimensions. The best efficiencies highlighted
in boldface.

n runs triv McCormick sqrtm sqrtm+it chol chol+it chol+rand


3 100 65.55 51.17 65.22 67.52 78.33 75.12 48.96
5 100 24.01 19.31 25.20 23.16 33.54 27.43 18.98
7 100 26.47 21.90 20.63 21.36 28.15 23.26 16.59
9 20 19.57 16.48 14.90 14.83 19.81 13.65 11.27
10 20 22.26 18.75 13.25 13.54 19.75 14.08 11.92

Table 2. Computational times of the methods (in 10−3 s) – small dimensions.

n runs exact triv McCormick sqrtm sqrtm+it chol chol+it chol+rand


3 100 0.8256 38.83 44.87 36.95 36.94 36.78 36.85 36.96
5 100 101.5 64.10 69.79 61.10 61.60 61.19 61.39 616.1
7 100 7160 91.87 97.62 89.01 88.86 88.48 88.01 887.7
9 20 141900 119.1 123.8 114.8 115.2 115.0 114.6 1145
10 20 240000 132.3 137.7 126.4 126.9 125.2 125.9 1257

Higher dimension. Tables 3 and 4 show the results for higher dimensions. By
definition, effectivity of the trivial method is 1. For smaller n, random House-
holder matrix generation performs best, but for larger n the number of random
matrices is not sufficient and the winner is the square root of A. Sometimes,
its tightness is improved by additional iterations, but not always. Again, the
computation times are very similar to each other. This is not surprising since all
the methods basically need to solve 2n LP problems.
For n ≥ 70, we run the computations in sparse mode. We can see from the
tables that the calculations took less time due to the sparse mode. With respect
to the efficiencies, the methods perform similarly as in the previous dense case.
Again, as the dimension increases, our bounds tend to improve. Since we
related the displayed efficiency of the bounds to the trivial ones, this behaviour
might be caused by the worse quality of the trivial bounds in higher dimensions.

4 Conclusion
We proposed a simple and cheap method to compute an upper bound for a prob-
lem of maximization of a convex quadratic form on a convex polyhedron. The
method is based on a factorization of the quadratic form matrix and application
of Chebyshev vector norm.
The numerical experiments indicate that (at least for the randomly generated
instances) with basically the same running time, the method gives tighter bounds
than the trivial method or than the McCormick relaxation approach. For small
dimensions, the performance of all the considered approximation methods was
low even in comparison with exact optimum computation. However, in medium
or larger dimensions, the effectivity of our approach becomes very significant.
Therefore, it may serve as a promising approximation method for solving large-

Table 3. Efficiency of the methods – higher dimensions. The best efficiencies high-
lighted in boldface. The bottom part run in sparse mode.

n runs triv McCormick sqrtm sqrtm+it chol chol+it chol+rand


20 100 1 0.8737 0.4614 0.4625 0.6682 0.5013 0.4260
30 100 1 0.8879 0.3730 0.3731 0.5587 0.4046 0.3582
40 100 1 0.9019 0.3170 0.3170 0.4707 0.3471 0.3216
50 100 1 0.9102 0.2725 0.2719 0.4273 0.3113 0.2940
60 100 1 0.9196 0.2396 0.2401 0.3806 0.2781 0.2692
70 20 1 0.9101 0.2709 0.2709 0.4344 0.3133 0.3062
80 20 1 0.9127 0.2445 0.2445 0.3905 0.2923 0.2900
90 20 1 0.9201 0.2237 0.2237 0.3604 0.2845 0.2779
100 20 1 0.9229 0.1993 0.1993 0.3496 0.2706 0.2677

Table 4. Computational times of the methods (in seconds) – higher dimensions. The
bottom part run in sparse mode.

n runs triv McCormick sqrtm sqrtm+it chol chol+it chol+rand


20 100 0.4686 0.4799 0.4587 0.4575 0.4601 0.4573 4.583
30 100 2.115 2.150 2.075 2.073 2.087 2.087 20.80
40 100 7.889 7.983 7.735 7.725 7.812 7.780 77.74
50 100 25.16 25.44 24.71 24.72 24.93 24.85 248.4
60 100 64.89 63.97 63.97 64.19 64.92 64.43 641.1
70 20 12.36 12.57 12.99 12.94 12.89 13.25 131.2
80 20 24.09 24.23 24.61 24.64 25.34 25.19 251.5
90 20 43.97 44.10 45.71 45.45 46.25 46.62 465.9
100 20 78.92 79.77 84.74 84.22 85.08 86.19 855.7

Indeed, the larger the dimension, the tighter the bounds we obtained relative to the trivial or McCormick ones.
In the future, it would be also interesting to compare our approach to other
approximation methods, including the state-of-the-art technique of semidefinite
programming.
The question of finding a suitable factorization remains open. In our experiments, the square root approach behaves best. Algorithm 1
can sometimes slightly improve tightness of the resulting bounds with almost
no additional effort. Nevertheless, as the numerical experiments with random
Householder matrices suggest, there is a high potential of achieving even better
results. The problem of finding the best factorization is challenging – so far,
there are no complexity theoretical results or any kind of characterization.

References
1. Allemand, K., Fukuda, K., Liebling, T.M., Steiner, E.: A polynomial case of uncon-
strained zero-one quadratic optimization. Math. Program. 91(1), 49–52 (2001)
2. Baharev, A., Achterberg, T., Rév, E.: Computation of an extractive distillation
column with affine arithmetic. AIChE J. 55(7), 1695–1704 (2009)
3. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming. Theory and
Algorithms. 3rd edn. Wiley, Hoboken (2006)
4. Černý, M., Hladı́k, M.: The complexity of computation and approximation of the
t-ratio over one-dimensional interval data. Comput. Stat. Data Anal. 80, 26–43
(2014)
5. Floudas, C.A.: Deterministic Global Optimization. Theory, Methods and Applica-
tions, Nonconvex Optimization and its Applications, vol. 37. Kluwer, Dordrecht
(2000)
6. Floudas, C.A., Visweswaran, V.: Quadratic optimization. In: Horst, R., Parda-
los, P.M. (eds.) Handbook of Global Optimization, pp. 217–269. Springer, Boston
(1995)
7. Gould, N.I.M., Toint, P.L.: A quadratic programming bibliography. RAL Internal
Report 2000-1, Science and Technology Facilities Council, Scientific Computing
Department, Numerical Analysis Group, 28 March, 2012. ftp://ftp.numerical.rl.
ac.uk/pub/qpbook/qp.pdf
8. Hansen, E.R., Walster, G.W.: Global Optimization Using Interval Analysis, 2nd
edn. Marcel Dekker, New York (2004)
9. Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches. Springer, Hei-
delberg (1990)
10. Konno, H.: Maximizing a convex quadratic function over a hypercube. J. Oper.
Res. Soc. Jpn 23(2), 171–188 (1980)
11. Kreinovich, V., Lakeyev, A., Rohn, J., Kahl, P.: Computational Complexity and
Feasibility of Data Processing and Interval Computations. Kluwer, Dordrecht
(1998)
12. McCormick, G.P.: Computability of global solutions to factorable nonconvex pro-
grams: Part I - Convex underestimating problems. Math. Program. 10(1), 147–175
(1976)
13. Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. SIAM,
Philadelphia (2009)

14. Pardalos, P., Rosen, J.: Methods for global concave minimization: a bibliographic
survey. SIAM Rev. 28(3), 367–79 (1986)
15. Sherali, H.D., Adams, W.P.: A Reformulation-Linearization Technique for Solving
Discrete and Continuous Nonconvex Problems. Kluwer, Boston (1999)
16. Tuy, H.: Convex Analysis and Global Optimization. Springer Optimization and Its
Applications, vol. 110, 2nd edn. Springer, Cham (2016)
17. Vavasis, S.A.: Nonlinear Optimization: Complexity Issues. Oxford University Press,
New York (1991)
18. Vavasis, S.A.: Polynomial time weak approximation algorithms for quadratic pro-
gramming. In: Pardalos, P.M. (ed.) Complexity in Numerical Optimization, pp.
490–500. World Scientific Publishing, Singapore (1993)
New Dynamic Programming Approach
to Global Optimization

Anna Kaźmierczak(B) and Andrzej Nowakowski

Faculty of Mathematics and Computer Science, University of Lodz,


Banacha 22, 90-238 Lodz, Poland
{anna.kazmierczak,andrzej.nowakowski}@wmii.uni.lodz.pl

Abstract. The paper deals with the problem of finding the global min-
imum of a function in a subset of Rn described by values of solutions to
a system of semilinear parabolic equations. We propose a construction
of a new dual dynamic programming to formulate a new optimization
problem. As a consequence we state and prove a verification theorem for
the global minimum and investigate a dual optimal feedback control for
the global optimization.

Keywords: Global optimization · Dynamic programming · Feedback control

1 Introduction
In classical optimization problem, our aim is to minimize a real valued objective
function, defined on a subset of an Euclidean space, which is determined by a
family of constraint functions. Depending on the type of those functions: linear,
convex, nonconvex and nonsmooth, different tools from analysis and numerical
analysis can be applied in order to find the minimum (or approximate minimum)
of the objective function (see e.g. [1]). However, some sets that are interesting from a practical point of view are very difficult to describe by constraints. Sometimes such
problematic sets can be characterized as controllability sets of dynamics e.g. dif-
ferential equations depending on controls. The aim of this paper is to present one such dynamics, a system of parabolic differential equations, and to construct a new dynamic programming to derive a verification theorem for the optimization
problem. As a consequence, we can define a dual feedback control and an optimal
dual feedback and state a theorem regarding sufficient optimality conditions in
terms of the feedback control.


2 The Optimization Problem


Let P ⊂ R^n and let R be a function defined on P, i.e. R : P → R. Consider the
following optimization problem R:

minimize R(x) on P.

Notice that nothing is assumed on R and the set P can be very irregular. It is
not easy to study such a problem and, in fact, the theory of optimization does not offer suitable tools for this task. We develop a new method
to handle the problem R. To this effect, we transform R to the language of
optimal control theory. Let us introduce Ω–an open, bounded domain in Rn of
the variables z, a compact set U ⊂ Rm (m ≥ 1) and an interval [0, T ]. Define a
family
    U = { u(t, z), (t, z) ∈ [0, T] × Ω : u ∈ L^1([0, T] × Ω), u(t, z) ∈ U }
of controls, and a function f : [0, T] × Ω × R^n × U → R^n, sufficiently regular (at least Lipschitz continuous), constituting the nonlinearity of the system of parabolic differential equations

    x_t(t, z) − Δx(t, z) = f(t, z, x(t, z), u(t, z)),   (t, z) ∈ [0, T] × Ω,    (1)
    x(0, z) = x_0(z),   z ∈ Ω.

The regularity of f should ensure existence of solutions to (1), belonging to the


Sobolev space (W 1,2 ([0, T ] × Ω))n ∩ (C([0, T ] × Ω))n for x0 (·) ∈ (C0 (Ω))n and
each control u ∈ U. Then, using (1) and with the proper choice of U , we can
characterize P as:

P={ x(T, z)dz∈Rn : x is a solution to (1) for u ∈ U}.
Ω

Hence, R is now transformed to a well known problem of the optimal control


theory, which reads: 
    minimize   R( ∫_Ω x(T, z) dz ),
subject to

    x_t(t, z) − Δx(t, z) = f(t, z, x(t, z), u(t, z)),   (t, z) ∈ [0, T] × Ω,    (2)

x(0, z) = x0 (z), z ∈ Ω, u ∈ U. (3)


We denote that problem by Rc and stress that Rc still regards minimizing the
function R on the set P. Thus finding a solution to Rc is equivalent to finding
a solution to the original problem R. However, to solve R we will deal with Rc
and develop suitable new tools to study Rc. In this way we want to circumvent the lack of regularity of the problem R.
The set of all pairs (x, u) satisfying (2), (3) we denote by Ad. Note that
the problem Rc is, in fact, a classical optimal control problem with distributed
130 A. Kaźmierczak and A. Nowakowski

parameters u. Thus we can apply tools from optimal control theory. Of course,
one may wonder whether this machinery is too complicated to optimize R on
P. All depends on the type of the set P is, as well as on how smooth is the
function R. If R and P are regular enough, we have many instruments in theory
of optimization to solve the problem R also numerically, but when there is no
sufficient regularity, then these methods are very complicated or, in case of very
bad data, cannot be adopted. In order to derive verification conditions, in fact,
sufficient optimality conditions for Rc , we develop quite a new dual method
basing on ideas from [3]. Using that dual method we also construct a new dual
optimal feedback control for Rc . Essential point of the proposed approach is that
we do not need any regularity of R on P as we move all considerations related
to Rc to extended space.
Remark 1. Notice that if we omit the integral in the definition of P, then P
will become a subset of an infinite dimensional space, but the method developed
in subsequent sections can be applied also to that case (i.e. to the problem of
finding a minimum in a subset of an infinite dimensional space).

3 Dual Approach to Rc

The dual approach to optimal control problems was first introduced in [3] and
then developed in several papers to different problems of that kind, governed
by elliptic, parabolic and wave equations (see e.g. [2,4]). In that method we do
not deal directly with a value function but with some auxiliary function, defined
in an extended set, satisfying a dual dynamic equation, which allows to derive
verification conditions for the primal value function. One of the benefits of that
technique is that we do not need any properties of the value function, such as
smoothness or convexity. In this paper we want to construct a new dual method
to treat the problem Rc . We start with the definition of a dual set: P ⊂ Rn –an
open set of the variables p. The set P is chosen by us! Let P ⊂ R2n+1 be an
open set of the variables (t, z, p), (t, z) ∈ [0, T ] × Ω, p ∈ P, i.e.

P = {(t, z, p) ∈ R2n+1 : (t, z) ∈ [0, T ] × Ω, p ∈ P}. (4)

Why do we extend the primal space of (t, z) variables? In classical approach


to necessary optimality conditions of Pontryagin maximum principle, both in
one variable and with distributed parameters, we work with the space of vari-
ables (t, z) and with the so-called conjugate variable (y_0, p), the multiplier. In the dual dynamic programming (y_0, p) is nothing more than the multiplier: p is associated with the constraints and y_0 with the functional. However, the novelty of our method is that we move all our study to that extended space (t, z, p), but we do not use p as a multiplier and we drop the multiplier y_0.
Denote by W 1:2 (P ) the specific Sobolev space of real valued functions of
the variables (t, z, p), having the first order derivative with respect to t, and
the second order weak or generalized derivative (in the sense of distributions)
with respect to z. Our notation for the function space is used for the function

depending on the primal variable (t, z), and the dual variable p. The primal and
the dual variables are independent and the functions in the space W 1:2 (P ) enjoy
different properties with respect to (t, z) and p. The strategy of dual dynamic
programming consists in building all notions in the dual space–this concerns also
a dynamic programming equation. Thus the question is: how to construct that
equation in our case? The answer is not easy and not unique: on the left hand side
of (2) there is a linear differential operator, which concerns a state x. Certainly,
the auxiliary function V has to be real valued, as it must relate somehow to
a value function. This implies that the system of dynamic equations has to be
composed of one equation only, despite that (2) is a system of n equations.
The main problem is to choose a proper differential operator for the auxiliary
function V and a correct Hamiltonian, as these choices depend on themselves.
We have decided that in our original approach it is better to apply for V the
parabolic operator ∂/∂t − Δ only. We state the dynamic equation in a strong
form (see 5). We should stress that this equation is considered in the set P , i.e.
in the set of the variables (t, z, p).
Therefore, we require that a function V (t, z, p), V ∈ W 1:2 (P ), satisfies, in
P, for some y 0 ∈ L2 ([0, T ] × Ω), continuous in t, a parabolic partial differential
equation of dual dynamic programming of the form:

    ∂_t V(t, z, p) − Δ_z V(t, z, p) − inf{ p f(t, z, V(t, z, p), u) : u ∈ U }
       = ∂_t V(t, z, p) − Δ_z V(t, z, p) − p f(t, z, V(t, z, p), u(t, z, p))    (5)
       = y^0(t, z),   (t, z, p) ∈ P,
as well as the initial condition
    ∫_Ω y^0(T, z) dz ≤ R( ∫_Ω p V(T, z, p) dz ),   p ∈ P,    (6)

where u(t, z, p) is a function in P , for which the infimum in (5) is attained.


Since the function f is continuous and U is a compact set, u(t, z, p) exists
and is continuous. Denote by p(t, z), (t, z) ∈ [0, T ] × Ω, p ∈ L2 ([0, T ] × Ω),
a new trajectory having the property that for some u ∈ U and some y(·) ∈
L2 ([0, T ] × Ω), y ≤ y 0 , p is a solution to the following equation:

    ∂_t V(t, z, p(t, z)) − Δ_z V(t, z, p(t, z)) − p(t, z) f(t, z, V(t, z, p(t, z)), u(t, z)) = y(t, z),    (7)

while V (t, z, p), is a solution to (5). We will call p(·) a dual trajectory, while x(·)
stands for a primal trajectory. Moreover, we say that a dual trajectory p(·) is
dual to x(·), if both are generated by the same control u(t, z). Further, we confine
ourselves only to those admissible trajectories x(·), which satisfy the equation:
x(t, z) = p(t, z)V (t, z, p(t, z)) (for (t, z) ∈ [0, T ] × Ω). Thus denote
AdV = {(x, u) ∈ Ad : there exist p ∈L2 ([0, T ] × Ω), dual to x(t, z)
and such that x(t, z) = p(t, z)V (t, z, p(t, z)), for (t, z) ∈ [0, T ] × Ω}.
Actually, it means that we are going to study the problem Rc possibly in
some smaller set AdV , which is determined by V . All the above was simply the

precise description of the family AdV . This means we must reformulate Rc to:

    R^V = inf_{(x,u) ∈ AdV} R( ∫_Ω x(T, z) dz ).    (8)
We name R^V the dual optimal value, in contrast to the optimal value
    R^o = inf_{(x,u) ∈ Ad} R( ∫_Ω x(T, z) dz ),
as R^V depends strongly upon the dual trajectories p(t, z) which, in fact, determine the set AdV. Moreover, an essential point is that the set AdV is, in general, smaller than Ad, i.e. AdV ⊂ Ad, so the dual optimal value R^V may be greater than the optimal value R^o, i.e. R^V ≥ R^o. In order to find the set AdV, first we must find the function V, i.e. solve equation (5), and then define the set of admissible dual trajectories. This is not an easy task, but it permits us to assert that a suspected trajectory is really optimal with respect to all trajectories lying in AdV. This
fact is presented in the literature for the first time.
Remark 2. We should not bother about the problem R, if AdV is strictly smaller
than Ad, since the given P can be characterized with the help of the set AdV . In
practice, we extend Ad in order that (a possibly smaller set) AdV corresponds
precisely to P.

4 Sufficient Optimality Conditions for the Problem (8)


Below we formulate and prove the verification theorem, which gives sufficient
conditions for the existence of the optimal value RV , as well as for the optimal
pair (relative to the set AdV ).
Theorem 1. Assume that there exists a W 1:2 (P )–solution V of (5), (6) on
P , i.e. there exists y 0 ∈ L2 ([0, T ] × Ω) such that V fulfills (5) and (6). Let
p̄ ∈ L2 ([0, T ] × Ω), with the corresponding ū(t, z), satisfy (7) and let
 
    ∫_Ω y^0(T, z) dz = R( ∫_Ω p̄(T, z) V(T, z, p̄(T, z)) dz ).    (9)

Moreover, assume that x̄(t, z) = p̄(t, z)V (t, z, p̄(t, z)), (t,z) ∈ [0, T ]×Ω, together
with ū, belong to AdV .
Then (x̄(·), ū(·)) is the optimal pair relative to all (x(·), u(·)) ∈ AdV .
Proof. Let us take any (x(·), u(·)) ∈ AdV and p(·) generated by u(·), i.e. such
that (u(t, z),p(t, z)), (t, z) ∈ [0, T ] × Ω, satisfy (7) for some y ∈ L2 ([0, T ] × Ω),
y ≤ y 0 . Hence, from definition of AdV , the control u(·) generates x(t,z) =
p(t,z)V (t, z, p(t,z)), (t, z) ∈ [0, T ] × Ω. Then, on the basis of (9) and (6), we can
write
  
    R( ∫_Ω x̄(T, z) dz ) = R( ∫_Ω p̄(T, z) V(T, z, p̄(T, z)) dz ) = ∫_Ω y^0(T, z) dz
       ≤ R( ∫_Ω p(T, z) V(T, z, p(T, z)) dz ) = R( ∫_Ω x(T, z) dz ),
which gives the assertion.

5 Feedback Control for the Problem Rc


In this section we present suitable notions to define an absolutely new optimal
dual feedback control for the problem Rc . After appropriate definitions we state
and prove sufficient conditions for optimality in terms of feedback control, which
follow from the verification theorem. Let us see that a suggestion for the feedback
appears by the definition of dual dynamic programming in (5).
A function u(t, z, p) in P is called a dual feedback control if there exists a
solution x(t, z, p) in P, x ∈ W 1:2 (P ), of the equation

xt (t, z, p) − Δx(t, z, p) = f (t, z, x(t, z, p), u(t, z, p)) , (t, z, p) ∈ P. (10)

A dual feedback control ū(t, z, p), (t, z, p) ∈ P , is named optimal if there exist:
(i) a function x̄(t, z, p), (t, z, p) ∈ P , x̄ ∈ W 1:2 (P ), satisfying (10) with ū(t, z, p),
(ii) V ∈ W 1:2 (P ), given by the relation x̄(t, z, p) = pV (t, z, p), satisfying (6) for
some y 0 ∈ L2 ([0, T ] × Ω) and defining

Adx̄ = {(x, u) ∈ Ad : x(t, z) = x̄(t, z, p(t, z)) for some p ∈ L2 ([0, T ] × Ω),
satisfying (7) with u(t, z) = ū(t, z, p(t, z))
and some y ∈ L2 ([0, T ] × Ω), y ≤ y 0 },

(iii) a dual trajectory p̄(·) ∈ L2 ([0, T ] × Ω), such that the pair

x̄(t, z) = x̄(t, z, p̄(t, z)), ū(t, z) = ū(t, z, p̄(t, z)), (t, z) ∈ [0, T ] × Ω,

is optimal relative to the set Adx̄ and p̄ satisfies (7) together with ū.
Next theorem asserts the existence of an optimal dual feedback control, again
in terms of the function V (t, z, p).
Theorem 2. Let ū(t, z, p) be a dual feedback control in P and x̄(t, z, p),
(t, z, p) ∈ P , be defined according to (10). Suppose that for some y 0 ∈ L2 ([0, T ]×
Ω), there exists a function V ∈ W 1:2 (P ), satisfying (6), and that

pV (t, z, p) = x̄(t, z, p), (t, z, p) ∈ P . (11)

Let p̄(·) ∈ L2 ([0, T ] × Ω), (t, z, p̄(t, z)) ∈ P , be such a function, that a pair
x̄(t, z) = x̄(t, z, p̄(t, z)), ū(t, z) = ū(t, z, p̄(t, z)) belongs to Adx̄ and p̄ satisfies
(7) with ū and V . Moreover, assume that

    R( ∫_Ω p̄(T, z) V(T, z, p̄(T, z)) dz ) = ∫_Ω y^0(T, z) dz.    (12)

Then ū(t, z, p), (t, z, p) ∈ P , is an optimal dual feedback control.

Proof. Take any function p(t, z), p ∈ L2 ([0, T ] × Ω), dual to x(t, z) =
x̄(t, z, p(t, z)) and such that for u(t, z) = ū(t, z, p(t, z)), (x, u) ∈ Adx̄ . By (11),

it follows that x(t, z) = p(t, z)V (t, z, p(t, z)) for (t, z) ∈ [0, T ] × Ω. Analogously
as in the proof of Theorem 1, Eqs. (6) and (12) give
 
    R( ∫_Ω p̄(T, z) V(T, z, p̄(T, z)) dz ) ≤ R( ∫_Ω x(T, z) dz ).    (13)

As a conclusion from (13), we get


 
    R( ∫_Ω x̄(T, z) dz ) = R( ∫_Ω p̄(T, z) V(T, z, p̄(T, z)) dz ) ≤ inf_{(x,u) ∈ Adx̄} R( ∫_Ω x(T, z) dz ),

which is sufficient to show that ū(t, z, p) is an optimal dual feedback control, by


the above definition.

References
1. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press
(2004)
2. Galewska, E., Nowakowski, A.: A dual dynamic programming for multidimensional
elliptic optimal control problems. Numer. Funct. Anal. Optim. 27, 279–289 (2006)
3. Nowakowski, A.: The dual dynamic programming. Proc. Am. Math. Soc. 116, 1089–
1096 (1992)
4. Nowakowski, A., Sokolowski, J.: On dual dynamic programming in shape control.
Commun. Pure Appl. Anal. 11, 2473–2485 (2012)
On Chebyshev Center of the Intersection
of Two Ellipsoids

Xiaoli Cen, Yong Xia(B) , Runxuan Gao, and Tianzhi Yang

LMIB of the Ministry of Education; School of Mathematics and System Sciences,


Beihang University, Beijing 100191, People’s Republic of China
yxia@buaa.edu.cn

Abstract. We study the problem of finding the smallest ball covering the intersection of two ellipsoids, which is also known as the Chebyshev
center problem (CC). Semidefinite programming (SDP) relaxation is an
efficient approach to approximate (CC). In this paper, we first estab-
lish the worst-case approximation bound of (SDP). Then we show that
(CC) can be globally solved in polynomial time. As a by-product, one
can randomly generate Celis-Dennis-Tapia subproblems having positive
Lagrangian duality gap with high probability.

Keywords: Chebyshev center · Semidefinite programming · Approximation bound · Polynomial solvability · CDT subproblem

1 Introduction

We study the problem of finding Chebyshev center of the intersection of two


ellipsoids:

    (CC)   min_z max_{x ∈ Ω} ‖x − z‖^2,    (1)

where ‖·‖ = √((·)^T (·)) is the Euclidean norm of (·),

    Ω := { x ∈ R^n : ‖F_i x + g_i‖^2 ≤ 1, i = 1, 2 },

and F_i ∈ R^{m_i × n}, g_i ∈ R^{m_i} for i = 1, 2. We assume one of the two ellipsoids is non-degenerate so that Ω is bounded. To this end, we let F_1 be of full column rank. We also assume that Ω has at least one interior point. Without loss of generality, we assume the origin 0 is an interior point of Ω, that is, ‖g_i‖ < 1, i = 1, 2. Under these assumptions, (CC) has an optimal solution (z^*, x^*). Then z^* is the Chebyshev center of Ω and the ball centered at z^* with radius ‖x^* − z^*‖ is the smallest ball covering Ω.
(CC) has a direct application in the bounded error estimation. Consider the
linear regression model Ax ≈ b where A is ill-conditioned. In order to stabilize
the estimation, a regularization constraint Lx2 ≤ η is introduced to restrict
c Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 135–144, 2020.
https://doi.org/10.1007/978-3-030-21803-4_14
136 X. Cen et al.

x. Therefore, the admissible solutions to the linear system is given by the inter-
section of two ellipsoids [13]:

F = {x ∈ Rn : Lx2 ≤ η, Ax − b2 ≤ ρ}.

As a robust approximation of the true solution, Beck and Eldar [4] suggested
the Chebyshev center of F, which leads to the minimax optimization (CC).
(CC) is difficult to solve. Relaxing the inner nonconvex quadratic optimiza-
tion problem to its Lagrange dual (which can be reformulated as a semidefinite
programming (SDP) minimization), Beck and Eldar [4] proposed the SDP relax-
ation approach for (CC). Their numerical experiments demonstrated that this
approximation is “pretty good” in practice. Interestingly, when (CC) is defined
on the complex domain rather than the real space, there is no gap between (CC)
and this SDP relaxation since strong duality holds for the inner quadratic max-
imization with two quadratic constraints over complex domain [3]. The other
zero-duality case is reported in [2] when both ellipsoids are Euclidean balls and
n ≥ 2. The SDP relaxation approach was later extended by Eldar et al. [10]
to find the Chebyshev center of the intersection of multiple ellipsoids, where an
alternative derivation of the SDP relaxation was presented.
To the best of our knowledge, there is no particular global optimization
method for solving (CC). Moreover, the following two questions is unknown:

– The SDP relaxation has been shown “pretty good” only in numerical exper-
iments [4]. Is there any theoretical guarantee?
– Can (CC) be globally solved in polynomial time?

In this paper, we will positively answer the above two questions. In particular,
we establish in Sect. 2 the worst-case approximation bound of the SDP relaxation
of (CC). In Sect. 3, we propose a global optimization method to solve (CC) and
show that it can be done in polynomial time. As a by-product, in Sect. 4, we
show that based on (CC) one can randomly generate Celis-Dennis-Tapia (CDT)
subproblems having positive Lagrangian duality gap with high probability.
Notations. Let σmax (·) and σmin (·) be the largest and smallest singular
values of the matrix (·), respectively. Denote by In the n × n identity matrix.
v(·) denotes the optimal valueof the problem (·). For two n × n symmetric
n n
matrices A and B, Tr(AB) = i=1 j=1 aij bij returns the inner product of A
and B. A  ()B means that the matrix A − B is positive (semi)definite. Let
0n and O be the n-dimensional zero vector and n × n zero matrix, respectively.
For a real number x, x
denotes the smallest integer larger than or equal to x.

2 SDP Relaxation and Its Approximation Bound

In this section, we provide a theoretical approximation bound of the SDP relax-


ation, which was first introduced in [4] and then re-established in [10] based on
an alternative approach.
On Chebyshev Center of the Intersection of Two Ellipsoids 137

We first introduce the SDP relaxation in a new simple way. Consider the
inner nonconvex maximization of (CC)
(QP(z)) max {xT x − 2z T x + z T z}. (2)
x∈Ω

The first step is to write its Lagrangian dual problem as in [4]:


(D(z)) min λ + z T z (3)
αi ≥0,λ
 2 2 
αi FiT Fi − In αi FiT gi + z
s.t. 2i=1
T T
2 i=1  0.
i=1 αi gi Fi + z i=1 (gi  − 1)αi + λ
2

Combining (D(z)) with the outer minimization yields a convex relaxation of


(CC), i.e., minz v(D(z)). Then we reduce the convex relaxation to an SDP in a
new way. Notice that for any z, it holds that
  
T In −z
z z = min μ : 0 .
−z T μ
Consequently, we have
min v(D(z))
z
= min λ+μ
z,αi ≥0,λ,μ
 2 2   
αi FiT Fi T
i=1 αi Fi gi In −z
s.t. i=1  −  0,
2 T
i=1 αi gi Fi
2
i=1 (gi  − 1)αi + λ + μ
2 −z T μ
 
In −z
 0,
−z T μ
≥ min t (4)
αi ≥0,t
 2 2 
αi FiT Fi αi FiT gi
s.t. 2 i=1
T
2 i=1  0, (5)
i=1 αi gi Fi i=1 (gi  − 1)αi + t
2

2
αi FiT Fi  In . (6)
i=1

The last inequality actually holds as an equality since one can verify that
     
A b A b In A−1 b
 0, A  In =⇒   0. (7)
bT c bT c bT A−1 bT A−2 b
Denote by (SDP) the SDP relaxation (4)–(6). Let (α1∗ , α2∗ ) be an optimal
solution of (SDP). Then, according to (7), the optimal solution argminz v(D(z))
is recovered by
2 −1 2

z = − αi∗ FiT Fi αi∗ FiT gi . (8)


i=1 i=1

Now, we establish the approximate bound of (SDP).


138 X. Cen et al.

Theorem 1. Let z be the recovered solution (8). Then,


√ 2
1− γ
v(SDP) ≥ max x − z  ≥ v(CC) ≥ √
2
√ v(SDP), (9)
x∈Ω 2+ γ

where the parameter γ (0 ≤ γ < 1) is the optimal value of the following univari-
ate concave maximization problem:
 −1
γ = sup λg1 2 + (1 − λ)g2 2 − l(λ)T λF1T F1 + (1 − λ)F2T F2 l(λ),
0<λ<1

and l(λ) = λF1T g1 + (1 − λ)F2T g2 . Moreover, suppose both ellipsoids are non-
degenerate, then γ is bounded by the distance between their centers, denoted by
c1 and c2 respectively. That is,
γ ≤ min{σmax
2
(F1 ), σmax
2
(F2 )} · c1 − c2 2 . (10)
Proof. The proof is based on the following approximation bound for nonconvex
quadratic optimization.
Theorem 2 (Theorem 2.3 [12]). Consider the following nonconvex quadratic
optimization problem with k ellipsoid constraints
(EQP) maxx∈Rn g(x) = xT F0 x + 2g0T x
s.t Fi x + gi 2 ≤ 1, i = 1, . . . , k.
Suppose 0n is in the interior of the feasible region of (EQP) and the primal SDP
relaxation of (EQP), denoted by (SDR) has an optimal solution. Then, a feasible
solution x̃ of (EQP) can be generated in polynomial time such that
√ 2
1− γ
g(x̃) ≥ √ √ v(SDR),
r̃ + γ

where γ = maxi=1,...,k gi 2 and


 √ 
8k + 17 − 3
r̃ = min ,n + 1 .
2
In our case, k = 2 and so that r̃ = 2. Using the shifted approach developed in
Theorem 9 in [16], we obtain that
 
γ = min max F1 x + g1 2 , F2 x + g2 2 , (11)
x

which is further equivalent to


 
γ = min max λF1 x + g1 2 + (1 − λ)F2 x + g2 2 ,
x 0≤λ≤1
 
= max min λF1 x + g1 2 + (1 − λ)F2 x + g2 2 ,
0≤λ≤1 x
 −1
= sup λg1 2 + (1 − λ)g2 2 − l(λ)T λF1T F1 + (1 − λ)F2T F2 l(λ).
0<λ<1
On Chebyshev Center of the Intersection of Two Ellipsoids 139

Since we have assumed gi  < 1, i = 1, 2, it follows from the definition (11) that
   
γ ≤ max F1 0n + g1 2 , F1 0n + g2 2 = max g1 2 , g2 2 < 1.
Now we show (10). For the two centers c1 and c2 , we have Fi ci + gi  = 0 for
i = 1, 2. It implies from the definition (11) that

γ ≤ max {F1 c1 + g1 , F2 c1 + g2 } = F2 c2 + g2 + F2 (c1 − c2 )
≤ F2 c2 + g2  + σmax (F2 )c1 − c2  = σmax (F2 )c1 − c2 .
Similarly, we have

γ ≤ σmax (F1 )c1 − c2 .
The proof of (10) is complete.
Remark 1. The approximation bound (9) is not tight. Interestingly, when γ = 0,
one can prove that v(CC) = v(SDP).

3 Globally Solve (CC) in Polynomial Time


We develop a global optimization method for solving (CC) and show it can be
done in polynomial time.
The inner maximization optimization problem of (CC) is known as Celis-
Dennis-Tapia (CDT) subproblem in literature. Recently, there are some pro-
gresses in the CDT problem:
(CDT) min xT Q0 x + 2q0T x + γ0
s.t xT Qi x + 2qiT x + γi ≤ 0, i = 1, 2,
where Q1  0, Q2  0 and q2 is in the range space of Q2 . When n = 2, (CDT)
can be efficiently solved by adding valid second order-cone inequalities into its
semidefinite relaxation [6,17,19]. Generally, the first time to show that (CDT) is
polynomial solvable is due to Bienstock [5]. The polynomial solvability of (CDT)
was independently established in [9] under the assumption Q2  0. However,
except for theoretical contributions, both algorithms are actually impractical.
The first practical polynomial-time algorithm is due to Sakaue et al. [15]. Their
approach is to find all KKT points of (CDT) by solving eigenvalue problems.
More precisely, let μ1 and μ2 be the Lagrangian multipliers of the constraints of
(CDT), respectively. The difficulty of (CDT) occurs when both μ1 and μ2 are
nonzero. In this case, μ1 and μ2 are shown to be the solutions of two determi-
nantal equations:
detB(μ1 ) = det((D2 ⊗ C1 − C2 ⊗ D2 ) + μ1 (D2 ⊗ D1 − D1 ⊗ D2 )) = 0,
detB(μ2 ) = det((C1 ⊗ D1 − D1 ⊗ C2 ) + μ2 (D2 ⊗ D1 − D1 ⊗ D2 )) = 0,
where ⊗ is the Kronecker product, and
⎛ ⎞ ⎛ ⎞
Qi −Q0 qi O −Qi 0n
Ci := ⎝ −Q0 O q0 ⎠ , Di := ⎝ −Qi O −qi ⎠ , i = 1, 2.
qiT q0T γi 0Tn −qiT 0
140 X. Cen et al.

Solving the above determinantal equations are reduced to find the generalized
zero eigenvalues. Therefore, the computational complexity of the global algo-
rithm for solving CDT problem is at most O(n6 log log u−1 ), where u is a unit
roundoff.
Now, we focus on solving (CC), which is an unconstrained optimization prob-
lem in terms of z. Since the optimal z-solution is clearly in the interior of Ω, we
can add a redundant constraint to (CC):

v(CC) =minz f (z) := z T z + max{xT x − 2z T x} (12)
x∈Ω
n
s.t. z ∈ Q := {z ∈ R : f¯(z) := F1 z + g1 2 − 1 ≤ 0}.

One can see that the convex feasible region Q is bounded, closed and has
nonempty interior. The objective function f (z) is nonsmooth but convex. For
any given point z, let x∗ (z) be an optimal solution of maxx∈Ω {xT x − 2z T x},
which is solved as a CDT subproblem. Then, a subgradient of f (z) at any point
z is given by
g(z) = 2z − 2x∗ (z). (13)
We employ the ellipsoid method to solve the nonsmooth convex problem
(12). The algorithmic framework as shown in [14] is presented in the following,
where zc := −(F1T F1 )−1 F1T g is the center of the ellipsoid Q and ḡ(z) denotes
the gradient of f¯(z).

Ellipsoid method

0. Let y0 := zc and R := σmin1(F1 ) . Clearly, {y ∈ Rn : y − y0  ≤ R} ⊇ Q.


Initialize H0 = R2 · In and the iteration counter k = 0.
1. While the stopping criterion is not reached, do

g(yk ) if yk ∈ Q, 1 H g
gk = , yk+1 = yk −  k k ,
ḡ(yk ) if yk ∈
/ Q, n + 1 gT H g
k k k
 
n2 n Hk gk gkT Hk
Hk+1 = 2 Hk −  , k := k + 1.
n −1 n + 1 gT H g
k k k

The following convergence result of the above ellipsoid method for solving
nonsmooth convex optimization problem minz∈Q f (z) (whose optimal value and
optimal solution are denoted by f ∗ and z ∗ , respectively) can be found in [14].
Theorem 3. (Theorem 3.2.8, [14]). Let f(z) be Lipschitz continuous on ball {z ∈
Rn : z − z ∗  ≤ R} with some constant M . Assume that there exists some ρ > 0
and z̄ ∈ Q such that {z ∈ Rn : z − z̄ ≤ ρ} ⊆ Q, then for any

k > 2(n + 1)2 ln(R/ρ),


On Chebyshev Center of the Intersection of Two Ellipsoids 141

Q ∩ {y0 , y1 , . . . , yk } = ∅ and
1 − k
min f (yi ) − f ∗ ≤ M R2 · e 2(n+1)2 .
0≤j≤k,yj ∈Q ρ

As a corollary of Theorem 3, we establish the corresponding convergence


result for (12). It is not difficult to verify that {z ∈ Rn : z−zc  ≤ σmax1(F1 ) } ⊆ Q.
So, in our case, ρ = σmax1(F1 ) . For any z ∈ B ∗ := {z ∈ Rn : z − z ∗  ≤ R}, it
follows from (13) and the fact z ∗ ∈ Q that

g(z) ≤ 2(z + x∗ (z)) ≤ 2{(R + z ∗ ) + (R + zc )}


≤ 2{(R + R + zc ) + (R + zc )}
= 6R + 4zc .

Therefore, M := 6R + 4zc  is the Lipschitz constant of f (z) over B ∗ .


Now, according to Theorem 3 and the complexity for solving (CDT), we have
the following complexity result for (CC).
Corollary 1. For any > 0, the ellipsoid method can find a global -
approximation optimal solution of (CC) with the complexity at most
 
6R + 4zc  1
O(n8 log log u−1 ) log .
σmin
2 (F ) σ
1 max (F1 )

4 Generate CDT Subproblem with Positive Duality Gap


In this section, with the help of (CC), we can randomly generate CDT subprob-
lems having positive Lagrangian duality gap with high probability.
During a long-term study of CDT subproblem, only few instances with pos-
itive duality gap were reported in literature, see for example, [1,7,8,17,18,20].
Ai and Zhang [1] proposed an easy-to-check necessary and sufficient condition
for the strong duality between CDT subproblem and its Lagrangian dual. Their
condition suggested that generally there seems to be very few CDT subproblems
having positive duality gap. Their numerical results demonstrated that there are
87 instances without duality gap for 90 random instances with dimension lying
between 1 and 30.
Since the inner maximization problem of (CC) is already a CDT subprob-
lem, we can generate a CDT subproblem with positive duality gap if the SDP
relaxation of (CC) is not tight.
Proposition 1. If v(SDP) > v(CC), then the CDT subproblem (QP(z ∗ )) (2)
has a positive duality gap, where z ∗ is the optimal solution of (CC).

Proof. According to the definitions of (QP(z)) (2) and (D(z)) (3), we have

v(CC) = v(QP(z ∗ )) = min v(QP(z)) ≤ min v(D(z)) ≤ v(D(z ∗ )),


z z
142 X. Cen et al.

where the first inequality follows from weak duality and the second inequality
holds trivially.
Under the assumption v(SDP) > v(CC), it follows from the above chain of
inequalities and the definition v(SDP) = minz v(D(z)) that
v(QP(z ∗ )) < v(D(z ∗ )).
The proof is complete since (D(z ∗ )) is the Lagrangian dual problem of the CDT
subproblem (QP(z ∗ )).
We tested 1000 instances of (CC) in two and three dimensions, respectively,
where each component of the input Fi and gi (i = 1, 2) is randomly, indepen-
dently and uniformly generated in {0, 0.01, 0.02, · · · , 0.99, 1}. v(CC) and v(SDP)
are solved by the ellipsoid method in Sect. 3 and the solver CVX [11], respec-
tively. To our surprise, among the 1000 two-dimensional instances, there are
766 instances satisfying v(SDP) > v(CC). While for the 1000 three-dimensional
instances, the number of instances satisfying v(SDP) > v(CC) is 916. It implies
that, with the help of (CC) and Proposition 1, one can generate CDT subproblem
admitting positive duality gap with a high probability.
Finally, we illustrate two small examples of (CC) and the corresponding CDT
subproblems (QP(z ∗ )). For each example, we plot in Fig. 1 the exact Chebyshev
center and the
 corresponding SDP approximation, the smallest covering circle
with
 radius v(CC) and the approximated circle via SDP relation whose radius
is v(SDP). One can observe that the smaller the distance between the centers
of the two input ellipses, the tighter the SDP relaxation. It demonstrated the
relation (10) in Theorem 1.
Example 1. Let
       
10 0.94 0.01 0.88 0.51
n = 2, F1 = , g1 = , F2 = , g2 = .
01 0.19 0.72 0.39 0.15

We can calculate v(CC) = 0.8044, z ∗ = (−0.5956, −0.2890)T and v(SDP) = 1.


 √ 2
1− γ
The worst-case approximation ratio of (SDP) is √2+√γ = 0.0982. The CDT
subproblem (QP(z ∗ )) has a positive duality gap, which is equal to
v(D(z ∗ )) − v(QP(z ∗ )) = 1.2705 − 0.8044 = 0.4661.
Example 2. Let
       
0.35 0.91 0.45 0.47 0.69 0.15
n = 2, F1 = , g1 = , F2 = , g2 = .
0.40 0.40 0.32 0.89 0.66 0.87

We have v(CC) = 10.2672, z ∗ = (−0.9975, −0.1392)T and v(SDP) = 10.8632.


 √ 2
1− γ
The worst-case approximation ratio of (SDP) is √2+√γ = 0.2170. The CDT
subproblem (QP(z ∗ )) has a positive duality gap, which is equal to
v(D(z ∗ )) − v(QP(z ∗ )) = 10.9235 − 10.2672 = 0.6563.
On Chebyshev Center of the Intersection of Two Ellipsoids 143

0.5 3

2
0
+
* 1

-0.5
0 *+

-1 -1

-2
-1.5

-3

-1.5 -1 -0.5 0 0.5 1 1.5 -5 -4 -3 -2 -1 0 1 2 3 4

Fig. 1. Two examples in two dimension where the input ellipses are plotted in solid line.
The dotted and dashed circles are the Chebyshev solutions and the SDP approximation,
respectively. Chebyshev centers and the corresponding SDP approximation are marked
by ∗ and +, respectively.

Acknowledgments. This research was supported by National Natural Science Foun-


dation of China under grants 11822103, 11571029, 11771056 and Beijing Natural Sci-
ence Foundation Z180005.

References
1. Ai, W., Zhang, S.: Strong duality for the CDT subproblem: a necessary and suffi-
cient condition. SIAM J. Optim. 19(4), 1735–1756 (2009)
2. Beck, A.: Convexity properties associated with nonconvex quadratic matrix func-
tions and applications to quadratic programming. J. Optim. Theory Appl. 142(1),
1–29 (2009)
3. Beck, A., Eldar, Y.: Strong duality in nonconvex quadratic optimization with two
quadratic constraints. SIAM J. Optim. 17(3), 844–860 (2006)
4. Beck, A., Eldar, Y.: Regularization in regression with bounded noise: a Chebyshev
center approach. SIAM J. Matrix Anal. Appl. 29(2), 606–625 (2007)
5. Bienstock, D.: A note on polynomial solvability of the CDT problem. SIAM J.
Optim. 26(1), 488–498 (2016)
6. Burer, S.: A gentle, geometric introduction to copositive optimization. Math. Pro-
gram. 151(1), 89–116 (2015)
7. Burer, S., Anstreicher, K.M.: Second-order-cone constraints for extended trust-
region subproblems. SIAM J. Optim. 23(1), 432–451 (2013)
8. Chen, X., Yuan, Y.: On local solutions of the Celis-Dennis-Tapia subproblem.
SIAM J. Optim. 10(2), 359–383 (2000)
9. Consolini, L., Locatelli, M.: On the complexity of quadratic programming with two
quadratic constraints. Math. Program. 164(1–2), 91–128 (2017)
10. Eldar, Y., Beck, A.: A minimax Chebyshev estimator for bounded error estimation.
IEEE Trans. Signal Process. 56(4), 1388–1397 (2008)
11. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming
error estimation, version 2.1. (March 2014). http://cvxr.com/cvx
12. Hsia, Y., Wang, S., Xu, Z.: Improved semidefinite approximation bounds for non-
convex nonhomogeneous quadratic optimization with ellipsoid constraints. Oper.
Res. Lett. 43(4), 378–383 (2015)
144 X. Cen et al.

13. Milanese, M., Vicino, A.: Optimal estimation theory for dynamic systems with set
membership uncertainty: an overview. Automatica 27(6), 997–1009 (1991)
14. Nesterov, Y.: Introductory Lectures on Convex Optimizaiton: A Basic Course.
Kluwer Academic, Boston (2004)
15. Sakaue, S., Nakatsukasa, Y., Takeda, A., Iwata, S.: Solving generalized CDT prob-
lems via two-parameter eigenvalues. SIAM J. Optim. 26(3), 1669–1694 (2016)
16. Xia, Y., Yang, M., Wang, S.: Chebyshev center of the intersection of balls: com-
plexity, relaxation and approximation (2019). arXiv:1901.07645
17. Yang, B., Burer, S.: A two-variable approach to the two-trust-region subproblem.
SIAM J. Optim. 26(1), 661–680 (2016)
18. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14(1),
245–267 (2003)
19. Yuan, J., Wang, M., Ai, W., Shuai, T.: New results on narrowing the duality gap of
the extended Celis-Dennis-Tapia problem. SIAM J. Optim. 27(2), 890–909 (2017)
20. Yuan, Y.: On a subproblem of trust region algorithms for constrained optimization.
Math. Program. 47(1–3), 53–63 (1990)
On Conic Relaxations of Generalization
of the Extended Trust Region
Subproblem

Rujun Jiang1(B) and Duan Li2


1
School of Data Science, Fudan University, Shanghai, China
rjjiang@fudan.edu.cn
2
School of Data Science, City University of Hong Kong, Hong Kong, China
dli226@cityu.edu.hk

Abstract. The extended trust region subproblem (ETRS) of minimiz-


ing a quadratic objective over the unit ball with additional linear con-
straints has attracted a lot of attention in the last few years due to its the-
oretical significance and wide spectra of applications. Several sufficient
conditions to guarantee the exactness of its semidefinite programming
(SDP) relaxation or second order cone programming (SOCP) relaxation
have been recently developed in the literature. In this paper, we consider
a generalization of the extended trust region subproblem (GETRS), in
which the unit ball constraint in ETRS is replaced by a general, pos-
sibly nonconvex, quadratic constraint. We demonstrate that the SDP
relaxation can further be reformulated as an SOCP problem under a
simultaneous diagonalization condition of the quadratic form. We then
explore several sufficient conditions under which the SOCP relaxation of
GETRS is exact under Slater condition.

1 Introduction
We consider the following quadratically constrained quadratic programming
(QCQP) problem,
1
(P0 ) min z T Cz + cT z
2
1
s.t. z T Bz + bT z + e ≤ 0, (1)
2
AT z ≤ d,

where C and B are n×n symmetric matrices, not necessary positive semidefinite,
A is an n × m matrices, c, b ∈ Rn , e ∈ R and d ∈ Rm . Problem (P0 ) is
nonconvex since both the quadratic objective and the quadratic constraint may
Supported by Shanghai Sailing Program 18YF1401700, Natural Science Foundation
of China (NSFC) 11801087 and Hong Kong Research Grants Council under Grants
14213716 and 14202017.
c Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 145–154, 2020.
https://doi.org/10.1007/978-3-030-21803-4_15
146 R. Jiang and D. Li

be nonconvex. In fact, problem (P0 ) is NP-hard even when there is no quadratic


constraint [21].
When there are no linear constraints and the quadratic constraint (1) is a
unit ball constraint, problem (P0 ) reduces to the classical trust region subprob-
lem (TRS). The TRS first arises in the trust region method for unconstrained
optimization problems [7], and also admits important applications in robust opti-
mization [1]. Various methods have been developed to solve the TRS [10,20,23].
When there are no additional linear constraints, problem (P0 ) reduces to the
generalized trust region subproblem (GTRS), which is also a well studied subject
in the literature [2,3,9,14,15,17,19,26,27].
When the quadratic constraint (1) reduces to a unit ball constraint, problem
(P0 ) is termed the extended trust region subproblem (ETRS), which has recently
attracted much attention in the literature [5,6,8,11–13,18,27,30]. The ETRS is
nonconvex and semidefinite programming (SDP) relaxation has been a widely
used technique for solving the ETRS. However, the SDP relaxation is often
not tight enough and consequently only offers a lower bound, even for the case
with m = 1 [27]. Jeyakumar and Li [13] first provided the following dimension
condition under which the SDP relaxation is exact,

dim Ker(C − λmin (C)In ) ≥ dim span{a1 , . . . , am } + 1,

where λmin (C) stands for the minimal eigenvalue of C and [a1 , . . . , am ] = A, and
showed its immediate application in robust least squares and a robust SOCP
model problem. Hsia and Sheu [12] derived a more general sufficient condition,

rank[C − λmin (C)In , a1 , . . . , am ] ≤ n − 1.

After that, using KKT conditions of the SDP relaxation (in fact, an equiva-
lent SOCP relaxation) of the ETRS, Locatelli [18] presented a better sufficient
condition than [12], which corresponds to the solution conditions of a specific
linear system. Meanwhile, Ho-Nguyen and Kilinc-Karzan [11] also developed a
sufficient condition by identifying the feasibility of a linear system. In fact, the
two conditions in [11,12] are equivalent for the ETRS as stated in [11].
In this paper, we mainly focus on a generalization of ETRS (GETRS), which
replaces the unit ball constraint in ETRS with a general, possibly nonconvex,
quadratic constraint. To the best of our knowledge, the current literature lacks
study on the equivalence between the GETRS and its SDP relaxation. Our study
in this paper on the equivalence between the GETRS and its SDP relaxation is
motivated not only by wide applications of the GETRS, but also by its theoretical
implication to a more general class of QCQP problems. The GETRS is much
more difficult than ETRS as the feasible region of the GETRS is no longer
compact and the optimal solution may be unattainable in some cases and the
null space of C + uB in the GETRS is more complicated than that in the ETRS,
where u is the corresponding KKT multiplier of constraint (1). To introduce our
investigation of sufficient conditions when the SDP relaxation is exact, we first
define the set IP SD = {λ : C + λB  0}, which is in fact an interval [19]. Define
IP+SD = IP SD R+ , where R+ is the nonnegative orthogonal axis. We then focus
On Conic Relaxations of Generalization of the Extended 147

the condition that the set IP SD has a nonempty interior. We mainly show that
under this condition the SDP relaxation is equivalent to an SOCP reformulation.
We then derive sufficient conditions under which the SDP relaxation of problem
(P) is tight.
Notation For any index set J, we define AJ as the restriction of matrix A to
the rows indexed by J and vJ as the restriction of vector v to the entries indexed
by J. We denote by the notation J C the complementary set of J. The notation
v denotes the Euclidean norm of vector v. We use Diag(A) and diag(a) to
denote the vector formed by the diagonal entries of matrix A and the diagonal
matrix formed by vector a, respectively. And v(·) represents the optimal value
of problem (·). We use Null(A) to denote the null space of matrix A.

2 Optimality Conditions
In this section, to simplify our problem, we consider the case when Slater condi-
tion of the SDP relaxationholds and further show a sufficient exactness condition
of the SDP relaxation when IP+SD has a nonempty interior.
In this section, we consider the case IP+SD has a nonempty interior, which is
also known as the regular condition in the study of the GTRS [19,26]. In fact,
int(IP+SD ) = ∅ implies that the two matrices C and B are SD [28]. That is,
there exists a nonsingular matrix U such that U T CU and U T BU both become
diagonal matrices. Then problem (P0 ) can then be reformulated, via a change
of variables z = U x, as follows,
n
 n

1
(P) min δi x2i + εi xi
i=1
2 i=1
n  n
1
s.t. αi x2i + βi xi + e ≤ 0,
i=1
2 i=1
ĀT x ≤ d,

where δ = Diag(U T CU ), α = Diag(U T BU ), ε = U T c, β = U T b and Ā = U T A.


By invoking augmented variables yi = x2i and relaxing to yi ≥ x2i , we have the
following SOCP relaxation,
n
 n

1
(SOCP) min δi yi + εi xi
i=1
2 i=1
n  n
1
s.t. αi yi + βi xi + e ≤ 0,
i=1
2 i=1
ĀT x ≤ d,
x2i ≤ yi , i = 1, . . . , n.

The equivalence of (SOCP) and (SDP) is obvious and thus we only need to focus
on identifying the exactness of (SOCP).
148 R. Jiang and D. Li

It is well known that under Slater condition any optimal solution of convex
problems must satisfy KKT conditions [4]. This fact enables us to find sufficient
conditions that guarantee the exactness of the SDP relaxation. Let us denote the
jth column of matrix Ā by aj . Then the KKT conditions of the convex problem
(SOCP) are given as follows:
1
2 (δi + uαi ) − wi = 0, i = 1 . . . , n,
m
εi + uβi + j=1 vj aji + wi xi = 0, i = 1 . . . , n,
n 1 n
i=1 2 αi yi + i=1 βi xi + e ≤ 0,
j T
(ā ) x ≤ dj , j = 1, . . . , m,
x2i ≤ yi , n i = 1, . . . , n, (2)
n
u( i=1 12 αi yi + i=1 βi xi + e) = 0
vj ((āj )T x − dj ) = 0 j = 1, . . . , m,
wi (x2i − yi ) = 0, i = 1, . . . , n,
u, vj , wi ≥ 0 j = 1, . . . , m, i = 1, . . . , n,
n n
where u is the KKT multiplier of the constraint i=1 12 αi yi + i=1 βi xi + e ≤ 0,
vj is the KKT multiplier of the constraint (āj )T x ≤ dj , j = 1, . . . , m, and wi is
the KKT multiplier of the constraint x2i ≤ yi , i = 1, . . . , n.
The following lemma shows that the SDP relaxation is always bounded from
below and the optimal solution is attainable if int(IP+SD ) = ∅ and problem (P)
is feasible, which is weaker than Slater condition of the original problem (P).
Lemma 1 If int(IP+SD ) = ∅ and problem (P) is feasible, then the SDP relaxation
of (P) is bounded from below and the optimal value is attainable.

Proof. Consider the following Lagrangian dual problem of (P) ([24,25]), which
is also the conic dual problem of (SDP),

(L) max − τ /2 + ue − dT v
u,v,τ
 
C + uB ε + uβ + Av
s.t. M :=  0.
(ε + uβ + Av)T τ
u ≥ 0, v ≥ 0.

Since int(IP+SD ) = ∅, we can always find some (v, τ ) such that the matrix M
is positive semidefinite for any u ∈ int(IP+SD ). In fact, for any u ∈ int(IP+SD )
we have C + uB 0 and thus ∃τ ≥ 0 such that M 0 for every v ≥ 0, e.g.,
τ = (ε + uβ + Av)T (C + uB)−1 (ε + uβ + Av) + 1. This means (τ, u, v) satisfies
Slater condition for problem (L). As Slater condition of problem (SDP ) implies
its feasibility, we have v(SDP) ≤ +∞. And problem (L) is bounded from above
due to weak duality, i.e., v(D) ≤ v(SDP). Hence from strong duality, the optimal
value of the SDP relaxation is equivalent to problem (L) and the objective value
is attainable [4]. 

For any u ∈ IP+SD , let us define J(u) = {i : δi + uαi = 0, i = 1, . . . , n}. We


will use J instead of J(u) for simplicity if it does not cause any confusion. So
On Conic Relaxations of Generalization of the Extended 149

we have AJ = [a1J , . . . , am
J ], where the superscribe means the column index. We
next show a sufficient condition, which is a generalization of the result in [18],
to guarantee the exactness of the SDP relaxation.
Condition 2 The interior of IP+SD is not empty. For any u ∈ ∂IP+SD , if J = ∅,
then {v : εJ + uβJ + ĀJ v = 0} ∩ Rm
+ = ∅.

Theorem 3 Assume that Slater condition holds for problem (SOCP). If Condi-
tion 2 holds, the SDP relaxation is exact and the optimal values of both the SDP
relaxation and problem (P) are attainable.

Proof. From Lemma 1, we obtain that (SOCP) is bounded from below and
the optimal solution is attainable. Then due to Slater condition, every optimal
solution of (SOCP) must be a KKT solution of system (2). So we have the
following two cases:

1. If u ∈ ∂IP+SD , then either J = ∅ or J = ∅. For the first case, 21 (δi +uαi )−wi =
0 implies that wi = 12 (δi + uαi ) > 0. This, together with complementary
slackness wi (x2i − yi ) = 0, implies that x2i = yi , i.e., (SOCP) is already exact.
For the latter case, the KKT condition 12 (δi + uαi ) − wi = 0, i = 1, . . . , n,
implies that wi = 0, ∀i ∈ J. But Condition 2 shows {v : J + uβJ + ĀJ v =
0} ∩ Rm + = ∅, i.e., there is no KKT solution satisfying the second equations
in (2) in this case.
2. Otherwise, u ∈ int(IP+SD ) and wi = 12 (δi + uαi ) > 0 for all u ∈ int(IP+SD ).
By the complementary slackness wi (x2i − yi ) = 0, we have x2i − yi = 0, ∀i =
1, . . . , n, and thus the SOCP relaxation is exact.

As the optimal value of SDP relaxation is attainable, so is problem (P). 

Let us consider now the following illustrative example In this problem, IP+SD =
[1, 2] is an interval and ∂IP+SD = {1, 2}. One may check that Condition 2
is satisfied. The optimal value of  the SDP relaxation
 is −2.44082 with x =
T 1.6660 1.0534
(−1.2907 − 0.8161) and X = . It is easy to verify that
1.0534 0.6660
X = xxT and the SDP relaxation is exact.
Motivated by the perturbation condition in Theorem 3.1 in [18], we propose
the following condition to extend Condition 2.
Condition 4 The interior of IP+SD . For any u ∈ ∂IP+SD , if J = ∅, then ∀ > 0,
∃ η ∈ RJ such that η  ≤ and {v : εJ + η + uβJ + ĀJ v = 0} ∩ Rm+ = ∅.

Condition 4 will also guarantee the exactness of the SDP relaxation under the
same mild assumptions as in Theorem 3.

Theorem 5 Assume that Slater condition holds for problem (SOCP). If Condi-
tion 4 holds, the SDP relaxation is exact and the optimal values of SDP relaxation
and problem (P) both are attainable.

Detailed proof please see our working paper [16].


150 R. Jiang and D. Li

Remark 6 When B reduces to the identical matrix, problem (P) reduces to the
ETRS and an exactness condition is given in [18]. The difficulty in our proof,
compared to the results in [18], mainly comes from the possible non-compactness
of the feasible region.

Now let us consider another illustrative example,

(P2 ) min − x21 + 2x22


x21 − x22 ≤ 1,
x1 + x2 ≤ 1.

In the above example, IP+SD = [1, 2] is an interval and ∂IP+SD = {1, 2}. It is easy
to verify that Condition 2 is not fulfilled but
√ Condition 4 is fulfilled for any > 0
and η = t(1 1)T , where t ∈ R and t ≤ 2 /2.  The optimal value of the SDP
T 10
relaxation is −1 with x = (−1 0) and X = . So we have X = xxT and
00
the SDP relaxation is exact.
In fact, Condition 2 holds if and only if the following linear programming
problem has no solution,

(LP) min 0
v
s.t. εJ + uβJ + ĀJ v = 0,
v ≥ 0.

The duality of problem (LP) is

(LD) max −(εJ + uβJ )T y


y

s.t. ĀTJ y ≤ 0.

We show in the following lemma a nontrivial observation from strong duality of


linear programming. The proof is similar to Proposition 3.3 in [18] but we give
one here for completeness.
Lemma 7 Condition 2 is fulfilled if and only if (LD) is unbounded from above,
and when Condition 2 fails, Condition 4 holds if and only if (LD) has multiple
optimal solutions.

Proof. The first statement follows directly from the infeasibility of (LP) and
strong duality. Condition 4 is equivalent to that ∀ > 0, ∃ η ∈ RJ with η  ≤
and {v : εJ + η + uβJ + ĀJ v = 0} ∩ Rm+ = ∅, i.e.,

(LP ) min 0
v
s.t. εJ + η + uβJ + ĀJ v = 0,
v ≥ 0.
On Conic Relaxations of Generalization of the Extended 151

The duality of problem (LP ) is

(LD ) max −(εJ + η + uβJ )T y


y

s.t. ĀTJ y ≤ 0.

Similarly, we conclude that (LD ) is unbounded from above 

The above lemma shows: Condition 2 holds if and only if (LD) is unbounded
from above, which is equivalent to that there exists a nonzero ȳ such that ĀTJ ȳ ≤
0 and −(εJ + uβJ )T ȳ > 0. And thus by defining ỹ = kȳ, we have ĀTJ ỹ ≤ 0 and
−(εJ + uβJ )T ỹ → ∞ as k → ∞; On the other hand, when Condition 2 fails,
Condition 4 holds if and only if there exists a nonzero ȳ such that ĀTJ ȳ ≤ 0 and
−(εJ + uβJ )T ȳ = 0. The above two statements can be simplified as: There exists
a nonzero ȳ such that ĀTJ ȳ ≤ 0 and (εJ + uβJ )T ȳ ≤ 0. Assume θ ∈ Rn with
θJ C = 0, θJ = ȳ. Then we hvae ĀT θ = ĀTJ y ≤ 0, (ε + uβ)T θ = (εJ + uβJ )T ȳ ≤
0, which is also equivalent to, by defining θ = U z, that ∃z ∈ Rn such that
(C + uB)z = 0, AT z ≤ 0 and (ε + uβ)T z ≤ 0. (Note that U is the congruent
matrix such that U T CU = diag(δ) and U T BU = diag(α) as mentioned in the
beginning of this section.) The above implication suggests that Conditions 2 and
4 can be combined as the following condition.
Condition 8 For any u ∈ ∂IP+SD , if Null(C + uB) = ∅, there exists a nonzero
z ∈ Rn such that (C + uB)z = 0, AT z ≤ 0 and (ε + uβ)T z ≤ 0.
Then we can summarise our main result under the condition int(IP+SD ) = ∅ in
the following theorem.
Theorem 9 When int(IP+SD ) = ∅ and Condition 8 holds, the SDP relaxation is
exact. Moreover, if Slater condition holds, both problem (P) and its SDP relax-
ation are bounded from below and the optimal values are attainable.
An advantage of Condition 8 is that it can be directly checked by the original
data set, i.e., we do not need invoke the congruence transformation to get the SD
form. In particular, when the quadratic constraint reduces to the unit ball con-
straint, problem (P) reduces to the ETRS, and Condition 8 reduces to Condition
2.1 in [11], i.e., there exists a nonzero vector z such that(C + λmin (C)I)z = 0,
AT z ≤ 0 and cT z ≤ 0; Conditions 2 and 4 reduces to (13) and (14) in [18]. As a
result, for problem (BP), Condition 2.1 in [11] is equivalent to (13) and (14) in
[18], which was also indicated in [11].
Together with the fulfillment of Slater condition, we further have the following
S-lemma with linear inequalities.
Theorem 10 (S-lemma with linear inequalities) Assume that there exists (X, x)
such that 12 B • X + bT x + e ≤ 0, AT x ≤ d and X xxT , int(IP+SD ) = ∅ and
Condition 8 holds. Then the following two statements are equivalent:
(i) 12 xT Bx + bT x + e ≤ 0 and AT x ≤ d ⇒ 12 xT Cx + cT x + γ ≥ 0.
(ii) ∃u, v1 , . . . , vm ≥ 0, ∀x ∈ Rn , 12 xT Cx + cT x + f + u( 12 xT Bx + bT x + e) +
v T (AT x − d) ≥ 0.
152 R. Jiang and D. Li

Proof. It is obvious that (ii) ⇒ (i). Next let us prove (i) ⇒ (ii). From Theorem 9,
we obtain that the SDP relaxation is bounded from below. So the SDP relaxation
is equivalent to the Lagrangian duality of problem (P) [24]. Hence,
1 T 1
max min L(x, u, v) := x Cx + cT x + u( xT Bx + bT x + e) + v T (Ax − d)
u≥0,v≥0 x 2 2
= v(SDP)
= v(P)
1 1
= min{ xT Cx + cT x : xT Bx + bT x + e ≤ 0, Ax ≤ d}.
x 2 2

Thus, min{ 12 xT Cx + cT x : 12 xT Bx + bT x + e ≤ 0, Ax ≤ d} ≥ −γ is equivalent


x
to
1 T 1
max min L(x, u, v) := x Cx+cT x+u( xT Bx+bT x+e)+v T (Ax−d) ≥ −γ.
u≥0,v≥0 x 2 2

The latter statement implies that ∃u, v1 , . . . , vm ≥ 0, ∀x ∈ Rn , 12 xT Cx + cT x +


γ + u( 12 xT Bx + bT x + e) + v T (Ax − d) ≥ 0, which is exactly statement ii). 

Remark 11. The classical S-lemma, which first proposed by Yakubovich [29],
and its variants have a lot of applications in the real world, see the survey paper
[22]. To the best of our knowledge, our S-lemma is the most general one with
linear constraints, while the S-lemma in Jeyakumar and Li [13] is confined to a
unit ball constraint.

3 Conclusions

In this paper, we investigate sufficient conditions to guarantee the exactness of


the SDP relaxation for the GETRS. Our main contribution is to propose dif-
ferent sufficient conditions to guarantee the exactness under a regular condition
and Slater condition, based on the KKT system for the SDP relaxation of the
GETRS.
In fact, when the quadratic constraint becomes an equality in problem (P),
our sufficient conditions still guarantee, albeit with a slight modification, the
exactness of the SDP relaxation. Since the technique is similar, we omit the
details for the case of problem (P) with an equality quadratic constraint. For
future research directions, we will investigate more general sufficient conditions
to guarantee the exactness of the SDP relaxation and extend our sufficient con-
ditions in this paper to a wider class of QCQP problems.

References
1. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust optimization. Princeton Uni-
versity Press (2009)
On Conic Relaxations of Generalization of the Extended 153

2. Ben-Tal, A., den Hertog, D.: Hidden conic quadratic representation of some non-
convex quadratic optimization problems. Math. Program. 143(1–2), 1–29 (2014)
3. Ben-Tal, A., Teboulle, M.: Hidden convexity in some nonconvex quadratically con-
strained quadratic programming. Math. Program. 72(1), 51–63 (1996)
4. Boyd, S., Vandenberghe, L.: Convex optimization. Cambridge University Press
(2004)
5. Burer, S., Anstreicher, K.M.: Second-order-cone constraints for extended trust-
region subproblems. SIAM J. Optim. 23(1), 432–451 (2013)
6. Burer, S., Yang, B.: The trust region subproblem with non-intersecting linear con-
straints. Math. Program. 149(1–2), 253–264 (2015)
7. Conn, A.R., Gould, N.I., Toint, P.L.: Trust Region Methods, vol. 1. Society for
Industrial and Applied Mathematics (SIAM), Philadelphia (2000)
8. Fallahi, S., Salahi, M., Karbasy, S.A.: On SOCP/SDP formulation of the extended
trust region subproblem (2018). arXiv:1807.07815
9. Feng, J.M., Lin, G.X., Sheu, R.L., Xia, Y.: Duality and solutions for quadratic
programming over single non-homogeneous quadratic constraint. J. Glob. Optim.
54(2), 275–293 (2012)
10. Hazan, E., Koren, T.: A linear-time algorithm for trust region problems. Math.
Program. 1–19 (2015)
11. Ho-Nguyen, N., Kilinc-Karzan, F.: A second-order cone based approach for solving
the trust-region subproblem and its variants. SIAM J. Optim. 27(3), 1485–1512
(2017)
12. Hsia, Y., Sheu, R.L.: Trust region subproblem with a fixed number of additional
linear inequality constraints has polynomial complexity (2013). arXiv:1312.1398
13. Jeyakumar, V., Li, G.: Trust-region problems with linear inequality constraints:
exact SDP relaxation, global optimality and robust optimization. Math. Program.
147(1–2), 171–206 (2014)
14. Jiang, R., Li, D.: Novel reformulations and efficient algorithm for the generalized
trust region subproblem (2017). arXiv:1707.08706
15. Jiang, R., Li, D.: A linear-time algorithm for generalized trust region problems
(2018). arXiv:1807.07563
16. Jiang, R., Li, D.: Exactness conditions for SDP/SOCP relaxations of generalization
of the extended trust region subproblem. Working paper (2019)
17. Jiang, R., Li, D., Wu, B.: SOCP reformulation for the generalized trust region sub-
problem via a canonical form of two symmetric matrices. Math. Program. 169(2),
531–563 (2018)
18. Locatelli, M.: Exactness conditions for an SDP relaxation of the extended trust
region problem. Optim. Lett. 10(6), 1141–1151 (2016)
19. Moré, J.J.: Generalizations of the trust region problem. Optim. Methods Softw.
2(3–4), 189–209 (1993)
20. Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat.
Comput. 4(3), 553–572 (1983)
21. Pardalos, P.M.: Global optimization algorithms for linearly constrained indefinite
quadratic problems. Comput. Math. Appl. 21(6), 87–97 (1991)
22. Pólik, I., Terlaky, T.: A survey of the s-lemma. SIAM Rev. 49(3), 371–418 (2007)
23. Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems
with applications to large scale minimization. Math. Program. 77(1), 273–299
(1997)
24. Shor, N.Z.: Quadratic optimization problems. Sov. J. Comput. Syst. Sci. 25(6),
1–11 (1987)
154 R. Jiang and D. Li

25. Shor, N.: Dual quadratic estimates in polynomial and boolean programming. Ann.
Oper. Res. 25(1), 163–168 (1990)
26. Stern, R.J., Wolkowicz, H.: Indefinite trust region subproblems and nonsymmetric
eigenvalue perturbations. SIAM J. Optim. 5(2), 286–313 (1995)
27. Sturm, J.F., Zhang, S.: On cones of nonnegative quadratic functions. Math. Oper.
Res. 28(2), 246–267 (2003)
28. Uhlig, F.: Definite and semidefinite matrices in a real symmetric matrix pencil.
Pac. J. Math. 49(2), 561–568 (1973)
29. Yakubovich, V.A.: S-procedure in nonlinear control theory. Vestnik Leningrad Uni-
versity, vol. 1, pp. 62–77 (1971)
30. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14(1),
245–267 (2003)
On Constrained Optimization Problems
Solved Using the Canonical Duality
Theory

Constantin Zălinescu1,2(B)
1
University “Al. I. Cuza” Iasi, Bd. Carol I 11, Iasi, Romania
zalinesc@uaic.ro
2
Octav Mayer Institute of Mathematics, Bd. Carol I 8, Iasi, Romania

Abstract. D.Y. Gao together with some of his collaborators applied his
Canonical duality theory (CDT) for solving a class of constrained opti-
mization problems. Unfortunately, in several papers on this subject there
are unclear statements, not convincing proofs, or even false results. It is
our aim in this work to study rigorously this class of constrained opti-
mization problems in finite dimensional spaces and to point out several
false results published in the last ten years.

1 Preliminaries
We consider the following constrained minimization problem
(PJ ) min f (x) s.t. x ∈ XJ ,
where J ⊂ 1, m,
 
XJ := x ∈ Rn | [∀j ∈ J : gj (x) = 0] ∧ [∀j ∈ J c : gj (x) ≤ 0]

with J c := 1, m \ J, f := g0 and
 
gk (x) := qk (x) + Vk (Λk (x)) x ∈ Rn , k ∈ 0, m ,

qk and Λk being quadratic functions on Rn , and Vk ∈ Γsc := Γsc (R) for k ∈ 0, m.


To be more precise, we take

qk (x) := 1
2 x, Ak x−bk , x+ck ∧ Λk (x) := 1
2 x, Ck x−dk , x+ek (x ∈ Rn )

with Ak , Ck ∈ Sn , bk , dk ∈ Rn (seen as column matrices), and ck , ek ∈ R for


k ∈ 0, m, where Sn denotes the set of n × n real symmetric matrices; of course,
c0 can be taken to be 0.
Γsc (Rp ) is the class of those functions h : Rp → R := R ∪ {−∞, +∞} which
are essentially strictly convex and essentially smooth, that is the class of proper
lsc convex functions of Legendre type (see [1, Sect. 26]). For h ∈ Γsc (Rp ) we
have: h∗ ∈ Γsc (Rp ), dom ∂h = int(dom h), and h is differentiable on int(dom h),
where the conjugate h∗ of h is defined by h∗ (σ) := sup{y, σ − h(y) | y ∈
c Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 155–163, 2020.
https://doi.org/10.1007/978-3-030-21803-4_16
156 C. Zălinescu

Rp } ∈ R; moreover, ∇h : int(dom h) → int(dom h∗ ) is bijective and continuous


−1
with (∇h) = ∇h∗ . It follows that Γsc := Γsc (R) is the class of those proper
convex and lsc functions h : R → R with the property that h and h∗ are strictly
convex and derivable on the interior of their domains; hence h : int(dom h) →
int(dom h∗ ) is continuous, bijective and (h )−1 = (h∗ ) whenever h ∈ Γsc .
The problem (P1,m ) [resp. (P∅ )], denoted by (Pe ) [resp. (Pi )], is a minimiza-
tion problem with equality [resp. inequality] constraints whose feasible set is
Xe := X1,m [resp. Xi := X∅ ].
In many examples considered by D.Y. Gao and his collaborators, some func-
tions gk are quadratic, that is gk = qk ; to take this situation into account we
set
Q := {k ∈ 0, m | gk = qk }, Q0 := Q \ {0} = 1, m ∩ Q.
For k ∈ Q we take Λk := 0 and Vk (t) := 12 t2 for t ∈ R; then clearly Vk∗ = Vk ∈
Γsc . Clearly, Ck = 0 ∈ Sn , dk = 0 ∈ Rn and ek = 0 ∈ R for k ∈ Q. We use also
the notations
m
Ik := domVk , Ik∗ := domVk∗ (k ∈ 0, m), I ∗ := k=0 Ik∗ ; (1)
of course, Ik = Ik∗ = R for k ∈ Q. In order to simplify the writing, in the sequel
λ0 := λ0 := 1.
To the functions f (= g0 ) and (gj )j∈1,m we associate several sets and func-
tions. The Lagrangian L : X × Rm → R is defined by
m m
L(x, λ) := f (x) + λj gj (x) = λk [qk (x) + Vk (Λk (x))] ,
j=1 k=0

where λ := (λ1 , ..., λm ) ∈ R , and


T m

  m
X : = x ∈ Rn | ∀k ∈ 0, m : Λk (x) ∈ domVk = k=0 Λ−1 (domVk ) ,
  k
X0 := x ∈ R | ∀k ∈ 0, m : Λk (x) ∈ int(domVk ) ⊂ intX;
n

clearly X0 is open and L is differentiable on X0 . Using Gao’s procedure, we


consider the “extended Lagrangian” Ξ associated to f and (gj )j∈1,m :
m
Ξ : Rn ×R1+m ×I ∗ → R, Ξ(x, λ, σ) := λk [qk (x) + σk Λk (x) − Vk∗ (σk )] ,
k=0

where I is defined in (1) and σ := (σ0 , σ1 , ..., σm ) ∈ R × Rm = R1+m . Clearly,
Ξ(·, λ, σ) is a quadratic function for every fixed (λ, σ) ∈ Rm × I ∗ .
Considering the mappings G : Rm × R1+m → Sn , F : Rm × R1+m → Rn ,
E : Rm × R1+m → R defined by
m m
G(λ, σ) := λk (Ak + σk Ck ), F (λ, σ) := λk (bk + σk dk ),
k=0 k=0
m
E(λ, σ) := λk (ck + σk ek ),
k=0

for (λ, σ) ∈ R × I we have that
m

m
Ξ(x, λ, σ) = 1
2 x, G(λ, σ)x − F (λ, σ), x + E(λ, σ) − λk Vk∗ (σk ). (2)
k=0
On Constrained Optimization Problems Solved Using CDT 157

Remark 1. Note that G, F and E do not depend on σk for k ∈ Q. Moreover, G,


F and E are affine functions when 1, m ⊂ Q, that is Q0 = 1, m.
For λ ∈ Rm and J ⊂ 1, m we set

M= (λ) := {j ∈ 1, m | λj = 0}, 0


M= (λ) := M= (λ) ∪ {0}

and
 
ΓJ := λ ∈ Rm | λj ≥ 0 ∀j ∈ J c ⊃ Rm
+ := {λ ∈ R | λj ≥ 0 ∀j ∈ 1, m },
m

respectively; clearly,

Γ∅ = Rm
+, Γ1,m = Rm , ΓJ∩K = ΓJ ∩ ΓK ∀J, K ⊂ 1, m.

Useful relations among the Lagrangian L, the extended Lagrangian Ξ, and


the objective function f are provided in the next result.
Lemma 1. Let x ∈ X and J ⊂ 1, m. Then

L(x, λ) = sup Ξ(x, λ, σ) ∀λ ∈ ΓJ∩Q ,


σ∈IJ,Q

where

m {0} if k ∈ J ∩ Q,
IJ,Q := Ik∗∗ with Ik∗∗ :=
k=0 Ik∗ if k ∈ 0, m \ (J ∩ Q),
and

f (x) if x ∈ XJ∩Q ,
sup Ξ(x, λ, σ) = sup L(x, λ) =
(λ,σ)∈ΓJ∩Q ×IJ,Q λ∈ΓJ∩Q ∞ if x ∈ X \ XJ∩Q .

We consider also the sets


 
TQ := (λ, σ) ∈ Rm × I ∗ | det G(λ, σ) = ∅ ∧ [∀k ∈ Q : σk = 0] ,
 
TQ,col : = (λ, σ) ∈ Rm × I ∗ | F (λ, σ) ∈ Im G(λ, σ) ∧ [∀k ∈ Q : σk = 0] ⊇ TQ ,
 
TQJ+ := (λ, σ) ∈ TQ | λ ∈ ΓJ∩Q , G(λ, σ)  0 ,
 
J+
TQ,col := (λ, σ) ∈ TQ,col | λ ∈ ΓJ∩Q , G(λ, σ)  0 ⊇ TQJ+ ,

as well as the sets

T := T∅ , Tcol := T∅,col , T + := T∅∅+ , +


Tcol ∅+
:= T∅,col ;

in general TQJ+ and TQ,col


J+ +
are not convex, unlike their corresponding sets Y + , Ycol
+
and S + , Scol from [2,3], respectively. However, Tcol , TQJ+ and TQ,col
J+
are convex
whenever Q0 = 1, m. In the present context it is natural (in fact necessary) to
take λ ∈ ΓQ0 .
As in [2,3], we consider the (dual objective) function

D : Tcol → R, D(λ, σ) := Ξ(x, λ, σ) with G(λ, σ)x = F (λ, σ);


158 C. Zălinescu

D is well defined by [2, Lem. 1 (ii)]. For further use consider

ξ : T → Rn , ξ(λ, σ) := G(λ, σ)−1 F (λ, σ). (3)

For (λ, σ) ∈ T we obtain that


m
D(λ, σ) = − 12 F (λ, σ), G(λ, σ)−1 F (λ, σ) + E(λ, σ) − λk Vk∗ (σk ). (4)
k=0

Taking into account (2), we have that Ξ(·, λ, σ) is [strictly] convex for (λ, σ) ∈
+
Tcol [(λ, σ) ∈ T + ], and so

D(λ, σ) = minn Ξ(x, λ, σ) ∀(λ, σ) ∈ Tcol such that G(λ, σ)  0,


x∈R

the minimum being attained uniquely at ξ(λ, σ) when, moreover, G(λ, σ)  0.


The next result shows that the so-called “complimentary-dual principle” (or
“perfect duality formula”) holds under mild assumptions on the data.

Proposition 1. Let (x, λ, σ) ∈ R × R × I be such that ∇x Ξ(x, λ, σ) = 0,
n m

∂σ0 (x, λ, σ) = 0, and λ, ∇λ Ξ(x, λ, σ) = 0. Then (λ, σ) ∈ Tcol and


∂Ξ

f (x) = Ξ(x, λ, σ) = D(λ, σ). (5)

Other relations between L and Ξ are provided by the next result.

Lemma 2. Let (x, λ, σ) ∈ X0 × Rm × intI ∗ be such that ∇σ Ξ(x, λ, σ) = 0 and


σ k = 0 for k ∈ Q. Then L(x, λ) = Ξ(x, λ, σ) and ∇x L(x, λ) = ∇x Ξ(x, λ, σ).
Moreover, for j ∈ 1, m, ∂λ
∂L
j
(x, λ) ≥ ∂λ
∂Ξ
j
(x, λ, σ), with equality if j ∈ M= (λ)∪Q0 ;
in particular ∇λ L(x, λ) = ∇λ Ξ(x, λ, σ) if M= (λ) ⊃ Qc0 (= 1, m \ Q).

Observe that T ∩ (Rm × intI ∗ ) ⊂ intT , and for any σ ∈ I ∗ we have that the
set {λ ∈ Rm | (λ, σ) ∈ T } is open. Similarly to the computation of ∂D(λ)
∂λj in [2,
p. 5], using the expression of D(λ, σ) in (4), we get

∂D(λ, σ)
= qj (ξ(λ, σ)) + σj Λj (ξ(λ, σ)) − Vj∗ (σj ) ∀j ∈ 1, m, ∀(λ, σ) ∈ T,
∂λj

and
∂D(λ, σ)
= λk [Λk (ξ(λ, σ)) − Vk∗ (σk )] ∀k ∈ 0, m, ∀(λ, σ) ∈ T ∩ (Rm × intI ∗ ).
∂σk

Lemma 3. Let (λ, σ) ∈ (Rm × intI ∗ ) ∩ T and set x := ξ(λ, σ). Then

∇x Ξ(x, λ, σ) = 0 ∧ ∇λ Ξ(x, λ, σ) = ∇λ D(λ, σ) ∧ ∇σ Ξ(x, λ, σ) = ∇σ D(λ, σ).

In particular, (x, λ, σ) is a critical point of Ξ if and only if (λ, σ) is a critical


point of D.
On Constrained Optimization Problems Solved Using CDT 159

Similarly to [2], we say that (x, λ) ∈ X0 × Rm is a J-LKKT point of L if


∇x L(x, λ) = 0 and
 
∀j ∈ J c : λj ≥ 0 ∧ ∂λ
∂L
j
(x, λ) ≤ 0 ∧ λ j
∂L
∂λj
(x, λ) = 0 ∧ ∀j ∈ J : ∂L
∂λj
(x, λ) = 0 ,

or, equivalently,

x ∈ XJ ∧ λ ∈ ΓJ ∧ ∀j ∈ J c : λj gj (x) = 0 ;

moreover, we say that x ∈ X0 is a J-LKKT point of (PJ ) if there exists λ ∈ Rm


such that (x, λ) is a J-LKKT point of L. Inspired by these notions, we say
that (x, λ, σ) ∈ Rn × Rm × intI ∗ is a J-LKKT point of Ξ if ∇x Ξ(x, λ, σ) = 0,
∇σ Ξ(x, λ, σ) = 0, ∂λ
∂Ξ
j
(x, λ, σ) = 0 for all j ∈ J and

∀j ∈ J c : λj ≥ 0 ∧ ∂Ξ
∂λj (x, λ, σ) ≤ 0 ∧ λj ∂λ
∂Ξ
j
(x, λ, σ) = 0,

and (λ, σ) ∈ (Rm × intI ∗ ) ∩ T is a J-LKKT point of D if ∇σ D(λ, σ) = 0 and


 
∀j ∈ J c : λj ≥ 0 ∧ ∂λ
∂D
j
(λ, σ) ≤ 0 ∧ λ j
∂D
∂λj
(λ, σ) = 0 ∧ ∀j ∈ J : ∂D
∂λj
(λ, σ) = 0 .

In the case in which J = ∅ we obtain the notions of KKT points for Ξ and
D. So, (x, λ, σ) ∈ Rn × Rm × intI ∗ is a KKT point of Ξ if ∇x Ξ(x, λ, σ) = 0,
∇σ Ξ(x, λ, σ) = 0 and

λ ∈ Rm + ∧ ∇λ Ξ(x, λ, σ) ∈ R− ∧
m
λ, ∇λ Ξ(x, λ, σ) = 0, (6)

where Rm − := {λ ∈ R
m
| λj ≤ 0 ∀j ∈ 1, m }, and (λ, σ) ∈ Rm × intI ∗ is a KKT
point of D if ∇σ D(λ, σ) = 0 and

λ ∈ Rm+ ∧ ∇λ D(λ, σ) ∈ R− ∧
m
λ, ∇λ D(λ, σ) = 0.

The definition of a KKT point for Ξ is suggested in the proof of [4, Th. 3].
Observe that (x, λ, σ) verifying the conditions in (6) is called critical point of Ξ
in [5, p. 477].
Corollary 1. Let (λ, σ) ∈ (Rm × intI ∗ ) ∩ T .
(i) If x := ξ(λ, σ), then (x, λ, σ) is a J-LKKT point of Ξ if and only if
(λ, σ) is a J-LKKT point of D.
(ii) If M= (λ) = 1, m, then (x, λ, σ) is a J-LKKT point of Ξ if and only if
(x, λ, σ) is a critical point of Ξ, if and only if x = ξ(λ, σ) and (λ, σ) is a critical
point of D.
Remark 2. Taking into account Remark 1, as well as (3) and Lemma 3, the
functions ∇x Ξ, ξ, ∇σ D do not depend on σk for k ∈ Q. Consequently, if (x, λ, σ)
is a J-LKKT point of Ξ then σ k = 0 for k ∈ Q ∩ M= (λ), and (x, λ, σ̃) is also
a J-LKKT point of Ξ, where σ̃k := 0 for k ∈ Q and σ̃k := σ k for k ∈ 0, m \ Q.
Conversely, taking into account that ∇σ D does not depend on σk for k ∈ Q, if
(λ, σ) ∈ T is a J-LKKT point of D then (λ, σ̃) is also a J-LKKT point of D,
where σ̃k := 0 for k ∈ Q and σ̃k := σ k for k ∈ 0, m \ Q.
160 C. Zălinescu

Having in view the previous remark, without loss of generality, in the sequel
(if not mentioned otherwise) we shall assume that σ k = 0 for k ∈ Q when
(x, λ, σ) ∈ Rn × Rm × intI ∗ is a J-LKKT point of Ξ, or (λ, σ) ∈ T is a J-LKKT
point of D.

2 The Main Result

The main result of the paper is the next one; in it we can see the roles of different
hypotheses in getting the main conclusion, that is the min-max duality formula
provided by Eq. (7).

Proposition 2. Let (x, λ, σ) ∈ Rn × Rm × intI ∗ be a J-LKKT point of Ξ such


that σ k = 0 for k ∈ Q.
(i) Then λ ∈ ΓJ , (λ, σ) ∈ TQ,col , λ, ∇λ Ξ(x, λ, σ) = 0, L(x, λ) = Ξ(x, λ, σ),
∇x L(x, λ) = 0, and (5) holds.
(ii) Moreover, assume that Qc0 ⊂ M= (λ). Then ∇λ L(x, λ) = ∇λ Ξ(x, λ, σ),
(x, λ) is a J-LKKT point of L and x ∈ XJ∪Qc0 .
(iii) Furthermore, assume that λj > 0 for all j ∈ Qc0 and G(λ, σ)  0. Then
x ∈ XJ∪Qc0 ⊂ XJ ⊂ XJ∩Q , (λ, σ) ∈ TQ,col J+
, and

f (x) = inf f (x) = Ξ(x, λ, σ) = L(x, λ) = sup D(λ, σ) = D(λ, σ);


x∈XJ∩Q J+
(λ,σ)∈TQ,col
(7)
moreover, if G(λ, σ)  0 then x is the unique global solution of problem (PJ∩Q ).

The variant of Proposition 2 in which Q is not taken into consideration, that


is the case when one does not observe that Vk ◦ Λk = 0 for some k (if any), is
much weaker; however, the conclusions coincide for Q = {0}.

Proposition 3. Let (x, λ, σ) ∈ Rn × Rm × intI ∗ be a J-LKKT point of Ξ.


(i) Then λ ∈ ΓJ , (λ, σ) ∈ Tcol , λ, ∇λ Ξ(x, λ, σ) = 0, L(x, λ) = Ξ(x, λ, σ),
∇x L(x, λ) = 0, and (5) holds.
(ii) Assume that M= (λ) = 1, m. Then ∇λ L(x, λ) = ∇λ Ξ(x, λ, σ) = 0, whence
(x, λ, σ) is a critical point of Ξ, (x, λ) is a critical point of L, and x ∈ Xe ⊂
XJ ⊂ Xi .
(iii) Assume that λ ∈ Rm ++ := int R+ and G(λ, σ)  0. Then x ∈ Xe ,
m

(λ, σ) ∈ Tcol and


+

f (x) = inf f (x) = Ξ(x, λ, σ) = L(x, λ) = sup D(λ, σ) = D(λ, σ);


x∈Xi +
(λ,σ)∈Tcol

moreover, if G(λ, σ)  0 then (λ, σ) ∈ T + and x is the unique global solution of


problem (Pi ).

The remark below refers to the case Q = ∅. A similar remark (but a bit less
dramatic) is valid for Q0 = ∅.
On Constrained Optimization Problems Solved Using CDT 161

Remark 3. It is worth observing that given the functions f , g1 , ..., gm of type


q + V ◦ Λ with q, Λ quadratic functions and V ∈ Γsc , for any choice of J ⊂ 1, m
one finds the same x using Proposition 3 (iii). So, in practice, if one wishes to
solve one of the problems (Pe ), (Pi ) or (PJ ) using CDT, it is sufficient to find
those critical points (x, λ, σ) of Ξ such that λ ∈ Rm
++ and G(λ, σ)  0; if we are
successful, x ∈ Xe and x is the unique solution of (Pi ), and so x is also solution
for all problems (PJ ) with J ⊂ 1, m; moreover, (λ, σ) is a global maximizer of
+
D on Tcol .

The next example shows that the condition Qc0 ⊂ M= (λ) is essential for
x to be a feasible solution of problem (PJ ); moreover, it shows that, unlike
J+
the quadratic case (see [2, Prop. 9]), it is not possible to replace TQ,col by
{(λ, σ) ∈ Tcol | λ ∈ ΓJ , G(λ, σ)  0} in (7). The problem is a particular
case of the one considered in [7, Ex. 1], “which is very simple, but important
in both theoretical study and real-world applications since the constraint is a
so-called double-well function, the most commonly used nonconvex potential in
physics and engineering sciences [7]”;1 more precisely, q := 1, c := 6, d := 4,
e := 2.

Example 1. Let us take n = m = 1, J ⊂ {1}, q0 (x) := 12 x2 −6x, Λ1 (x) := 12 x2 −4,


q1 (x) := Λ0 (x) := 0, V0 (t) := V1 (t) + 2 := 12 t2 for x, t ∈ R. Then f (x) = 12 x2 − 6x
 2
and g1 (x) = 12 12 x2 − 4 − 2. Hence Q = {0} (whence Q0 = ∅) and Xe =
√ √ √ √
{−2 3, 2 3, −2, 2} ⊂ [−2 3, −2] ∪ [2, 2 3] = Xi . Taking σ := (σ0 , σ1 ),
  
Ξ(x; λ; σ) = 12 x2 − 6x − 12 σ02 + λ σ1 12 x2 − 4 − 12 σ12 − 2 .

We have that G(λ; σ) = 1 + λσ1 , Tcol = T = {(λ; σ) ∈ R × R2 | 1 + λσ1 = 0} and


18  
D(λ; σ) = − − 12 σ02 − λ 12 σ12 + 4σ1 + 2 .
1 + λσ1
 √ 
The critical points of Ξ are (2; −1; (0, −2)), (−2; 2; (0, −2)), 6; 0; (0, 14 + 8 3) ,
 √   √ √   √ √ 
6; 0; (0, 14 + 8 3) , −2 3; − 12 3 − 12 ; (0, 2) , 2 3; 12 3 − 12 ; (0, 2) , and so
√ √
1 + λσ 1 ∈ {3, −3, 1, − 3, 3} for (x, λ, σ) critical point of Ξ, whence (λ, σ)
(∈ T ) is critical point of D by Lemma 3. For λ = 0 the corresponding x (= 6)
is not in Xi ⊃ Xe ; in particular, (x, λ) is not a critical point of L. For λ = 0,
Proposition √ 2 says that (x, λ) is a critical point of L; in particular x ∈ Xe . For
λ ∈ {2, − 12 3 − 12 }, 1 + λσ 1 < 0, and so Proposition 2 says nothing about

the optimality of x or (λ, σ); in fact, for λ = − 12 3 − 12 , the corresponding
√ √
x (= −2 3) is the global maximizer of f on Xe . For λ := 12 3 − 12 > 0,
√ √
1 + λσ 1 = 3 > 0, and so Proposition √ 2 says that
 x = 2 3 (∈ Xe ) is the global
solution of (Pi ), and (λ, σ) = 12 3 − 12 ; (0, 2) is a global maximizer of D on
+
Tcol = T + = {(λ, σ) ∈ R+ × R2 | 1 + λσ1 > 0}. For λ = −1, 1 + λσ 1 = 3 > 0,
1
The reference “[7]” is “Gao, D.Y.: Nonconvex semi-linear problems and canonical
duality solutions, in Advances in Mechanics and Mathematics II. In: Gao, D.Y.,
Ogden R.W. (eds.), pp. 261–311. Kluwer Academic Publishers (2003)”.
162 C. Zălinescu

but (λ, σ) is not a local extremum of D, as easily seen taking σ0 := 0, (λ, σ1 ) :=


(t − 1, t − 2) with |t| sufficiently small.

When Q = 0, m problem (PJ ) reduces to the quadratic problem with equality


and inequality quadratic constraints considered in [2, (PJ )], which is denoted
here by (PJq ) and whose Lagrangian and dual function are denoted by Lq and
Dq , respectively. Of course, in this case X = X0 = Rn , and so
m
Ξ(x, λ, σ) = Lq (x, λ) − 12 λk σk2 (x ∈ Rn , λ ∈ Rm , σ ∈ R × Rm )
k=0

with λ0 := 1. It follows that

∇x Ξ(x, λ, σ) = ∇x Lq(x, λ),   ∇σ Ξ(x, λ, σ) = −(λk σk)_{k∈{0,...,m}},

∇λ Ξ(x, λ, σ) = ∇λ Lq(x, λ) − ½(σj²)_{j∈{1,...,m}} = (qj(x) − ½σj²)_{j∈{1,...,m}}.

Moreover, G(λ, σ) = A(λ), F(λ, σ) = b(λ), E(λ, σ) = c(λ), and so T = Y × R^{1+m}, Tcol = Ycol × R^{1+m}, D(λ, σ) = Dq(λ) − ½ Σ_{k=0}^{m} λk σk², where A(λ), b(λ), c(λ), Y, Ycol, Dq (denoted there by D) are introduced in [2]. Applying Proposition 3 to this case we get the next result, which is much weaker than [2, Prop. 9].

Corollary 2. Let (x, λ) ∈ R^n × R^m be a J-LKKT point of L.

(i) Then λ ∈ Y^J_col := Ycol ∩ ΓJ, ⟨λ, ∇λ Lq(x, λ)⟩ = 0, and q0(x) = Lq(x, λ) = Dq(λ).
(ii) Assume that M_=(λ) = {1, . . . , m}. Then ∇λ Lq(x, λ) = 0, and so (x, λ) is a critical point of Lq, and x ∈ Xe ⊂ XJ ⊂ Xi.
(iii) Assume that λ ∈ R^m_{++} and A(λ) ⪰ 0. Then x ∈ Xe, λ ∈ Y^+_col and

q0(x) = inf_{Xi} q0 = Lq(x, λ) = sup_{Y^{i+}_col} Dq = Dq(λ);

moreover, if A(λ) ≻ 0 then λ ∈ Y^{i+} and x is the unique global solution of problem (Pi).

However, applying Proposition 2 we get assertion (i) and the last part of assertion (ii) of [2, Prop. 9].
We remark that in all papers of D.Y. Gao on constrained optimization problems in which CDT is used there is a result stating the "complementary-dual principle", and at least one result stating a min-max duality formula. However, we did not find a convincing proof of that min-max duality formula in these papers. We mention below some results which are not true.
The problem considered by Gao, Ruan and Sherali in [5] is of type (Pi). Theorem 2 (Global Optimality Condition) of [5] is false because, under the mentioned conditions, x is not necessarily in Xi, as Example 1 shows. Also Theorem 1 (Complementary-Dual Principle) and Theorem 3 (Triality Theory) are false because (λ, σ) = (0; (0, 14 + 8√3)) is a critical point of D (by Lemma 3), while the assertion "x is a KKT point of (P)" is not true because x = 6 ∉ Xi. It is
shown in [6, Ex. 6] that the "double-min or double-max" duality of Theorem 3, that is, its assertion in the case G(λ, σ) ≺ 0, is also false.
The problem considered by Latorre and Gao in [7] is of type (PJ) in which the Vk are "differentiable canonical functions". As in the case of [5, Th. 1], Example 1 shows that Theorem 1 is false. Even without assuming that S^+_a is convex in Theorem 2, this theorem is false for the same reason.
The results established in Sect. 3 of Ruan and Gao's paper [4] refer to (Pi) in which qk = 0, the Λk are Gâteaux differentiable on their domains and the Vk are "canonical functions" for k ∈ {0, . . . , m}. As in the case of [5, Th. 1], Example 1 shows that Theorem 3 is false.
The problem considered in Morales-Silva and Gao's paper [8] refers to (Pe) in which m = 2 and Q = {1}. The equality max_{(λ,μ,ς)∈S^+_a} Π^d(λ, μ, ς) = Π^d(λ̄, μ̄, ς̄) from [8, Eq. (20)] is not true. To see this, consider n := 1, A := 1, r := α := η := c := 1 and f := (9/8)√2; this is the particular case γ := (9/8)√2 of the problem (P) considered in [9].

References
1. Rockafellar, R. T.: Convex Analysis. Princeton University Press, Princeton, N.J.
(1972)
2. Zălinescu, C.: On quadratic optimization problems and canonical duality theory.
arXiv:1809.09032 (2018)
3. Zălinescu, C.: On unconstrained optimization problems solved using CDT and tri-
ality theory. arXiv:1810.09009 (2018)
4. Ruan, N., Gao, D.Y.: Canonical duality theory for solving nonconvex/discrete con-
strained global optimization problems. In: Gao, D.Y., Latorre, V., Ruan, N. (eds.)
Canonical Duality Theory. Advances in Mechanics and Mathematics, vol. 37, pp.
187–201. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58017-3_9
5. Gao, D.Y., Ruan, N., Sherali, H.: Solutions and optimality criteria for nonconvex
constrained global optimization problems with connections between canonical and
Lagrangian duality. J. Global Optim. 45, 473–497 (2009)
6. Voisei, M.-D., Zălinescu, C.: Counterexamples to some triality and tri-duality
results. J. Global Optim. 49, 173–183 (2011)
7. Latorre, V., Gao, D.Y.: Canonical duality for solving general nonconvex constrained
problems. Optim. Lett. 10, 1763–1779 (2016)
8. Morales-Silva, D., Gao, D.Y.: On minimal distance between two surfaces. In: Gao,
D.Y., Latorre, V., Ruan, N. (eds.) Canonical Duality Theory. Advances in Mechanics
and Mathematics, vol. 37, pp. 359–371. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58017-3_18
9. Voisei, M.-D., Zălinescu, C.: A counter-example to ‘minimal distance between two
non-convex surfaces’. Optimization 60, 593–602 (2011)
On Controlled Variational Inequalities
Involving Convex Functionals

Savin Treanţă

Department of Applied Mathematics, University Politehnica of Bucharest,


Bucharest, Romania
savin treanta@yahoo.com

Abstract. In this paper, by using several variational techniques and a dual gap-type functional, we study weak sharp solutions associated with a controlled variational inequality governed by a convex path-independent curvilinear integral functional. Also, under some hypotheses, we establish an equivalence between the minimum principle sufficiency property and weak sharpness for the solution set of the considered controlled variational inequality.

Keywords: Controlled variational inequality · Weak sharp solution ·


Convex path-independent curvilinear integral functional

1 Introduction
Convexity theory is an important foundation for studying a wide class of unrelated problems in a unified and general framework. Based on the notion of a unique sharp minimizer introduced by Polyak [11], and taking into account the works of Burke and Ferris [2] and Patriksson [10], variational inequalities have, following Marcotte and Zhu [8], been intensively investigated by using the concept of a weak sharp solution. We mention, in this respect, the works of Wu and Wu [15], Oveisiha and Zafarani [9], Alshahrani et al. [1], Liu and Wu [6] and Zhu [16].
In this paper, motivated and inspired by the ongoing research in this area, we introduce and investigate a new class of scalar variational inequalities. More precisely, by using several variational techniques presented in Clarke [3], Treanţă [12,13] and Treanţă and Arana-Jiménez [14], we develop a new mathematical framework on controlled continuous-time variational inequalities governed by convex path-independent curvilinear integral functionals and, under some conditions and using a dual gap-type functional, we provide some characterization results for the associated solution set. As is well known, functionals of mechanical work type, due to their physical meaning, are very important in applications. Thus, the relevance of this paper is supported by both theoretical and practical considerations. The ideas and techniques of this paper may also stimulate further research in this dynamic field.
Supported by University Politehnica of Bucharest, Bucharest, Romania (Grant No.
MA51-18-01).

2 Notations, Working Hypotheses and Problem Formulation
In this paper, we will consider the following notations and working hypotheses:
• two finite dimensional Euclidean spaces, R^n and R^k;
• Θ ⊂ R^m is a compact domain in R^m and the point Θ ∋ t = (t^β), β = 1, m, is a multi-parameter of evolution;
• for t1 = (t1^1, . . . , t1^m), t2 = (t2^1, . . . , t2^m) two different points in Θ, let Θ ⊃ Υ : t = t(θ), θ ∈ [a, b] (or t ∈ [t1, t2]) be a piecewise smooth curve joining the points t1 and t2 in Θ;
• for U ⊆ R^k, P := Θ × R^n × U and i = 1, n, β = 1, m, ς = 1, q, we define the following continuously differentiable functions
V = (Vβ^i) : P → R^{nm},   W = (Wς) : P → R^q;
• for xβ := ∂x/∂t^β, β = 1, m, let X be the space of piecewise smooth state functions x : Θ → R^n with the norm
‖x‖ = ‖x‖∞ + Σ_{β=1}^{m} ‖xβ‖∞, ∀x ∈ X;
• also, denote by U the space of piecewise continuous control functions u : Θ → U, endowed with the uniform norm ‖·‖∞;
• consider X × U equipped with the Euclidean inner product
⟨(x, u); (y, w)⟩ = ∫_Υ [x(t) · y(t) + u(t) · w(t)] dt^β, ∀(x, u), (y, w) ∈ X × U,
and the induced norm;
• denote by X × U a nonempty, closed and convex subset of X × U, defined as
X × U = {(x, u) ∈ X × U : ∂x^i/∂t^β = Vβ^i(t, x, u), W(t, x, u) ≤ 0, x(t1) = x1, x(t2) = x2},
where x, u are the simplified notations for x(t), u(t), and x1 and x2 are given;
• assume the continuously differentiable functions Vβ = (Vβ^i), i = 1, n, β = 1, m, satisfy the following conditions of complete integrability
Dζ Vβ^i = Dβ Vζ^i,   β, ζ = 1, m, β ≠ ζ, i = 1, n,
where Dζ denotes the total derivative operator;
• for any two q-tuples a = (a1, . . . , aq), b = (b1, . . . , bq) in R^q, the following rules
a = b ⇔ aς = bς,   a ≤ b ⇔ aς ≤ bς,   a < b ⇔ aς < bς,   a ≨ b ⇔ a ≤ b, a ≠ b,   ς = 1, q,
are assumed.
Note. Further, in this paper, summation over repeated indices is assumed.
In the following, J¹(R^m, R^n) denotes the first-order jet bundle associated to R^m and R^n. For β = 1, m, we consider the real-valued continuously differentiable functions (closed Lagrange 1-form densities) lβ, sβ, rβ : J¹(R^m, R^n) × U → R and, for (x, u) ∈ X × U, define the following path-independent curvilinear integral functionals:
L, S, R : X × U → R,
L(x, u) = ∫_Υ lβ(t, x, xϑ, u) dt^β,   S(x, u) = ∫_Υ sβ(t, x, xϑ, u) dt^β,   R(x, u) = ∫_Υ rβ(t, x, xϑ, u) dt^β.

Definition 2.1 The scalar functional L(x, u) is called convex on X × U if, for any (x, u), (x0, u0) ∈ X × U, the following inequality
L(x, u) − L(x0, u0) ≥ ∫_Υ [(∂lβ/∂x)(t, x0, x0ϑ, u0)(x − x0) + (∂lβ/∂xϑ)(t, x0, x0ϑ, u0) Dϑ(x − x0)] dt^β + ∫_Υ [(∂lβ/∂u)(t, x0, x0ϑ, u0)(u − u0)] dt^β
is satisfied.
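For instance (an illustrative special case supplied here, not taken from the original text): let m = n = k = 1 and lβ(t, x, xϑ, u) = ½(x² + u²), so that ∂lβ/∂x = x, ∂lβ/∂xϑ = 0 and ∂lβ/∂u = u. Then, for any (x, u), (x0, u0),
L(x, u) − L(x0, u0) = ∫_Υ ½[(x² − x0²) + (u² − u0²)] dt ≥ ∫_Υ [x0(x − x0) + u0(u − u0)] dt,
since a² − b² − 2b(a − b) = (a − b)² ≥ 0 for all real a, b; hence this functional is convex in the sense of Definition 2.1.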

Definition 2.2 For β = 1, m, the variational (functional) derivative δβL(x, u) of the scalar functional L : X × U → R, L(x, u) = ∫_Υ lβ(t, x, xϑ, u) dt^β, is defined as
δβL(x, u) = δβL/δx + δβL/δu,
with
δβL/δx = (∂lβ/∂x)(t, x, xϑ, u) − Dϑ (∂lβ/∂xϑ)(t, x, xϑ, u) ∈ X,   δβL/δu = (∂lβ/∂u)(t, x, xϑ, u) ∈ U,
and, for (ψ, Ψ) ∈ X × U with ψ(t1) = ψ(t2) = 0, the following relation
⟨(δβL/δx, δβL/δu); (ψ, Ψ)⟩ = ∫_Υ [(δβL/δx)(t) · ψ(t) + (δβL/δu)(t) · Ψ(t)] dt^β = lim_{ε→0} [L(x + εψ, u + εΨ) − L(x, u)] / ε
is satisfied.
Working assumptions. (i) In this work, it is assumed that the inner product between the variational derivative of a scalar functional and an element (ψ, Ψ) in X × U is accompanied by the condition ψ(t1) = ψ(t2) = 0.
(ii) Assume that
dU := (∂lβ/∂xϑ) Dϑ(x − x0) dt^β
is an exact total differential and satisfies U(t1) = U(t2).
At this point, we have the necessary mathematical tools to formulate the following controlled variational inequality problem: find (x0, u0) ∈ X × U such that
(CVIP)   ∫_Υ [(∂lβ/∂x)(t, x0, x0ϑ, u0)(x − x0) + (∂lβ/∂xϑ)(t, x0, x0ϑ, u0) Dϑ(x − x0)] dt^β + ∫_Υ [(∂lβ/∂u)(t, x0, x0ϑ, u0)(u − u0)] dt^β ≥ 0,
Υ ∂u
for any (x, u) ∈ X × U. The dual controlled variational inequality problem asso-
ciated to (CV IP ) is formulated as follows: find (x0 , u0 ) ∈ X × U such that
  
∂lβ ∂lβ
(DCV IP ) (t, x, xϑ , u) (x − x ) +
0
(t, x, xϑ , u) Dϑ (x − x ) dtβ
0
Υ ∂x ∂xϑ
  
∂lβ
+ (t, x, xϑ , u) (u − u ) dtβ ≥ 0,
0
Υ ∂u
for any (x, u) ∈ X × U.
Denote by (X × U)∗ and (X × U)∗ the solution set associated to (CV IP )
and (DCV IP ), respectively, and assume they are nonempty.
Remark 2.1 As can be easily seen (see (ii) in Working assumptions), we can reformulate the above controlled variational inequality problems as follows: find (x0, u0) ∈ X × U such that
(CVIP)   ⟨(δβL/δx0, δβL/δu0); (x − x0, u − u0)⟩ ≥ 0, ∀(x, u) ∈ X × U,
respectively: find (x0, u0) ∈ X × U such that
(DCVIP)   ⟨(δβL/δx, δβL/δu); (x − x0, u − u0)⟩ ≥ 0, ∀(x, u) ∈ X × U.
In the following, in order to describe the solution set (X × U)* associated to (CVIP), we introduce the following gap-type path-independent curvilinear integral functionals.
Definition 2.3 For (x, u) ∈ X × U, the primal gap-type path-independent curvilinear integral functional associated to (CVIP) is defined as
S(x, u) = max_{(x0,u0)∈X×U} { ∫_Υ [(∂lβ/∂x)(t, x, xϑ, u)(x − x0)] dt^β + ∫_Υ [(∂lβ/∂xϑ)(t, x, xϑ, u) Dϑ(x − x0) + (∂lβ/∂u)(t, x, xϑ, u)(u − u0)] dt^β },
and the dual gap-type path-independent curvilinear integral functional associated to (CVIP) is defined as follows
R(x, u) = max_{(x0,u0)∈X×U} { ∫_Υ [(∂lβ/∂x)(t, x0, x0ϑ, u0)(x − x0)] dt^β + ∫_Υ [(∂lβ/∂xϑ)(t, x0, x0ϑ, u0) Dϑ(x − x0) + (∂lβ/∂u)(t, x0, x0ϑ, u0)(u − u0)] dt^β }.
For (x, u) ∈ X × U, we introduce the following notations:
A(x, u) := {(z, ν) ∈ X × U : S(x, u) = ∫_Υ [(∂lβ/∂x)(t, x, xϑ, u)(x − z)] dt^β + ∫_Υ [(∂lβ/∂xϑ)(t, x, xϑ, u) Dϑ(x − z) + (∂lβ/∂u)(t, x, xϑ, u)(u − ν)] dt^β },
Z(x, u) := {(z, ν) ∈ X × U : R(x, u) = ∫_Υ [(∂lβ/∂x)(t, z, zϑ, ν)(x − z)] dt^β + ∫_Υ [(∂lβ/∂xϑ)(t, z, zϑ, ν) Dϑ(x − z) + (∂lβ/∂u)(t, z, zϑ, ν)(u − ν)] dt^β }.
In the following, in accordance with Marcotte and Zhu [8], we introduce some central definitions.
Definition 2.4 The polar set (X × U)° associated to X × U is defined as
(X × U)° = {(y, w) ∈ X × U : ⟨(y, w); (x, u)⟩ ≤ 0, ∀(x, u) ∈ X × U}.
Definition 2.5 The normal cone to X × U at (x, u) ∈ X × U is defined as
N_{X×U}(x, u) = {(y, w) ∈ X × U : ⟨(y, w); (z, ν) − (x, u)⟩ ≤ 0, ∀(z, ν) ∈ X × U},   (x, u) ∈ X × U,
N_{X×U}(x, u) = ∅,   (x, u) ∉ X × U,
and the tangent cone to X × U at (x, u) ∈ X × U is T_{X×U}(x, u) = [N_{X×U}(x, u)]°.
Remark 2.2 By using the definition of the normal cone at (x, u) ∈ X × U, we observe the following: (x*, u*) ∈ (X × U)* ⟺ (−δβL/δx*, −δβL/δu*) ∈ N_{X×U}(x*, u*).
3 Preliminary Results
In this section, in order to formulate and prove the main results of the paper, several auxiliary propositions are established.
Proposition 3.1 Let the path-independent curvilinear integral functional L(x, u) be convex on X × U. Then:
(i) the equality
∫_Υ [(∂lβ/∂x)(t, x2, x2ϑ, u2)(x1 − x2) + (∂lβ/∂xϑ)(t, x2, x2ϑ, u2) Dϑ(x1 − x2)] dt^β + ∫_Υ [(∂lβ/∂u)(t, x2, x2ϑ, u2)(u1 − u2)] dt^β = 0
is fulfilled, for any (x1, u1), (x2, u2) ∈ (X × U)*;
(ii) (X × U)* ⊂ (X × U)_*.
Remark 3.1 The property of continuity of the variational derivative δβL(x, u) implies (X × U)_* ⊂ (X × U)*. By Proposition 3.1, we conclude (X × U)* = (X × U)_*. Also, the solution set (X × U)_* associated to (DCVIP) is a convex set and, consequently, the solution set (X × U)* associated to (CVIP) is a convex set.
Proposition 3.2 Let the path-independent curvilinear integral functional R(x, u) be differentiable on X × U. Then the following inequality
⟨(δβR/δx, δβR/δu); (v, μ)⟩ ≥ ⟨(δβL/δy, δβL/δw); (v, μ)⟩
is satisfied, for any (x, u), (v, μ) ∈ X × U, (y, w) ∈ Z(x, u).
Proposition 3.3 Let the path-independent curvilinear integral functional R(x, u) be differentiable on (X × U)* and the path-independent curvilinear integral functional L(x, u) be convex on X × U. Also, assume the following implication
⟨(δβR/δx*, δβR/δu*); (v, μ)⟩ ≥ ⟨(δβL/δz, δβL/δν); (v, μ)⟩ ⟹ (δβR/δx*, δβR/δu*) = (δβL/δz, δβL/δν)
is true, for any (x*, u*) ∈ (X × U)*, (v, μ) ∈ X × U, (z, ν) ∈ Z(x*, u*). Then
Z(x*, u*) = (X × U)*, ∀(x*, u*) ∈ (X × U)*.

4 Main Results
In this section, taking into account the preliminary results established in the previous section, we investigate weak sharp solutions for the considered controlled variational inequality governed by a convex path-independent curvilinear integral functional. Concretely, following Marcotte and Zhu [8], in accordance with Ferris and Mangasarian [4], the weak sharpness property of (X × U)* associated to (CVIP) is studied. In this regard, two characterization results are established.
Definition 4.1 The solution set (X × U)* associated to (CVIP) is called weakly sharp if there exists γ > 0 such that
γB ⊂ (δβL/δx*, δβL/δu*) + [T_{X×U}(x*, u*) ∩ N_{(X×U)*}(x*, u*)]°, ∀(x*, u*) ∈ (X × U)*
(here int(Q) denotes the interior of the set Q and B the open unit ball in X × U), or, equivalently,
(−δβL/δx*, −δβL/δu*) ∈ int( ∩_{(x,u)∈(X×U)*} [T_{X×U}(x, u) ∩ N_{(X×U)*}(x, u)]° ),
for all (x*, u*) ∈ (X × U)*.

Lemma 4.1 There exists γ > 0 such that
γB ⊂ (δβL/δy, δβL/δw) + [T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w)]°, ∀(y, w) ∈ (X × U)*   (1)
if and only if
⟨(δβL/δy, δβL/δw); (z, ν)⟩ ≥ γ ‖(z, ν)‖, ∀(z, ν) ∈ T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w).   (2)
The first characterization result of weak sharpness for (X × U)* is formulated in the following theorem.
Theorem 4.1 Let the path-independent curvilinear integral functional R(x, u) be differentiable on (X × U)* and the path-independent curvilinear integral functional L(x, u) be convex on X × U. Also, assume that:
(a) the following implication
⟨(δβR/δx*, δβR/δu*); (v, μ)⟩ ≥ ⟨(δβL/δz, δβL/δν); (v, μ)⟩ ⟹ (δβR/δx*, δβR/δu*) = (δβL/δz, δβL/δν)
is true, for any (x*, u*) ∈ (X × U)*, (v, μ) ∈ X × U, (z, ν) ∈ Z(x*, u*);
(b) (δβL/δx*, δβL/δu*) is constant on (X × U)*.
Then (X × U)* is weakly sharp if and only if there exists γ > 0 such that
R(x, u) ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U,
where d((x, u), (X × U)*) = min_{(y,w)∈(X×U)*} ‖(x, u) − (y, w)‖.

Proof. "⟹" Consider (X × U)* weakly sharp. Therefore, by Definition 4.1, there exists γ > 0 such that (1) (or (2)) is fulfilled. Further, taking into account the convexity of the solution set (X × U)* associated to (CVIP) (see Remark 3.1), it follows that proj_{(X×U)*}(x, u) = (ŷ, ŵ) ∈ (X × U)*, ∀(x, u) ∈ X × U and, following Hiriart-Urruty and Lemaréchal [5], we obtain (x, u) − (ŷ, ŵ) ∈ T_{X×U}(ŷ, ŵ) ∩ N_{(X×U)*}(ŷ, ŵ). By hypothesis and Lemma 4.1, we get
∫_Υ [(∂lβ/∂x)(t, ŷ, ŷϑ, ŵ)(x − ŷ) + (∂lβ/∂xϑ)(t, ŷ, ŷϑ, ŵ) Dϑ(x − ŷ)] dt^β + ∫_Υ [(∂lβ/∂u)(t, ŷ, ŷϑ, ŵ)(u − ŵ)] dt^β ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U.   (3)
Since
R(x, u) ≥ ∫_Υ [(∂lβ/∂x)(t, ŷ, ŷϑ, ŵ)(x − ŷ) + (∂lβ/∂xϑ)(t, ŷ, ŷϑ, ŵ) Dϑ(x − ŷ)] dt^β + ∫_Υ [(∂lβ/∂u)(t, ŷ, ŷϑ, ŵ)(u − ŵ)] dt^β, ∀(x, u) ∈ X × U,
by (3), we obtain R(x, u) ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U.
"⟸" Consider there exists γ > 0 such that R(x, u) ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U. Obviously, for any (y, w) ∈ (X × U)*, the case T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w) = {(0, 0)} involves [T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w)]° = X × U and, consequently, γB ⊂ (δβL/δy, δβL/δw) + [T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w)]°, ∀(y, w) ∈ (X × U)*, is trivial. In the following, let (0, 0) ≠ (x, u) ∈ T_{X×U}(y, w) ∩ N_{(X×U)*}(y, w); then there exists a sequence (xk, uk) converging to (x, u) with (y, w) + tk(xk, uk) ∈ X × U (for some sequence of positive numbers {tk} decreasing to zero), such that
d((y, w) + tk(xk, uk), (X × U)*) ≥ d((y, w) + tk(xk, uk), H_{x,u}) = tk ⟨(x, u); (xk, uk)⟩ / ‖(x, u)‖,   (4)
where H_{x,u} = {(z, ν) ∈ X × U : ⟨(x, u); (z, ν) − (y, w)⟩ = 0} is the hyperplane passing through (y, w) and orthogonal to (x, u). By hypothesis and (4), it results that R((y, w) + tk(xk, uk)) ≥ γ tk ⟨(x, u); (xk, uk)⟩ / ‖(x, u)‖, or, equivalently (since R(y, w) = 0, ∀(y, w) ∈ (X × U)*),
[R((y, w) + tk(xk, uk)) − R(y, w)] / tk ≥ γ ⟨(x, u); (xk, uk)⟩ / ‖(x, u)‖.   (5)
Further, by taking the limit for k → ∞ in (5) and using a classical result of functional analysis, we obtain
lim_{λ→0} [R((y, w) + λ(x, u)) − R(y, w)] / λ ≥ γ ‖(x, u)‖,   (6)
where λ > 0. By Definition 2.2, the inequality (6) becomes
⟨(δβR/δy, δβR/δw); (x, u)⟩ ≥ γ ‖(x, u)‖.   (7)
Now, taking into account the hypothesis and (7), for any (b, υ) ∈ B, it follows that
⟨γ(b, υ) − (δβL/δy, δβL/δw); (x, u)⟩ = ⟨γ(b, υ); (x, u)⟩ − ⟨(δβR/δy, δβR/δw); (x, u)⟩ ≤ γ ‖(x, u)‖ − γ ‖(x, u)‖ = 0,
and the proof is complete. □

The second characterization result of weak sharpness for (X × U)* is based on the notion of minimum principle sufficiency property, introduced by Ferris and Mangasarian [4].
Definition 4.2 The controlled variational inequality (CVIP) satisfies the minimum principle sufficiency property if A(x*, u*) = (X × U)*, ∀(x*, u*) ∈ (X × U)*.
Lemma 4.2 The inclusion arg max_{(y,w)∈X×U} ⟨(x, u); (y, w)⟩ ⊂ (X × U)* is fulfilled for any (x, u) ∈ int( ∩_{(z,ν)∈(X×U)*} [T_{X×U}(z, ν) ∩ N_{(X×U)*}(z, ν)]° ) ≠ ∅.

Theorem 4.2 Let the solution set (X × U)* associated to (CVIP) be weakly sharp and the path-independent curvilinear integral functional L(x, u) be convex on X × U. Then (CVIP) satisfies the minimum principle sufficiency property.
Theorem 4.3 Consider that the functional R(x, u) is differentiable on (X × U)* and the path-independent curvilinear integral functional L(x, u) is convex on X × U. Also, for any (x*, u*) ∈ (X × U)*, (v, μ) ∈ X × U, (z, ν) ∈ Z(x*, u*), assume the implication
⟨(δβR/δx*, δβR/δu*); (v, μ)⟩ ≥ ⟨(δβL/δz, δβL/δν); (v, μ)⟩ ⟹ (δβR/δx*, δβR/δu*) = (δβL/δz, δβL/δν)
is fulfilled and (δβL/δx*, δβL/δu*) is constant on (X × U)*. Then (CVIP) satisfies the minimum principle sufficiency property if and only if (X × U)* is weakly sharp.

Proof. "⟹" Let (CVIP) satisfy the minimum principle sufficiency property. In consequence, A(x*, u*) = (X × U)*, for any (x*, u*) ∈ (X × U)*. Obviously, for (x*, u*) ∈ (X × U)* and (x, u) ∈ X × U, we obtain
R(x, u) ≥ ∫_Υ [(∂lβ/∂x)(t, x*, x*ϑ, u*)(x − x*) + (∂lβ/∂xϑ)(t, x*, x*ϑ, u*) Dϑ(x − x*)] dt^β + ∫_Υ [(∂lβ/∂u)(t, x*, x*ϑ, u*)(u − u*)] dt^β.   (8)
Further, for P(x, u) = ⟨(δβL/δx*, δβL/δu*); (x, u)⟩, (x, u) ∈ X × U, we get that A(x*, u*) is the solution set of min_{(x,u)∈X×U} P(x, u). For other related investigations, the reader is directed to Mangasarian and Meyer [7]. We can write P(x, u) − P(x̃, ũ) ≥ γ d((x, u), A(x*, u*)), ∀(x, u) ∈ X × U, (x̃, ũ) ∈ A(x*, u*), or, ⟨(δβL/δx*, δβL/δu*); (x, u) − (x*, u*)⟩ ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U, or, equivalently,
∫_Υ [(∂lβ/∂x)(t, x*, x*ϑ, u*)(x − x*) + (∂lβ/∂xϑ)(t, x*, x*ϑ, u*) Dϑ(x − x*)] dt^β + ∫_Υ [(∂lβ/∂u)(t, x*, x*ϑ, u*)(u − u*)] dt^β ≥ γ d((x, u), (X × U)*), ∀(x, u) ∈ X × U.   (9)
By (8), (9) and Theorem 4.1, we get that (X × U)* is weakly sharp.
"⟸" This is a consequence of Theorem 4.2. □

References
1. Alshahrani, M., Al-Homidan S., Ansari, Q.H.: Minimum and maximum principle
sufficiency properties for nonsmooth variational inequalities. Optim. Lett. 10, 805–
819 (2016)
2. Burke, J.V., Ferris, M.C.: Weak sharp minima in mathematical programming.
SIAM J. Control Optim. 31, 1340–1359 (1993)
3. Clarke, F.H.: Functional Analysis, Calculus of Variations and Optimal Control.
Springer, London (2013)
4. Ferris, M.C., Mangasarian, O.L.: Minimum principle sufficiency. Math. Program.
57, 1–14 (1992)
5. Hiriart-Urruty, J.-B., Lemaréchal, C.: Fundamentals of Convex Analysis. Springer,
Berlin (2001)
6. Liu, Y., Wu, Z.: Characterization of weakly sharp solutions of a variational inequal-
ity by its primal gap function. Optim. Lett. 10, 563–576 (2016)
7. Mangasarian, O.L., Meyer, R.R.: Nonlinear perturbation of linear programs. SIAM
J. Control Optim. 17, 745–752 (1979)
8. Marcotte, P., Zhu, D.: Weak sharp solutions of variational inequalities. SIAM J.
Optim. 9, 179–189 (1998)
9. Oveisiha, M., Zafarani, J.: Generalized Minty vector variational-like inequalities
and vector optimization problems in Asplund spaces. Optim. Lett. 7, 709–721
(2013)
10. Patriksson, M.: A unified framework of descent algorithms for nonlinear programs
and variational inequalities. Ph.D. thesis, Linköping Institute of Technology (1993)
11. Polyak, B.T.: Introduction to Optimization. Optimization Software. Publications
Division, New York (1987)
12. Treanţă, S.: Multiobjective fractional variational problem on higher-order jet bun-
dles. Commun. Math. Stat. 4, 323–340 (2016)
13. Treanţă, S.: Higher-order Hamilton dynamics and Hamilton-Jacobi divergence
PDE. Comput. Math. Appl. 75, 547–560 (2018)
14. Treanţă, S., Arana-Jiménez, M.: On generalized KT-pseudoinvex control problems


involving multiple integral functionals. Eur. J. Control 43, 39–45 (2018)
15. Wu, Z., Wu, S.Y.: Weak sharp solutions of variational inequalities in Hilbert spaces.
SIAM J. Optim. 14, 1011–1027 (2004)
16. Zhu, S.K.: Weak sharp efficiency in multiobjective optimization. Optim. Lett. 10,
1287–1301 (2016)
On Lagrange Duality for Several Classes
of Nonconvex Optimization Problems

Ewa M. Bednarczuk1,2 and Monika Syga2


1
Systems Research Institute, Polish Academy of Sciences, Newelska 6,
01447 Warsaw, Poland
Ewa.Bednarczuk@ibspan.waw.pl
2
Warsaw University of Technology, Faculty of Mathematics and Information
Science, ul. Koszykowa 75, 00662 Warsaw, Poland
M.Syga@mini.pw.edu.pl

Abstract. We investigate a general framework for studying Lagrange


duality in some classes of nonconvex optimization problems. To this aim
we use an abstract convexity theory, namely Φ-convexity theory, which
provides tools for investigating nonconvex problems in the spirit of con-
vex analysis (via suitably defined subdifferentials and conjugates). We
prove a strong Lagrangian duality theorem for the optimization of Φlsc-convex functions which is based on a minimax theorem for general Φ-convex functions. The class of Φlsc-convex functions contains, among others, prox-regular functions, DC functions, weakly convex functions and para-convex functions. An important ingredient of the study is the regularity condition under which our strong Lagrangian duality theorem holds. This condition appears to be weaker than a number of already known regularity conditions, even for convex problems.

Keywords: Abstract convexity · Φ-convexity · Minimax theorem ·


Lagrangian duality · Nonconvex optimization ·
Weakest constraint qualification condition

1 Introduction

Duality theory is an important tool in global optimization. The analysis of pairs


of dual problems provides a significant additional knowledge about a given opti-
mization problem and allows to construct algorithms for finding its global solu-
tions.
There exist numerous attempts to construct pairs of dual problems in nonconvex optimization, e.g., for DC functions [10], for composite functions [5], and for DC and composite functions [16].
The theory of Φ-convexity provides a general framework for dealing with some classes of nonconvex optimization problems. This theory has been developed by [7, 11, 14] and many others. Φ-convex functions are defined as pointwise suprema of functions from a given class. Such an approach to abstract convexity generalizes
the classical fact that each proper lower semicontinuous convex function is the upper envelope of a certain set of affine functions. In the present paper we use Φ-convexity to investigate duality for a wide class of nonconvex optimization problems.
The aim of this paper is to investigate Lagrangian duality for optimization
problems involving Φlsc -convex functions. The class of Φlsc -convex functions con-
sists of lower semicontinuous functions defined on Hilbert spaces and minorized
by quadratic functions. This class embodies many important classes of functions
appearing in optimization, e.g. prox-regular functions [3], also known as prox-
bounded functions [12], DC (difference of convex) functions [20], weakly convex
functions [21], para-convex functions [13] and lower semicontinuous convex (in
the classical sense) functions.
Our main Lagrangian duality result (Theorem 4) is based on a general minimax theorem for Φ-convex functions proved in [18]. An important ingredient of Theorem 4 is condition (3), which can be viewed as a regularity condition guaranteeing strong duality. Condition (3) appears to be weaker than many already known regularity conditions [4].
The organization of the paper is as follows. In Sect. 2 we recall basic facts on Φ-convexity. In Sect. 3 we define subgradients for a particular class of Φ-convex functions, namely the class of Φlsc-convex functions. In Sect. 4 we formulate a Lagrangian duality theorem in the class of Φlsc-convex functions (Theorem 4). Condition (3) of Theorem 4 is a regularity condition ensuring that Lagrangian duality for optimization problems with Φlsc-convex functions holds. Let us note that Φlsc-convex functions as defined above may take the value +∞, which allows us to also consider indicator functions in our framework.

2 Φ-Convexity

We start with definitions related to abstract convexity. Let X be a set. Let Φ be


a set of real-valued functions ϕ : X → R.
For any f, g : X → R̂ := R ∪ {−∞} ∪ {+∞}

f ≤ g ⇔ f (x) ≤ g(x) ∀x ∈ X.

Let f : X → R̂. The set

supp(f, Φ) := {ϕ ∈ Φ : ϕ ≤ f }

is called the support of f with respect to Φ. We will use the notation supp(f ) if
the class Φ is clear from the context.

Definition 1. ([7, 11, 14]) A function f : X → R̂ is called Φ-convex if

f (x) = sup{ϕ(x) : ϕ ∈ supp(f )} ∀ x ∈ X.


By convention, supp(f) = ∅ if and only if f ≡ −∞. In this paper we always assume that supp(f) ≠ ∅, i.e. we limit our attention to functions f : X → R̄ := R ∪ {+∞}.
We say that a function f : X → R̄ is proper if the effective domain of f is nonempty, i.e.
dom(f) := {x ∈ X : f(x) < +∞} ≠ ∅.
In this paper we consider functions which are Φ-convex with respect to the following class Φlsc:

Φlsc := {ϕ : X → R : ϕ(x) = −a‖x‖² + ⟨ℓ, x⟩ + c, x ∈ X, ℓ ∈ X*, a ≥ 0, c ∈ R},

where X is a Hilbert space.


In the following theorem we recall a characterization of Φlsc-convex functions.

Theorem 1. ([14], Example 6.2) Let X be a Hilbert space. Let f : X → R̄ be lower semicontinuous on X. If supp(f) ≠ ∅, then f is Φlsc-convex.

The class of Φlsc-convex functions is very broad and contains many well-known classes of nonconvex functions appearing in optimization, e.g. prox-regular (also called prox-bounded) functions [3,12], DC functions [15,20], weakly convex functions [21], para-convex functions [6,13].
Let us note that the set of all Φlsc-convex functions defined on a Hilbert space X contains all proper lower semicontinuous and convex (in the classical sense) functions defined on X.
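As an illustration (an example supplied here, not from the original text): on X = R the concave function f(x) = −|x| is not convex, but it is lower semicontinuous and the function ϕ(x) = −x² − 1/4 belongs to Φlsc (with a = 1, ℓ = 0, c = −1/4) and satisfies ϕ ≤ f, because x² − |x| + 1/4 = (|x| − 1/2)² ≥ 0. Hence supp(f) ≠ ∅ and, by Theorem 1, f is Φlsc-convex although it is not convex in the classical sense.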

3 Subgradients

Now we introduce the definition of subgradient of Φlsc -convex function. Such


subgradients were considered in [11,14] but in slightly different form.

Definition 2. An element (a, v) ∈ R+ × X is called a Φlsc-subgradient of a function f : X → R̄ at x̄ if the following inequality holds:

f(x) − f(x̄) ≥ ⟨v, x − x̄⟩ − a‖x‖² + a‖x̄‖², ∀x ∈ X.   (1)

The set of all Φlsc-subgradients of f at x̄ is denoted by ∂lsc f(x̄).

It can be shown that many subgradients which appear in the literature are Φlsc-subgradients. Examples of such subgradients are: proximal subgradients [19], subgradients for DC functions [1], for weakly convex functions [21] and for para-convex functions [13], and the classical subgradient of lower semicontinuous convex functions.
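As a quick illustration (the function and the subgradient below are chosen by us, not taken from the text), one can check inequality (1) numerically: for the weakly convex function f(x) = −x², the pair (a, v) = (2, 2x̄) is a Φlsc-subgradient at every x̄, since the defect in (1) equals (x − x̄)². A small Python/NumPy sketch:

import numpy as np

# Illustrative check of inequality (1) for the weakly convex function f(x) = -x^2.
# Claim: (a, v) = (2, 2*xbar) is a Phi_lsc-subgradient of f at every xbar,
# because the defect below equals (x - xbar)^2 >= 0.
f = lambda x: -x**2

def defect(x, xbar, a, v):
    # f(x) - f(xbar) - ( <v, x - xbar> - a*||x||^2 + a*||xbar||^2 )
    return f(x) - f(xbar) - (v*(x - xbar) - a*x**2 + a*xbar**2)

xs = np.linspace(-5.0, 5.0, 2001)
for xbar in (-2.0, 0.0, 1.5):
    a, v = 2.0, 2.0*xbar
    print(xbar, defect(xs, xbar, a, v).min())   # minimum defect is 0 (up to rounding), at x = xbar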
3.1 Minimax Theorem for Φ-Convex Functions
Our Lagrangian duality theorem for Φlsc-convex functions is based on a general minimax theorem for bifunctions which are Φ-convex with respect to one of the variables. In this section we recall the necessary definitions and the formulation of the general minimax theorem for Φ-convex functions as it appeared in [18].
For a given function a : X × Y → R̄ such that, for every y ∈ Y, the functions a(·, y) are Φ-convex, a sufficient and necessary condition for the minimax equality to hold is the so-called intersection property. The intersection property was investigated in [2, 17, 18].
Let ϕ1, ϕ2 : X → R be functions from the set Φlsc and α ∈ R. We say that the intersection property holds for ϕ1 and ϕ2 on X at the level α if and only if

[ϕ1 < α] ∩ [ϕ2 < α] = ∅,

where [ϕ < α] := {x ∈ X : ϕ(x) < α} is the strict lower level set of a function ϕ : X → R. The general minimax theorem for Φ-convex functions is proved in Theorem 3.3.3 of [18]. In the case Φ = Φlsc, Theorem 3.3.3 of [18] can be rewritten in the following way.

Theorem 2. Let X, Y be Hilbert spaces and let a : X × Y → R̄. Assume that for any y ∈ Y the function a(·, y) : X → R̄ is Φlsc-convex on X and for any x ∈ X the function a(x, ·) : Y → R̄ is concave on Y. The following conditions are equivalent:
(i) for every α ∈ R, α < inf_{x∈X} sup_{y∈Y} a(x, y), there exist y1, y2 ∈ Y and ϕ1 ∈ supp a(·, y1), ϕ2 ∈ supp a(·, y2) such that the intersection property holds for ϕ1 and ϕ2 on X at the level α;
(ii) sup_{y∈Y} inf_{x∈X} a(x, y) = inf_{x∈X} sup_{y∈Y} a(x, y).

To make condition (i) of Theorem 2 operational, we proved a slightly weaker version of Theorem 2. Let β = inf_{x∈X} sup_{y∈Y} a(x, y) < +∞.

Theorem 3. Let X, Y be Hilbert spaces. Let a : X × Y → R̄ be a function such that for any y ∈ Y the function a(·, y) : X → R is Φlsc-convex on X and for any x ∈ X the function a(x, ·) : Y → R is concave on Y. If there exist y1, y2 ∈ Y and x̄ ∈ [a(·, y1) ≥ β] ∩ [a(·, y2) ≥ β] such that

0 ∈ co(∂a(·, y1)(x̄) ∪ ∂a(·, y2)(x̄)),

then
sup_{y∈Y} inf_{x∈X} a(x, y) = inf_{x∈X} sup_{y∈Y} a(x, y)
and there exists ȳ ∈ Y such that inf_{x∈X} a(x, ȳ) ≥ β.

Proof. For the proof see [19].


Remark 1. Let us note that if inf_{x∈X} sup_{y∈Y} a(x, y) = −∞, then the equality sup_{y∈Y} inf_{x∈X} a(x, y) = inf_{x∈X} sup_{y∈Y} a(x, y) holds. If inf_{x∈X} sup_{y∈Y} a(x, y) = +∞, then for the minimax equality to hold we need to assume that the assumption of Theorem 3 holds for every β < +∞.

4 Lagrange Duality for Φ-Convex Functions

Lagrangian duality for functions satisfying some generalized convexity conditions has recently been investigated by many authors, e.g. for prox-regular functions see [9], for DC functions see [8].
The following construction of the Lagrangian for Φlsc-convex functions is based on the general Lagrangian for Φ-convex functions introduced in [7] and investigated in [11, 14].
Let X, Y be Hilbert spaces. Let us consider the optimization problem
Min f(x), x ∈ X,   (P)
where f : X → R ∪ {+∞} is a Φlsc-convex function.
Let p : X × Y → R̄ be a function satisfying
p(x, y0) = f(x),
where y0 ∈ Y. Consider the family of problems
Min p(x, y), x ∈ X.   (Py)
The Lagrangian L : X × R+ × Y* → R̄ is defined as
L(x, a, v) = −a‖y0‖² + ⟨v, y0⟩ − p*_x(a, v),   (2)
where p*_x(a, v) = sup_{y∈Y} {−a‖y‖² + ⟨v, y⟩ − p(x, y)} is the Φ^Y_lsc-conjugate of the function p(x, ·). The problem
inf_{x∈X} sup_{(a,v)∈R+×Y*} L(x, a, v)
is equivalent to (P) if and only if the function p(x, ·) is Φ^Y_lsc-convex on Y for all x ∈ X. Indeed,
sup_{(a,v)∈R+×Y*} L(x, a, v) = sup_{(a,v)∈R+×Y*} {−a‖y0‖² + ⟨v, y0⟩ − p*_x(a, v)} = p**_x(y0) = p(x, y0).
The dual problem to (P) is defined as follows:
sup_{(a,v)∈R+×Y*} inf_{x∈X} L(x, a, v).   (D)
Let β := inf_{x∈X} sup_{(a,v)∈R+×Y*} L(x, a, v). The following theorem is based on the general minimax theorem.
Theorem 4. Let X, Y be Hilbert spaces. Let L : X × R+ × Y* → R̄ be the Lagrangian defined by (2). Assume that for any (a, v) ∈ R+ × Y* the function L(·, a, v) : X → R is Φlsc-convex on X.
If there exist (a1, v1), (a2, v2) ∈ R+ × Y* and x̄ ∈ [L(·, a1, v1) ≥ β] ∩ [L(·, a2, v2) ≥ β] such that

0 ∈ co(∂L(·, a1, v1)(x̄) ∪ ∂L(·, a2, v2)(x̄)),   (3)

then
sup_{(a,v)∈R+×Y*} inf_{x∈X} L(x, a, v) = inf_{x∈X} sup_{(a,v)∈R+×Y*} L(x, a, v)
and there exists (ā, v̄) ∈ R+ × Y* such that inf_{x∈X} L(x, ā, v̄) ≥ β.

Proof. Follows immediately from Theorem 3.

Example 1. Consider the problem (P) with X = R and

f(x) = 1 if x > 0,   −1 if x = 0,   +∞ if x < 0.

Then it is easy to see that β = −1 and

L(x, a, b) = ax² − bx if x > 0,   −1 if x = 0,   +∞ if x < 0,

where a ≥ 0, b, x ∈ R. Let a1 = 1, b1 = 0; then [L(·, a1, b1) ≥ −1] = R. Let a2 = 0, b2 = 0; we have [L(·, a2, b2) ≥ −1] = R. It is easy to see that −1 ∈ ∂Φlsc L(·, a1, b1)(0), so, by Theorem 4, the optimal values of the primal and dual problems are equal.
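The conclusion of Theorem 4 for Example 1 can also be checked by a crude grid computation; the following Python sketch (illustrative only, with grids chosen by us) approximates both optimal values and returns −1 for each:

import numpy as np

# Grid-based sanity check (illustrative) that, for Example 1,
#   sup_{(a,b)} inf_x L(x,a,b) = inf_x sup_{(a,b)} L(x,a,b) = beta = -1.
def L(x, a, b):
    if x > 0.0:
        return a*x**2 - b*x
    if x == 0.0:
        return -1.0
    return np.inf            # +infinity for x < 0

xs = np.concatenate(([0.0], np.linspace(1e-3, 10.0, 200)))   # x < 0 only contributes +infinity
As = np.linspace(0.0, 10.0, 21)
Bs = np.linspace(-10.0, 10.0, 41)

inf_sup = min(max(L(x, a, b) for a in As for b in Bs) for x in xs)
sup_inf = max(min(L(x, a, b) for x in xs) for a in As for b in Bs)
print(inf_sup, sup_inf)      # both equal -1 on this grid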

It can be shown through numerous examples that condition (3) works in some cases where other relatively weak conditions are not satisfied. Moreover, condition (3) is close to being necessary for the conclusion of Theorem 4 to hold.

5 Conclusions

The most important ingredient of our work is condition (3). It would be interesting to derive an equivalent form of condition (3) in order to make it easier to check.
References
1. Bačák, M., Borwein, J.M.: On difference convexity of locally Lipschitz
functions. Optimization 60(8–9), 961–978 (2011). https://doi.org/10.1080/
02331931003770411
2. Bednarczuk, E.M., Syga, M.: On minimax theorems for lower semicontinuous func-
tions in Hilbert spaces. J. Convex Anal. 25(2), 389–402 (2018)
3. Bernard, F., Thibault, L.: Prox-regular functions in Hilbert spaces. J. Math. Anal.
Appl. 303(1), 1–14 (2005). https://doi.org/10.1016/j.jmaa.2004.06.003
4. Boţ, R.I., Csetnek, E.R.: Regularity conditions via generalized interiority notions
in convex optimization: new achievements and their relation to some classical state-
ments. Optimization 61(1), 35–65 (2012). https://doi.org/10.1080/02331934.2010.
505649
5. Boţ, R.I., Wanka, G.: Duality for composed convex functions with applications
in location theory, pp. 1–18. Deutscher Universitätsverlag, Wiesbaden (2003).
https://doi.org/10.1007/978-3-322-81539-2
6. Cannarsa, P., Sinestrari, C.: Semiconcave Functions, Hamilton-Jacobi Equations,
and Optimal Control, vol. 58. Birkhauser Boston, MA (2004)
7. Dolecki, S., Kurcyusz, S.: On φ-convexity in extremal problems. SIAM J. Control
Optim. 16, 277–300 (1978)
8. Harada, R., Kuroiwa, D.: Lagrange-type duality in DC programming. J. Math.
Anal. Appl. 418(1), 415–424 (2014). https://doi.org/10.1016/j.jmaa.2014.04.017
9. Hare, W., Poliquin, R.: The quadratic Sub-Lagrangian of a prox-regular func-
tion. Nonlinear Anal. 47, 1117–1128 (2001). https://doi.org/10.1016/S0362-
546X(01)00251-6
10. Martínez-Legaz, J.E., Volle, M.: Duality in D.C. programming: the case of several
D.C. constraints. J. Math. Anal. Appl. 237(2), 657–671 (1999). https://doi.org/
10.1006/jmaa.1999.6496
11. Pallaschke, D., Rolewicz, S.: Foundations of Mathematical Optimization. Kluwer
Academic (1997)
12. Rockafellar, R., Wets, R.J.B.: Variational Analysis. Springer, Berlin (1998)
13. Rolewicz, S.: Paraconvex analysis. Control Cybern. 34, 951–965 (2005)
14. Rubinov, A.M.: Abstract Convexity and Global Optimization. Kluwer Academic,
Dordrecht (2000)
15. Singer, I.: Duality for D.C. optimization problems, pp. 213–258. Springer, New
York (2006). https://doi.org/10.1007/0-387-28395-1
16. Sun, X., Long, X.J., Li, M.: Some characterizations of duality for DC optimization
with composite functions. Optimization 66(9), 1425–1443 (2017). https://doi.org/
10.1080/02331934.2017.1338289
17. Syga, M.: Minimax theorems for φ-convex functions: sufficient and necessary con-
ditions. Optimization 65(3), 635–649 (2016). https://doi.org/10.1080/02331934.
2015.1062010
18. Syga, M.: Minimax theorems for extended real-valued abstract convex-concave
functions. J. Optim. Theory Appl. 176(2), 306–318 (2018). https://doi.org/10.
1007/s10957-017-1210-4
19. Syga, M.: Minimax theorems via abstract subdifferential. preprint (2019)
20. Tuy, H.: D.C. Optimization: theory, methods and algorithms (1995). https://doi.
org/10.1007/978-1-4615-2025-2
21. Vial, J.P.: Strong and weak convexity of sets and functions. Math. Oper. Res. 8(2),
231–259 (1983). https://doi.org/10.1287/moor.8.2.231
On Monotone Maps: Semidifferentiable
Case

Shashi Kant Mishra, Sanjeev Kumar Singh, and Avanish Shahi

Department of Mathematics, Institute of Science, Banaras Hindu University,


Varanasi 221005, India
bhu.skmishra@gmail.com, sksingh20894@gmail.com, avanishshahi123@gmail.com

Abstract. In this paper, we define the concepts of monotonicity and generalized monotonicity for semidifferentiable maps. Further, we present characterizations of convexity and generalized convexity in the case of semidifferentiable functions. These results rely on the general mean-value theorem for semidifferentiable functions (J. Glob. Optim. 46:503–508, 2010).

Keywords: Generalized convexity · Generalized monotonicity ·


First-order conditions · Semidifferentials

1 Introduction

Karamardian and Schaible [3] discussed the concepts of monotone and generalized monotone maps. In that paper, Karamardian and Schaible [3] established the relationships between convex/generalized convex functions and the monotonicity/generalized monotonicity of their gradient maps. The theory of generalized monotone maps plays an important role in variational inequalities [5], complementarity problems [1], and equilibrium problems [15].
On the other hand, the notion of derivative can be extended to non-differentiable functions via subdifferentials [2] and semidifferentials [6], for single-valued as well as set-valued maps [8]. Kaul and Kaur [6] introduced the concept of semidifferentials and discussed the properties of locally star-shaped sets and semilocally generalized convex functions with the help of semidifferentials. The oldest traces of Hadamard semidifferentiability can already be found in two articles, by Durdil [12] and Penot [13]. Further, Delfour and Zolésio [14] gave a rather complete treatment of Hadamard semidifferentials in infinite dimensions, where it is the natural tool for shape optimization. In the study of non-differentiable functions, the concept of semidifferentials is more important than that of subdifferentials [7]. In particular, convex continuous functions can be characterized by semidifferentiable functions, so the solution of a minimization problem with a convex continuous objective function over a convex set can be found with the help of semidifferentials [7]. For obtaining the necessary optimality conditions of an optimization problem with a non-convex objective

function, the weaker assumptions of lower or upper semidifferentials can be used


[7].
The characterizations of semidifferentiable convex functions were not possible
until 2010, due to the fact that first of all the Mean-Value Theorem for semidif-
ferentiable functions was given by Castellani and Pappalardo [11] in 2010.
Motivated by the work of Karamardian and Schaible [3], Penot and Quang [8],
Komlósi [9] and Kaul and Kaur [6], we establish the first order characterizations
of convexity/generalized convexity for semidifferentiable functions. Further, we
give examples in support of our results.

2 Preliminaries
2.1 Semidifferentials
Definition 2.1. [6] Let f be a numerical function defined on a set U ⊆ R^n; then the semidifferential of f at y in the direction of x − y is denoted by (df)+ and defined by
(df)+(y, x − y) = lim_{λ→0+} [f(y + λ(x − y)) − f(y)] / λ,
provided the limit exists.
Definition 2.2. [7] Let f be a numerical function defined on a set U ⊆ R^n; then the Hadamard semidifferential of f at y in the direction of u is denoted by df and defined by
(df)(y, u) = lim_{λ→0+, w→u} [f(y + λw) − f(y)] / λ,
provided the limit exists.
Remark 2.1. Hadamard semidifferentials coincide with the semidifferentials if the direction is u = x − y.
Theorem 2.1. [7] (Mean-Value Theorem for Semidifferentials) Let f : R^n → R be a function and x, u ∈ R^n. Let the function t ↦ s(t) = f(x + tu) be continuous on [0, 1] and differentiable on (0, 1); then ∃ λ ∈ (0, 1) such that
f(x + u) = f(x) + (df)+(x + λu, u).

2.2 Convexity in the Semidifferentiable Case

Theorem 2.2. [6] Let f be a semidifferentiable function on a non-empty set U ⊆ R^n. Then, f is convex on U if and only if
f(x) − f(y) ≥ (df)+(y, x − y), ∀x, y ∈ U.
Theorem 2.3. [7] Let f : R^n → R be convex on a convex neighborhood U of a point x ∈ R^n. Then (df)+(x, u) exists in every direction u ∈ R^n and
(df)+(x, u) + (df)+(x, −u) ≥ 0.
Corollary 2.1. [7] If f : Rn → R is convex in Rn , then for all u ∈ Rn and for


all x ∈ Rn ,
f (x) − f (x − u) ≤ −(df )+ (x, −u) ≤ (df )+ (x, u) ≤ f (x + u) − f (x).

Theorem 2.4. [7] Let f : U → R be convex function on an open and convex


subset U ⊆ Rn . Then, for each x ∈ U, the function

u → (df )+ (x, u) : Rn → R

is positively homogeneous, convex, and subadditive, i.e.

∀u, v ∈ Rn , (df )+ (x, u + v) ≤ (df )+ (x, u) + (df )+ (x, v).

2.3 Pseudoconvexity and Quasiconvexity


Definition 2.3. [6] A semidifferentiable function f on an open convex subset
U of Rn is said to be pseudoconvex on U if, for every pair of distinct points
x, y ∈ U, we have

(df )+ (y, x − y) ≥ 0 =⇒ f (x) ≥ f (y).

Definition 2.4. [10] A function f is quasiconvex on a convex set U of Rn if, for


all x, y ∈ U , λ ∈ [0, 1], we have

f (x) ≤ f (y) ⇒ f (λx + (1 − λ)y) ≤ f (y).

Proposition 2.1. [6] A semidifferentiable function f is quasiconvex on an open


convex set U of Rn if, for every pair of distinct points x, y ∈ U , we have

f (x) ≤ f (y) =⇒ (df )+ (y, x − y) ≤ 0.

Remark 2.2. [10] Every pseudoconvex function is quasiconvex but converse is


not necessarily true.

2.4 Monotonicity and Generalized Monotonicity

Definition 2.5. [4] Let U be a subset of R^n and F be a map from U into R^n. F is monotone on U if, for every pair of distinct points x, y ∈ U, we have
⟨F(x) − F(y), x − y⟩ ≥ 0.
Remark 2.3. Monotonicity of the semidifferential map of f is implied by
(df)+(x, x − y) − (df)+(y, x − y) ≥ 0, ∀x, y ∈ U.
Definition 2.6. [1] Let U be a non-empty subset of R^n. A map F : U → R^n is said to be pseudomonotone if ∀x, y ∈ U, x ≠ y, we have
⟨F(y), x − y⟩ ≥ 0 ⇒ ⟨F(x), x − y⟩ ≥ 0.
Remark 2.4. Pseudomonotonicity of the semidifferential map of f is implied by
(df)+(y, x − y) ≥ 0 ⟹ (df)+(x, x − y) ≥ 0, ∀x, y ∈ U.
Proposition 2.2. [3] A map F : U → R^n is pseudomonotone on U if and only if ∀x ≠ y ∈ U, we have
⟨F(y), x − y⟩ > 0 ⇒ ⟨F(x), x − y⟩ > 0.
Definition 2.7. [3] A map F from U into R^n is quasimonotone on an open convex subset U of R^n if, for every pair of distinct points x, y ∈ U, we have
⟨F(y), x − y⟩ > 0 ⇒ ⟨F(x), x − y⟩ ≥ 0.
Remark 2.5. Quasimonotonicity of the semidifferential map of f is implied by
(df)+(y, x − y) > 0 ⟹ (df)+(x, x − y) ≥ 0, ∀x, y ∈ U.

3 Main Results
Theorem 3.1. Let U be a non-empty open and convex subset of Rn . Then, a
semidifferentiable function f : U → R is convex if and only if (df )+ (., x − y) is
monotone on U .

Proof. Suppose the semidifferentiable function f is convex on U; then
f(x) − f(y) ≥ (df)+(y, x − y), ∀x, y ∈ U.   (1)

Interchanging the role of x and y, we get

f (y) − f (x) ≥ (df )+ (x, y − x), ∀x, y ∈ U. (2)

Adding inequalities (1) and (2), we get
0 ≥ (df)+(y, x − y) + (df)+(x, y − x), ∀x, y ∈ U.
Since, by Theorem 2.3, (df)+(x, y − x) ≥ −(df)+(x, x − y), this gives
0 ≥ (df)+(y, x − y) − (df)+(x, x − y), ∀x, y ∈ U,
i.e.
(df)+(x, x − y) − (df)+(y, x − y) ≥ 0, ∀x, y ∈ U.
Therefore, (df )+ (., x − y) is monotone on U.
Conversely, suppose (df )+ (., x − y) is monotone on U, i.e.

(df )+ (x, x − y) − (df )+ (y, x − y) ≥ 0, ∀x, y ∈ U. (3)

By the Mean-Value Theorem, ∃ z = λx + (1 − λ)y for some λ ∈ (0, 1), such that
f(x) − f(y) = (df)+(z, x − y) = (1/λ)(df)+(z, z − y).   (4)
Since U is convex, z, y ∈ U; then by (3), we have
(1/λ)(df)+(z, z − y) ≥ (1/λ)(df)+(y, z − y).   (5)
From (4) and (5), we have
f(x) − f(y) ≥ (df)+(y, x − y), ∀x, y ∈ U.
Therefore, f is convex on U. □

Example 3.1. Let f : R → R be defined by
f(x) = max{eˣ, e⁻ˣ}, x ∈ R.
Here, f is not differentiable but is semidifferentiable and convex; the semidifferential of f at y ∈ R in the direction of x − y is
(df)+(y, x − y) = e⁻ʸ(y − x) for y < 0,   |x| for y = 0,   eʸ(x − y) for y > 0.
It is easy to see that (df)+(·, x − y) is a monotone map (Fig. 1).

It is easy to see that (df )+ (., x − y) is monotone map (Fig. 1).

Fig. 1. Semidifferentiable function f
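A small numerical sanity check (an illustrative Python sketch, not part of the original example) approximates the semidifferential by a one-sided difference quotient and tests the monotonicity inequality of Remark 2.3 on random pairs:

import numpy as np

# Approximate (df)+ for f(x) = max(e^x, e^{-x}) by a one-sided difference quotient
# and test (df)+(x, x-y) - (df)+(y, x-y) >= 0 on random pairs (x, y).
f = lambda t: np.maximum(np.exp(t), np.exp(-t))

def df_plus(y, d, lam=1e-7):
    return (f(y + lam*d) - f(y)) / lam

rng = np.random.default_rng(0)
worst = float("inf")
for _ in range(10000):
    x, y = rng.uniform(-3.0, 3.0, size=2)
    d = x - y
    worst = min(worst, df_plus(x, d) - df_plus(y, d))
print("smallest observed difference:", worst)   # nonnegative up to small finite-difference error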

Theorem 3.2. Let U be a non-empty open and convex subset of Rn . Then, a


semidifferentiable function f : U → R is pseudoconvex if and only if (df )+ (., x −
y) is pseudomonotone on U .

Proof. Suppose that the semidifferentiable function f is pseudoconvex on U; then
(df)+(y, x − y) ≥ 0 ⟹ f(x) ≥ f(y), ∀x ≠ y ∈ U.   (6)


We have to show that
(df)+(x, x − y) ≥ 0.   (7)
Suppose that (7) does not hold, i.e. (df)+(x, x − y) < 0,
i.e. (df)+(x, y − x) > 0 ⟹ f(y) ≥ f(x), which contradicts (6).
Hence, (df)+(y, x − y) ≥ 0 ⟹ (df)+(x, x − y) ≥ 0.
Therefore, (df)+(·, x − y) is pseudomonotone on U.
Conversely, suppose that (df)+(·, x − y) is pseudomonotone on U, i.e.
(df)+(y, x − y) ≥ 0 ⟹ (df)+(x, x − y) ≥ 0, ∀x ≠ y ∈ U.   (8)
We have to show that f(x) ≥ f(y). Suppose, on the contrary, that f(x) < f(y).
By the Mean-Value Theorem ∃ z = λx + (1 − λ)y for some λ ∈ (0, 1), such that
0 > f(x) − f(y) = (df)+(z, x − y),   (9)
i.e. (df)+(z, x − y) < 0,
⟹ (df)+(z, (z − y)/λ) < 0,
(df)+(z, z − y) < 0, ∀y ≠ z ∈ U.
By Proposition 2.2, we have
(df)+(y, z − y) < 0,
i.e. (df)+(y, λ(x − y)) < 0, ∀x ≠ y ∈ U, for some λ ∈ (0, 1),
i.e. (df)+(y, x − y) < 0 (∵ λ > 0),
which contradicts the assumption of pseudomonotonicity of (df)+(·, x − y). Thus, f is pseudoconvex on U. □

Example 3.2. Let f : U = [−π, π] → R be defined by
f(x) = x + sin x.
Here, f is semidifferentiable and pseudoconvex on U = [−π, π]. The semidifferential of f is given by (df)+(y, x − y) = (x − y)(1 + cos y), which is pseudomonotone on U.
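The pseudomonotonicity of this map can be tested numerically on points sampled from the interior of U; the following illustrative Python sketch (ours) checks the implication of Remark 2.4:

import numpy as np

# Illustrative check for Example 3.2: the map (y, d) -> (1 + cos y) d, d = x - y,
# is tested for the pseudomonotone implication on interior points of U = [-pi, pi].
F = lambda y, d: (1.0 + np.cos(y)) * d

rng = np.random.default_rng(1)
violations = 0
for _ in range(100000):
    x, y = rng.uniform(-np.pi + 1e-6, np.pi - 1e-6, size=2)
    d = x - y
    if F(y, d) >= 0.0 and F(x, d) < 0.0:
        violations += 1
print("violations of the pseudomonotone implication:", violations)   # expected: 0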

Theorem 3.3. Let U be a non-empty open and convex subset of Rn . Then, a


semidifferentiable function f : U → R is quasiconvex if and only if (df )+ (., x−y)
is quasimonotone on U .

Proof. Suppose that the semidifferentiable function f is quasiconvex on U, i.e.

f(x) ≤ f(y) ⟹ (df)+(y, x − y) ≤ 0, ∀x ≠ y ∈ U.   (10)


Let any x, y ∈ U, x ≠ y, be such that
(df)+(y, x − y) > 0.   (11)
We have to show that (df)+(x, x − y) ≥ 0. From the inequality (10), we get
(df)+(y, x − y) > 0 ⟹ f(x) > f(y).
Since f is quasiconvex on U,
f(y) < f(x) ⟹ (df)+(x, y − x) ≤ 0,
i.e. (df)+(x, x − y) ≥ 0.
Hence, (df)+(·, x − y) is quasimonotone on U.
Conversely, suppose that (df)+(·, x − y) is quasimonotone on U. Assume that f is not quasiconvex; then ∃ x, y ∈ U such that f(x) ≤ f(y), and λ̄ ∈ (0, 1) such that, for x̄ = λ̄x + (1 − λ̄)y,
f(λ̄x + (1 − λ̄)y) > f(y),
f(x) ≤ f(y) < f(x̄).   (12)
By the Mean-Value Theorem ∃ x̂ = λ̂x + (1 − λ̂)y and x* = λ*x + (1 − λ*)y such that
f(x̄) − f(x) = (df)+(x̂, x̄ − x),   (13)
and
f(x̄) − f(y) = (df)+(x*, x̄ − y).   (14)
Here, 0 < λ* < λ̄ < λ̂ < 1; λ* and λ̂ are close to 0 and 1, respectively. From statements (12) and (13), we get
(df)+(x̂, x̄ − x) > 0.   (15)
From statements (12) and (14), we get
(df)+(x*, x̄ − y) > 0.   (16)
From (15) and (16), we have
(df)+(x̂, (1 − λ̄)(y − x)) > 0 (∵ x̄ − x = (1 − λ̄)(y − x))   (17)
and
(df)+(x*, λ̄(x − y)) > 0 (∵ x̄ − y = λ̄(x − y)).   (18)
Since 1 − λ̄ > 0 and λ̄ > 0, from (17) and (18) we have
(df)+(x̂, y − x) > 0   (19)
and
(df)+(x*, x − y) > 0.   (20)
Inequalities (19) and (20) can be rewritten as
(df)+(x̂, (x* − x̂)/(λ̂ − λ*)) > 0,
i.e.
(df)+(x̂, x* − x̂) > 0,   (21)
and
(df)+(x*, (x̂ − x*)/(λ̂ − λ*)) > 0,
i.e.
(df)+(x*, x* − x̂) < 0.   (22)
From (21) and (22), we have
(df)+(x̂, x* − x̂) > 0 ⟹ (df)+(x*, x* − x̂) < 0,
which contradicts the quasimonotonicity of (df)+(·, x − y) on U.
Therefore, f is quasiconvex on U. □

Remark 3.1. Theorems 3.2 and 3.3 extend Theorem 3.1 of Karamardian [1] and Proposition 5.2 of Karamardian and Schaible [3], respectively, to the semidifferentiable case.

Example 3.3. Let f : R → R be defined by
f(x) = 2x + 1 for x ≤ −1,   x for −1 ≤ x ≤ 1,   2x − 1 for x ≥ 1.
Here, f is semidifferentiable and quasiconvex on R. The semidifferential of f is given by
(df)+(y, x − y) = x − y for −1 ≤ y ≤ 1,   2(x − y) otherwise,
which is quasimonotone on R.
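Again, an illustrative Python sketch (ours) approximates (df)+ by a one-sided difference quotient and checks the quasimonotone implication of Remark 2.5 on random pairs:

import numpy as np

# Illustrative check for Example 3.3: approximate (df)+ numerically and test
# the quasimonotone implication on random pairs (x, y).
def f(t):
    if t <= -1.0: return 2.0*t + 1.0
    if t <=  1.0: return t
    return 2.0*t - 1.0

def df_plus(y, d, lam=1e-7):
    return (f(y + lam*d) - f(y)) / lam

rng = np.random.default_rng(2)
violations = 0
for _ in range(100000):
    x, y = rng.uniform(-3.0, 3.0, size=2)
    d = x - y
    if df_plus(y, d) > 0.0 and df_plus(x, d) < -1e-6:
        violations += 1
print("violations of the quasimonotone implication:", violations)   # expected: 0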

Acknowledgements. The first author is financially supported by Department of Sci-


ence and Technology, SERB, New Delhi, India, through grant no.: MTR/2018/000121.
The second author is financially supported by CSIR-UGC JRF, New Delhi, India,
through Reference no.: 1272/(CSIR-UGC NET DEC.2016). The third author is finan-
cially supported by UGC-BHU Research Fellowship, through sanction letter no:
Ref.No./Math/Res/ Sept.2015/2015-16/918.
References
1. Karamardian, S.: Complementarity problems over cones with monotone and pseu-
domonotone maps. J. Optim. Theory Appl. 18, 445–454 (1976)
2. Rockafellar, R.T.: Characterization of the subdifferentials of convex functions.
Pacific J. Math. 17, 497–510 (1966)
3. Karamardian, S., Schaible, S.: Seven kinds of monotone maps. J. Optim. Theory
Appl. 66(1), 37–46 (1990)
4. Minty, G.J.: On the monotonicity of the gradient of a convex function. Pacific J.
Math. 14, 243–247 (1964)
5. Ye, M., He, Y.: A double projection method for solving variational inequalities
without monotonicity. Comput. Optim. Appl. 60(1), 141–150 (2015)
6. Kaul, R.N., Kaur, S.: Generalizations of convex and related functions. European
J. Oper. Res. 9(4), 369–377 (1982)
7. Delfour, M.C.: Introduction to optimization and semidifferential calculus. Society
for Industrial and Applied Mathematics (SIAM). Philadelphia (2012)
8. Penot, J.-P., Quang, P.H.: Generalized convexity of functions and generalized
monotonicity of set-valued maps. J. Optim. Theory Appl. 92(2), 343–356 (1997)
9. Komlósi, S.: Generalized monotonicity and generalized convexity. J. Optim. Theory
Appl. 84(2), 361–376 (1995)
10. Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill Book Co., New York-
London-Sydney (1969)
11. Castellani, M., Pappalardo, M.: On the mean value theorem for semidifferentiable
functions. J. Global Optim. 46(4), 503–508 (2010)
12. Durdil, J.: On Hadamard differentiability. Comment. Math. Univ. Carolinae 14,
457–470 (1973)
13. Penot, J.-P.: Calcul sous-différentiel et optimisation. J. Funct. Anal. 27(2), 248–276
(1978)
14. Delfour, M.C., Zolésio, J.-P.: Shapes and geometries, Society for Industrial and
Applied Mathematics (SIAM). Philadelphia (2001)
15. Giannessi, F., Maugeri, A. (eds.): Variational Inequalities and Network Equilibrium
Problems. Plenum Press, New York (1995)
Parallel Multi-memetic Global Optimization
Algorithm for Optimal Control
of Polyarylenephthalide’s Thermally-
Stimulated Luminescence

Maxim Sakharov and Anatoly Karpenko

Bauman MSTU, Moscow, Russia


max.sfn90@gmail.com

Abstract. This paper presents a modification of a parallel multi-memetic global optimization algorithm based on the Mind Evolutionary Computation algorithm, which is designed for loosely coupled computing systems. The algorithm implies a two-level adaptation strategy based on the proposed landscape analysis procedure and the utilization of multi-memes. It is also consistent with the architecture of loosely coupled computing systems due to a new static load balancing procedure that allows more computational resources to be allocated to promising sub-areas of the search domain while maintaining an approximately equal load of the computational nodes. The new algorithm and its software implementation were utilized to solve a computationally expensive optimal control problem for a model of the chemical reaction dynamics of the thermally-stimulated luminescence of polyarylenephthalides. Results of the numerical experiments are presented in this paper.

Keywords: Global optimization  Parallel algorithms 


Multi-memetic algorithms

1 Introduction

Many real-world global optimization problems are computationally expensive due to the non-trivial landscape of the objective function and the high dimension of the problem. To cope with such problems within a reasonable time, it is required to utilize parallel computing systems. Nowadays, grid systems made of heterogeneous personal computers (desktop grids) are widely used for scientific computations [1]. Such systems belong to the class of loosely coupled computing systems. Their popularity is caused by a relatively low cost and simple scaling. On the other hand, desktop grids require intermediate software to organize communication between computing nodes as well as task scheduling.
In general, real-world global optimization problems are frequently solved using various population-based algorithms [2]. One of the main advantages of this class of algorithms, apart from their simplicity of implementation, is a high probability of localizing so-called sub-optimal solutions, in other words, solutions that are close to the global optimum. In many real-world optimization problems, such solutions are


sufficient. However, the efficiency of population-based algorithms heavily depends on the numeric values of their free parameters, which should be selected based on the characteristics of the problem at hand.
It should also be noted that when one deals with computationally expensive objective functions, the number of evaluations becomes crucial. Additionally, empirical studies suggest that the more information on a problem is incorporated into an algorithm, the better it operates [3]. However, it is not always feasible to modify an algorithm or tune it for every optimization problem. This is why modern optimization techniques often employ preliminary analysis and preprocessing of a problem. This includes initial data analysis, dimensionality reduction of the search domain, landscape analysis of the objective function, etc. [4].
In [5] the authors proposed a two-level adaptation technique for population-based
algorithms, designed to extract information from an objective function prior to the
optimization process at the first level and provide adaptation capabilities for fine tuning
at the second level. The first level is based on the proposed landscape analysis (LA)
method which utilizes a concept of Lebesgue integrals and allows grouping objective
functions into three categories. Each category suggests a usage of specific values of the
basic algorithm’s free parameters. At the second level it was proposed to utilize multi-
memetic hybridization of the basic algorithm with suitable local search techniques.
Such an approach helps to adjust the algorithm to a specific problem while maintaining
some adaptation capability in cases where the LA procedure fails to determine all of the objective function's distinct features.
When solving an optimization problem on parallel computing systems in general,
and on loosely coupled systems in particular, one of the main difficulties is the optimal
mapping problem [6] – how to distribute groups of sub-problems over processors. It
should be noted that a problem of optimal mapping of computational processes onto a
parallel computing system is one of the main issues associated with parallel compu-
tations. It is well known that such a problem is NP-complete and can be solved with
exact methods within a very narrow class of problems [6]. Various methods of load
balancing are applied to obtain an approximate solution of the optimal mapping
problem. The main idea behind those methods is to distribute the computations over the
processors in such a way that the total computing and communication load is
approximately the same for each processor.
In [7] the authors proposed a static load balancing method for loosely coupled
systems that minimizes the number of interactions between computation nodes. The
static load balancing is based on the results of the LA procedure and helps to allocate
more computational resources to the promising search domain’s sub-areas. Such a tight
integration between the algorithm and the load balancing provides consistency of the
algorithm with the architecture of the computing system.
This work deals with the Simple Mind Evolutionary Computation (Simple MEC,
SMEC) algorithm [8]. It was selected for investigation because it is highly suitable for
parallel computations, especially for loosely coupled systems. In general, to be efficient
on loosely coupled systems, a basic optimization algorithm must imply a minimum
number of interactions between sub-populations which evolve on separate computing
nodes. Only a few currently known population-based algorithms, including the SMEC
algorithm, meet this requirement.
This paper presents the modified parallel MEC algorithm with the incorporated LA
procedure and accompanied with the static load balancing method. An outline of the
algorithm as well as the brief description of its software implementation are described
in this paper. In addition, the computationally expensive optimal control problem of
polyarylenephthalide’s thermally-stimulated luminescence was studied in this work and
solved using the proposed technique.

2 Problem Statement and the SMEC Algorithm

In this paper we consider a deterministic global constrained minimization problem

  min_{X ∈ D ⊂ R^n} Φ(X) = Φ(X*) = Φ*.        (1)

Here Φ(X) is the scalar objective function, Φ(X*) = Φ* is the required minimal value, X = (x_1, x_2, ..., x_n) is the n-dimensional vector of variables, R^n is the n-dimensional arithmetical space, and D is the constrained search domain.
Initial values of the vector X are generated within a domain D_0, which is defined as follows:

  D_0 = { X | x_i^min ≤ x_i ≤ x_i^max, i ∈ [1 : n] } ⊆ R^n.

In this work, the SMEC algorithm is considered as a basic algorithm. It belongs to a


class of MEC algorithms [9] inspired by a human society and simulate some aspects of
human behavior. An individual s is considered as an intelligent agent which operates in
a group S made of analogous individuals. During the evolution process every indi-
vidual is affected by other individuals within a group. This simulates the following
logic. In order to achieve a high position within a group, an individual has to learn from
the most successful individuals in this group. Groups themselves should follow the
same principle to stay alive in the intergroup competition. The detailed description of
the SMEC algorithm is presented in [10].

3 Parallel M3MEC Algorithm

This paper presents the new parallel Modified Multi-Memetic MEC (M3MEC) algorithm. The SMEC algorithm is based on three stages: initialization, similar taxis and dissimilation [10]. In turn, the initialization stage of the M3MEC algorithm contains the LA procedure, which is based on the concept of the Lebesgue integral [5, 11] and divides the objective function's range space into levels based on the values Φ(X). This stage can be described as follows.
1. Generate N quasi-random n-dimensional vectors within the domain D_0. In this work the LPτ sequence was used to generate quasi-random numbers since it provides a high-quality coverage of a domain.
2. For every X_r, r ∈ [1 : N], calculate the corresponding value of the objective function Φ_r and sort the vectors in ascending order of the values Φ_r, r ∈ [1 : N].
3. Equally divide the set of vectors (X_1, X_2, ..., X_N) into K sub-domains so that sub-domain k_1 contains the lowest values Φ(X).
4. For every sub-domain k_l, l ∈ [1 : K], calculate the value of its diameter d_l, i.e., the maximum Euclidean distance between any two individuals within this sub-domain (Fig. 1).

Fig. 1. Determining a diameter of the first sub-domain for the benchmark composition function 1 from CEC'14 (a: distribution of individuals for four sub-populations; b: determining a diameter of the first sub-population).

5. Build a linear approximation of the dependency of the diameter d on the sub-domain number l, using the least squares method.
6. Put the objective function Φ(X) into one of three categories based on the calculated values (Table 1). Each of the three categories represents a certain topology of the objective function Φ(X).

Table 1. Classification of objective functions based on the LA results.

  d(l) increases: nested sub-domains with the dense first domain (category I)
  d(l) neither increases nor decreases: non-intersecting sub-domains of the same size (category II)
  d(l) decreases: distributed sub-domains with potential minima (category III)

There are three possible cases for the approximated dependency d(l): d can be an increasing function of l; d can decrease as l grows; or d(l) can be neither decreasing nor increasing. Within the scope of this work it is assumed that the latter scenario takes place when the slope angle of the approximated line is within ±5°. Each case corresponds to a certain set of numeric values of M3MEC's free parameters suggested on the basis of numerical studies [10].
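For illustration, the LA procedure above can be prototyped in a few lines. The sketch below is not the authors' Mathematica implementation; it is a hedged Python approximation in which a Sobol sequence stands in for the LPτ generator, the helper name and the assumption that d(l) and l are comparably scaled for the ±5° rule are ours.

```python
import numpy as np
from scipy.stats import qmc

def landscape_analysis(f, bounds, N=512, K=8):
    """Sketch of the LA stage: sample quasi-randomly, sort by objective value,
    split into K equal sub-domains, compute their diameters d_l and classify
    the objective by the slope of a least-squares line through d(l)."""
    bounds = np.asarray(bounds, dtype=float)            # shape (n, 2)
    pts = qmc.Sobol(d=len(bounds), scramble=False).random(N)
    X = qmc.scale(pts, bounds[:, 0], bounds[:, 1])
    X = X[np.argsort([f(x) for x in X])]                # ascending objective values
    groups = np.array_split(X, K)                       # k_1 holds the best points
    d = [np.linalg.norm(g[:, None, :] - g[None, :, :], axis=-1).max() for g in groups]
    slope = np.polyfit(np.arange(1, K + 1), d, 1)[0]    # least-squares fit of d(l)
    angle = np.degrees(np.arctan(slope))                # assumes d and l are comparably scaled
    if abs(angle) <= 5.0:
        return "II"    # non-intersecting sub-domains of similar size
    return "I" if angle > 5.0 else "III"
```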
The similar taxis stage was modified in M3MEC in order to include meme selection and a local improvement stage. Meme selection is performed in accordance with the simple random hyper-heuristic [12]. Once the most suitable meme is selected for a specific sub-population, it is applied randomly to a half of its individuals for k_ls = 10 iterations. The dissimilation stage of SMEC was not modified in M3MEC. To handle the constraints of the search domain D, the death penalty technique [2] was utilized during the similar taxis stage.
In this work four local search methods were utilized, namely, Nelder-Mead method
[13], Hooke-Jeeves method [14], Monte-Carlo method [15], and Random Search on a
Sphere [16]. Only zero-order methods were used to deal with problems where the
objective function’s derivative is not available explicitly and its approximation is
computationally expensive.
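A minimal sketch of the simple random hyper-heuristic used at this stage is given below. The meme callables, their signature meme(f, x0, max_iter), and the function name are placeholders of our own for the four zero-order methods listed above; k_ls = 10 follows the setting quoted in the text.

```python
import numpy as np

def apply_random_meme(group, f, memes, rng, k_ls=10):
    """Simple random hyper-heuristic (sketch): draw one meme for the whole
    sub-population and refine a randomly chosen half of its individuals for
    k_ls local-search iterations.  `memes` is a list of callables of the
    hypothetical form meme(f, x0, max_iter) -> improved point."""
    meme = memes[rng.integers(len(memes))]
    chosen = rng.choice(len(group), size=len(group) // 2, replace=False)
    for idx in chosen:
        group[idx] = meme(f, group[idx], max_iter=k_ls)
    return group
```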
The similar taxis and dissimilation stages are performed in parallel and independently for each sub-population. To map those sub-populations onto the available computing nodes, a static load balancing method was proposed by the authors [5] specifically for loosely coupled systems.
We modify the initialization stage described above so that at step 2, apart from calculating the values of the objective function Φ_r, the time t_r required for those calculations is also measured. The proposed adaptive load balancing method can be described as follows (a simplified sketch is given after the list).
1. For each sub-population K_l, l ∈ [1 : |K|], we analyze all time measurements t_r for the corresponding vectors X_r, r ∈ [1 : N/|K|], to determine whether there are outliers or not.
2. All found outliers are excluded from the sub-populations. A new sub-population is composed of those outliers and can be investigated at the user's request after the computational process is over.
3. All available computing nodes are sorted by their computational power; then the first sub-population K_1 is sent to the first node.
4. Individuals in the other sub-populations are re-distributed between neighboring sub-populations, starting from K_2, so that the average calculation time would be approximately the same for every sub-population. The balanced sub-populations K_l, l ∈ [2 : |K|], are then mapped onto the computational nodes.
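The following is a rough Python sketch of steps 1–4; the outlier rule (three standard deviations from the median) and the greedy redistribution pass are our simplifying assumptions, not the exact procedure of [7].

```python
import numpy as np

def balance_subpopulations(subpops, times):
    """Sketch of the static load balancing: move timing outliers into a
    separate sub-population, then shift the slowest individuals towards the
    neighbouring sub-population until average evaluation times roughly match."""
    clean, outliers = [], []
    for P, t in zip(subpops, times):
        t = np.asarray(t, dtype=float)
        ok = np.abs(t - np.median(t)) <= 3.0 * t.std()     # outlier rule (assumption)
        clean.append([list(P[ok]), list(t[ok])])
        outliers.extend(P[~ok])
    target = np.mean([tt for _, ts in clean for tt in ts])  # common average time
    for l in range(1, len(clean) - 1):                      # start from K_2, as in step 4
        P, t = clean[l]
        while len(t) > 1 and np.mean(t) > target:
            j = int(np.argmax(t))                           # hand the slowest point on
            clean[l + 1][0].append(P.pop(j))
            clean[l + 1][1].append(t.pop(j))
    return [np.array(P) for P, _ in clean], np.array(outliers)
```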
The modified similar taxis stage along with the dissimilation stage are launched on
each node with the specific values of the free parameters in accordance with the results
of landscape analysis. Each computing node utilizes stagnation of the computational process as a termination criterion, while the algorithm as a whole works in a synchronous mode, so that the final result is calculated once all computing nodes have completed their tasks.
4 Optimal Control of Polyarylenephthalide's Thermally-Stimulated Luminescence

The Modified Multi-Memetic MEC (M3MEC) algorithm, along with the utilized memes, was implemented by the authors in Wolfram Mathematica. The software implementation has a modular structure, which makes it easy to modify the algorithm and extend it with additional assisting methods and memes. The proposed parallel algorithm and its software implementation were used to solve an optimal control problem for the thermally-stimulated luminescence of polyarylenephthalides (PAP).
Nowadays, organic polymer materials are widely used in the field of optoelec-
tronics. The polyarylenephthalides are high-molecular compositions that belong to a
class of unconjugated cardo polymers. PAPs exhibit good optical and electrophysical
characteristics along with the thermally-stimulated luminescence. Determining the
origins of PAP’s luminescent states is of both fundamental and practical importance
[17].

4.1 Thermally-Stimulated Luminescence of Polyarylenephthalides


Physical experiments [18] suggest that there are at least two types of stable reactive
species which are produced in PAP. These species have different activation energy
levels or in other words trap states. The dynamic model studied in this work was
proposed in the Institute of Petrochemistry and Catalysis of Russian Academy of
Science (IPC RAS) [18, 19]. It includes the following processes: recombination of stable ion-radicals (y1, y2); repopulation of ion-radical trap states; and luminescence, i.e., deactivation of the excited state y3 with emission of a quantum of light y4.
The model can be represented as follows:

  y1'(t) = −374 · exp(−69944 / (10^−10 + 8.31·T(t))) · y1(t)²,
  y2'(t) = −396680 · exp(−101630 / (10^−10 + 8.31·T(t))) · y2(t)² + 1.99·10^8 · exp(−21610 / (10^−10 + 8.31·T(t))) · y3(t),
  y3'(t) = 187 · exp(−69944 / (10^−10 + 8.31·T(t))) · y1(t)² + 198340 · exp(−101630 / (10^−10 + 8.31·T(t))) · y2(t)²
           − 2·10^10 · y3(t) − 9.98·10^7 · exp(−21610 / (10^−10 + 8.31·T(t))) · y3(t),
  y4'(t) = 2·10^10 · y3(t).                                                        (2)

Here y1, y2 represent the initial stable species of various nature; y3 is a certain excited state towards which y1 and y2 evolve; y4 denotes the quanta of light. T(t) is the reaction temperature. The luminescence intensity I(t) is calculated according to the formula I(t) = 544663240 · y3(t); relative units are used for measuring I(t). The initial concentrations of the species in the reaction are y1(0) = 300, y2(0) = 1000, y3(0) = y4(0) = 0. The integration interval is [0, 2000] seconds.
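Model (2) is stiff because of the 2·10^10 deactivation rate, which is why the BDF method is used later in the paper. Below is a hedged Python transcription of (2) using SciPy; the constants are copied from the system above, the temperature profile T(t) is passed in as a callable (taken to be in kelvin, matching the 298–460 K bounds quoted below), and the function names are our own.

```python
import numpy as np
from scipy.integrate import solve_ivp

EPS = 1e-10   # small constant in the Arrhenius denominators of (2)

def rhs(t, y, T):
    """Right-hand side of model (2); T(t) is the reaction temperature (assumed kelvin)."""
    denom = EPS + 8.31 * T(t)
    k1 = np.exp(-69944.0 / denom)
    k2 = np.exp(-101630.0 / denom)
    k3 = np.exp(-21610.0 / denom)
    y1, y2, y3, _ = y
    return [
        -374.0 * k1 * y1 ** 2,
        -396680.0 * k2 * y2 ** 2 + 1.99e8 * k3 * y3,
        187.0 * k1 * y1 ** 2 + 198340.0 * k2 * y2 ** 2 - 2e10 * y3 - 9.98e7 * k3 * y3,
        2e10 * y3,
    ]

def simulate_intensity(T, t_eval):
    """Integrate (2) on [0, 2000] s with the stiff BDF solver and return I(t)."""
    sol = solve_ivp(rhs, (0.0, 2000.0), [300.0, 1000.0, 0.0, 0.0],
                    method="BDF", t_eval=t_eval, args=(T,))
    return 544663240.0 * sol.y[2]          # I(t) = 544663240 * y3(t), relative units
```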
4.2 Optimal Control Problem and Numerical Experiments


For the chemical reaction it is required to determine the law of variation of the temperature over time, T(t), which would guarantee the desired law of variation of the thermally-stimulated luminescence intensity I(t) of PAP. The physical experimental setup for the chemical reaction imposes restrictions on the minimal and maximal values of the temperature: 298 K ≤ T(t) ≤ 460 K.
The optimal control problem was transformed in this work into a global optimization problem in the following manner. The integration interval [0, 2000] is discretized so that the length of one section [t_i, t_{i+1}] meets the restrictions imposed by the experimental setup on the velocity of the change in temperature T(t). The values T(t_i) are the components of the vector X = (x_0, ..., x_{n−1}). A piecewise linear function was selected for the approximation of T(t). The following objective function was proposed in this study:

  J(T(t)) = ∫_0^2000 (I_ref(t) − I(T(t)))² dt → min over T(t).        (3)
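A possible way to evaluate objective (3) for one candidate vector X is sketched below: the components of X are interpreted as the temperatures T(t_i) at the discretisation nodes, a piecewise-linear interpolant defines T(t), the intensity comes from the simulate_intensity() helper sketched in Sect. 4.1 (our own name), and the integral is approximated by the trapezoid rule. Rejecting out-of-bounds controls with an infinite penalty mirrors the death-penalty technique mentioned earlier and is our simplification.

```python
import numpy as np

def control_objective(x, t_nodes, I_ref, n_quad=401):
    """Sketch of J(T(t)) from (3) for one candidate control vector x = T(t_i)."""
    if np.any(x < 298.0) or np.any(x > 460.0):         # 298 K <= T(t) <= 460 K
        return np.inf                                   # death-penalty style rejection
    T = lambda t: np.interp(t, t_nodes, x)              # piecewise-linear T(t)
    t_eval = np.linspace(0.0, 2000.0, n_quad)
    I = simulate_intensity(T, t_eval)                   # helper sketched above (assumption)
    return np.trapz((I_ref(t_eval) - I) ** 2, t_eval)   # trapezoid-rule approximation
```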

The global minimization problem (3) was solved using the proposed M3MEC-P algorithm and its software implementation. The following values of the algorithm's free parameters were utilized: the number of groups γ = 90; the number of individuals in each group |S| = 50; the stagnation iteration number k_stop = 100; the tolerance used for identifying stagnation ε = 10^−5. All computations were performed with the use of a desktop grid made of eight personal computers that did not communicate with each other. The number of sub-populations |K| = 8 was selected to be equal to the number of computing nodes. In order to increase the probability of localizing the global optimum, the multi-start method with 15 launches was used. The BDF integration method was utilized at every evaluation to solve (2).
The first set of experiments was devoted to studying the dynamics of model (2) under constant values of the temperature in the reaction within the range T = 150–215 °C with a step size of 10 degrees (Fig. 2). The obtained results demonstrate that under any constant temperature the luminescence intensity I(t) decreases over time. Furthermore, the higher the temperature, the faster the luminescence intensity decreases.
The second set of experiments was devoted to maintaining a constant value of the luminescence intensity I(t). The results obtained for the target value I_ref(t) = 300 are displayed in Fig. 3. They suggest that in order to maintain a constant value of I(t), the reaction temperature has to increase approximately according to a linear law (by about 1 °C every 300 s). This implies a restriction on the reaction time, as it is impossible to maintain constant growth of the temperature in the experimental setup.
The third set of experiments was conducted to determine the law of variation of the temperature T(t) that would provide the required pulse changes in the luminescence intensity I(t). The obtained results are presented in Fig. 4. The optimal control trajectory repeats the required law of I(t) variation with the addition of a linear increasing trend.
Fig. 2. PAP's luminescence intensity under various constant values of the temperature T ∈ [150, 215] °C.

Fig. 3. Optimal control for maintaining the constant value of PAP's luminescence intensity. (a) Obtained and required constant luminescence intensity; (b) obtained optimal control temperature T(t), °C, over t ∈ [0, 2000] s.

Fig. 4. Optimal control for providing pulse changes in the luminescence intensity. (a) Obtained and required luminescence intensity with two pulses; (b) obtained optimal control temperature T(t), °C, over t ∈ [0, 2000] s.
Figure 5 displays the results of the numerical experiments that were conducted in order to determine the law of variation of the reaction temperature T(t) that would provide harmonic oscillations of the luminescence intensity I(t) with an amplitude of 40 relative units and an oscillation period of approximately 200 s. Once again, the optimal control trajectory repeats the required law of I(t) variation with the addition of a linear increasing trend.

Fig. 5. Optimal control for providing harmonic oscillations of the luminescence intensity. (a) Obtained and required harmonic oscillations of the luminescence intensity; (b) obtained optimal control temperature T(t), °C, over t ∈ [0, 2000] s.

5 Conclusions

This paper presents the new modified parallel population-based global optimization algorithm designed for loosely coupled systems and its software implementation. The
M3MEC-P algorithm is based on the adaptation strategy and the landscape analysis
procedure originally proposed by the authors and incorporated into the traditional
SMEC algorithm.
The algorithm is capable of adapting to various objective functions using both static
and dynamic adaptation. Static adaptation was implemented with a use of landscape
analysis, while dynamic adaptation was made possible by utilizing several memes. The
proposed landscape analysis is based on the concept of the Lebesgue integral and allows one to group objective functions into three categories. Each category suggests the usage of a specific set of values for the algorithm's free parameters. The proposed algorithm and its software implementation proved to be efficient when solving a real-world
computationally expensive global optimization problem: determination of kinetics of
the thermally-stimulated luminescence of polyarylenephthalides.
Further research will be devoted to the study of asynchronous stopping criteria, as
well as the investigation of different architectures of loosely coupled systems.

Acknowledgments. This work was supported by the RFBR under a grant 18-07-00341.
References
1. Sakharov, M.K., Karpenko, A.P., Velisevich, Ya.I.: Multi-memetic mind evolutionary computation algorithm for loosely coupled systems of desktop computers. In: Science and Education of the Bauman MSTU, vol. 10, pp. 438–452 (2015). https://doi.org/10.7463/1015.0814435
2. Karpenko, A.P.: Modern algorithms of search engine optimization. Nature-inspired
optimization algorithms. Moscow, Bauman MSTU Publ., p. 446 (2014)
3. Neri, F., Cotta, C., Moscato, P.: Handbook of Memetic Algorithms, pp. 368. Springer, Berlin
(2011). https://doi.org/10.1007/978-3-642-23247-3
4. Mersmann, O. et al.: Exploratory landscape analysis. In: Proceedings of the 13th Annual
Conference on Genetic and Evolutionary Computation. ACM, pp. 829–836. (2011). https://
doi.org/10.1145/2001576.2001690
5. Sakharov, M., Karpenko, A.: Multi-memetic mind evolutionary computation algorithm
based on the landscape analysis. In: Theory and Practice of Natural Computing. 7th
International Conference, TPNC 2018, Dublin, Ireland, 12–14 Dec 2018, Proceedings,
pp. 238–249. Springer (2018). https://doi.org/10.1007/978-3-030-04070-3
6. Voevodin, V.V., Voevodin, Vl. V.: Parallel Computations, p. 608. BHV-Peterburg, SPb.
(2004)
7. Sakharov, M.K., Karpenko, A. P.: Adaptive load balancing in the modified mind
evolutionary computation algorithm. In: Supercomputing Frontiers and Innovations, 5(4),
5–14 (2018). https://doi.org/10.14529/jsfi180401
8. Jie, J., Zeng, J.: Improved mind evolutionary computation for optimizations. In: Proceedings
of 5th World Congress on Intelligent Control and Automation, Hang Zhou, China, pp. 2200–
2204 (2004). https://doi.org/10.1109/WCICA.2004.1341978
9. Chengyi, S., Yan, S., Wanzhen, W.: A Survey of MEC: 1998-2001. In: 2002 IEEE
International Conference on Systems, Man and Cybernetics IEEE SMC2002, Hammamet,
Tunisia. October 6–9. Institute of Electrical and Electronics Engineers Inc., vol. 6, pp. 445–
453 (2002). https://doi.org/10.1109/ICSMC.2002.1175629
10. Sakharov, M., Karpenko, A.: Performance investigation of mind evolutionary computation
algorithm and some of its modifications. In: Proceedings of the First International Scientific
Conference “Intelligent Information Technologies for Industry” (IITI’16), pp. 475–486.
Springer (2016). https://doi.org/10.1007/978-3-319-33609-1_43
11. Sakharov, M., Karpenko, A.: A new way of decomposing search domain in a global
optimization problem. In: Proceedings of the Second International Scientific Conference
“Intelligent Information Technologies for Industry” (IITI’17), pp. 398–407. Springer (2018).
https://doi.org/10.1007/978-3-319-68321-8_41
12. Ong, Y.S., Lim, M.H., Zhu, N., Wong, K.W.: Classification of adaptive memetic algorithms:
a comparative study. In: IEEE Transactions on Systems, Man, and Cybernetics, Part B:
Cybernetics, pp. 141–152 (2006)
13. Nelder, J.A., Meade, R.: A Simplex method for function minimization. Comput. J. 7, 308–
313 (1965)
14. Karpenko, A.P.: Optimization Methods (Introductory Course), http://bigor.bmstu.ru/.
Accessed 25 Mar 2019
15. Sokolov, A.P., Pershin, A.Y.: Computer-aided design of composite materials using
reversible multiscale homogenization and graph-based software engineering. Key Eng.
Mater. 779, 11–18 (2018). https://doi.org/10.4028/www.scientific.net/KEM.779.11
Parallel Multi-memetic Global Optimization Algorithm 201

16. Agasiev, T., Karpenko, A.: The program system for automated parameter tuning of
optimization algorithms. Proc. Comput. Sci. 103, 347–354 (2017). https://doi.org/10.1016/j.
procs.2017.01.120
17. Antipin, V.A., Shishlov, N.M., Khursan, S.L.: Photoluminescence of polyarylenephthalides.
VI. DFT study of charge separation process during polymer photoexcitation. Bulletin of
Bashkir University, vol. 20, Issue 1, pp. 30–42 (2015)
18. Akhmetshina, L.R., Mambetova, Z.I., Ovchinnikov, M.Y.: Mathematical modeling of
thermoluminescence kinetics of polyarylenephthalides. In: V International Scientific
Conference on Mathematical Modeling of Processes and Systems, pp. 79–83 (2016)
19. Antipin, V.A., Mamykin, D.A., Kazakov, V.P.: Recombination luminescence of poly
(arylene phthalide) films induced by visible light. High Energy Chem. 45(4), 352–359
(2011)
Proper Choice of Control Parameters for CoDE Algorithm

Petr Bujok, Daniela Einšpiglová, and Hana Zámečníková

University of Ostrava, 30. Dubna 22, 70200 Ostrava, Czech Republic
{petr.bujok,daniela.einspiglova,hana.zamecnikova}@osu.cz
Abstract. An adaptive variant of the CoDE algorithm uses three couples of settings of two control parameters. These combinations provide good performance when solving various types of optimisation problems. The aim of the paper is to replace the original values of the control parameters in CoDE to achieve better efficiency on real-world problems. Two different variants of the enhanced CoDE algorithm are proposed and compared with the original CoDE variant. The new combinations of the F and CR parameters are selected from the results of a preliminary study in which 441 various combinations of these parameters were evaluated. The results show that the newly proposed CoDE variants (CoDEFCR1 and CoDEFCR2) perform better than the original CoDE in most of the 22 real-world problems.

Keywords: Global optimisation · Differential evolution · Control parameters · CoDE · Real-world problems · Experimental comparison

1 Introduction
A proper setting of the control parameters of the Differential Evolution (DE) algorithm plays an important role when solving various optimisation problems. More precisely, there is no single setting which performs best on most of the problems (No-Free-Lunch theorem [16]). Although there are many approaches for adapting the values of DE parameters, none of them is able to be the best.
This paper is focused on a more proper setting of two control parameters of the DE algorithm based on preliminary work. Our preliminary comprehensive experiment provides very interesting results of the DE algorithm solving real-world problems. A large number of combinations of the two DE control parameters are studied and evaluated on selected problems in order to rank them from more efficient to less efficient.
Although DE has only a few control parameters, its efficiency is very sensitive to the setting of the F and CR values. Unfortunately, simple trial-and-error tuning of the parameters requires a lot of time. Several authors recommended settings of the DE control parameters [7,9,10]; unfortunately, these values are valid only for a part of optimisation problems. As a result, many adaptive mechanisms controlling the values of F and CR have been proposed, e.g. [1,11,12,14,15]. A summary of DE research has been presented
recently in several comprehensive papers [4,6,8]. One of the well-performing
adaptive DE algorithms is called CoDE [14], and in this paper the CoDE algorithm was selected for enhancing the settings of its control parameters. This algorithm differs from other DE variants by using three different triplets of control parameters.
The three couples of the original CoDE setting are replaced by two other sets of three couples which achieved substantial efficiency in our preliminary experiments. The original and the two newly proposed variants of the CoDE algorithm are compared on the set of real-world optimisation problems of CEC 2011 [5]. The main goal of this paper is to show whether a better-performing setting of the DE control parameters increases the efficiency of the CoDE algorithm substantially.
In global optimisation, problems are represented by an objective function f(x), x = (x_1, x_2, ..., x_D) ∈ IR^D, defined on the search domain Ω limited by lower and upper boundaries, i.e. Ω = ∏_{j=1}^{D} [a_j, b_j], a_j < b_j, j = 1, 2, ..., D. The solution of the problem is the global minimum point x*, which satisfies the condition f(x*) ≤ f(x), ∀x ∈ Ω.
The rest of the paper is organised as follows. Section 2 describes the adaptive DE algorithm used in the experiments. A brief report of the experimental study regarding the DE control parameters is given in Sect. 3. The newly proposed variants of adaptive DE algorithms are described in Sect. 4. The experimental settings and the methods applied for statistical assessment are described in Sect. 5. Experimental results on real-world optimisation problems are presented in Sect. 6. Section 7 brings the conclusion of the paper with some final remarks.

2 Adaptive Variant of CoDE


In this experiment we use the DE algorithm with composite trial vector generation strategies and control parameters (called CoDE) presented by Wang et al. in 2011 [14]. The authors of CoDE compared this algorithm with four adaptive DE variants (jDE, SaDE, JADE, EPSDE), and the results showed that CoDE is at least competitive with the algorithms in the comparison.
In CoDE, three well-studied trial vector strategies with three control parameter settings are randomly combined to generate trial vectors. The strategies are rand/1/bin, rand/2/bin, and current-to-rand/1. Therefore, three different offspring vectors are generated and the vector with the least function value from this triplet is selected as the trial vector. This mechanism promises faster convergence because the best 'possible' solution is preferred. On the other hand, for each solution three vectors are evaluated by the objective function, which reduces the time available for the search (the time is measured by the number of function evaluations). The values of the control parameters F and CR are also chosen randomly from the parameter pools containing [F = 1.0, CR = 0.1], [F = 1.0, CR = 0.9], and [F = 0.8, CR = 0.2].
After the current-to-rand/1 mutation, no crossover is applied because this strategy includes the so-called arithmetic crossover, which makes it rotation invariant.
Although CoDE is not very often used in real applications, the results of the following experiments show that there are some problems in which this DE variant performs better compared to other algorithms [2,3].
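The composite generation step described above can be summarised in the following Python sketch. It follows the published CoDE scheme (three strategies, a random draw from the parameter pool for each of them, best-of-three selection), but the function and variable names, the omitted boundary handling, and the random-number plumbing are our own simplifications.

```python
import numpy as np

POOL = [(1.0, 0.1), (1.0, 0.9), (0.8, 0.2)]    # original CoDE (F, CR) pool

def code_trial(pop, i, f, rng):
    """Generate the CoDE trial vector for individual i: build one candidate with
    each strategy, each with its own randomly drawn (F, CR), keep the best."""
    N, D = pop.shape
    others = [j for j in range(N) if j != i]

    def sample(k):
        return pop[rng.choice(others, size=k, replace=False)]

    def crossover(v, CR):
        mask = rng.random(D) < CR
        mask[rng.integers(D)] = True               # keep at least one mutant component
        return np.where(mask, v, pop[i])

    candidates = []
    F, CR = POOL[rng.integers(len(POOL))]          # rand/1/bin
    r1, r2, r3 = sample(3)
    candidates.append(crossover(r1 + F * (r2 - r3), CR))

    F, CR = POOL[rng.integers(len(POOL))]          # rand/2/bin
    r1, r2, r3, r4, r5 = sample(5)
    candidates.append(crossover(r1 + F * (r2 - r3) + F * (r4 - r5), CR))

    F, _ = POOL[rng.integers(len(POOL))]           # current-to-rand/1, no binomial crossover
    r1, r2, r3 = sample(3)
    k = rng.random()
    candidates.append(pop[i] + k * (r1 - pop[i]) + F * (r2 - r3))

    return min(candidates, key=f)                  # least objective value wins
```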

3 Efficiency of DE Control Parameters


A lot of different approaches to adapting the values of F and CR during the search process of the DE algorithm have been studied. Several widely used adaptive DE variants are described and partially compared in [4,6,13]. Although many adaptive DE algorithms are very efficient, the values of the F and CR control parameters are usually sampled from the interval [0, 1].
It is clear that although one combination of F, CR values provides the best results for one optimisation problem, the same setting may perform poorly in another task. When a relatively wide sampling interval of the control parameters is used, it offers the possibility of also drawing some efficient settings. On the other hand, a relatively large number of unsuccessful F, CR values is also sampled, which can cause slow convergence of the DE algorithm.
This fact was the main inspiration for the experiment in which the usual sampling interval of the DE control parameters is divided equidistantly to obtain a large number of combinations of F, CR values. All the combinations were evaluated on real-world problems and the obtained results were statistically assessed. In our experimental study, equidistant values with step 0.05 are used, i.e. the 21 values {0, 0.05, 0.1, 0.15, ..., 0.95, 1} are used for each parameter. It is necessary to note that the most frequently used mutation variant rand/1 and binomial crossover were selected for this experiment. This gives in total 21 × 21 = 441 combinations of the F and CR settings. Each setting was evaluated on the 22 real-world CEC 2011 optimisation problems; details of this set are provided in Sect. 5.
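The grid itself and the classic DE/rand/1/bin used to evaluate it can be sketched as follows; the population size N = 100 and the 150,000-evaluation budget match Sect. 5, while the coarse budget check, the bound clipping and the function names are our simplifications.

```python
import numpy as np
from itertools import product

GRID = [round(0.05 * k, 2) for k in range(21)]       # {0, 0.05, ..., 1}
SETTINGS = list(product(GRID, GRID))                 # 441 (F, CR) combinations

def de_rand1_bin(f, bounds, F, CR, N=100, max_fes=150_000, seed=0):
    """Classic DE/rand/1/bin with a fixed (F, CR) pair -- the variant whose 441
    settings were ranked in the preliminary study (a simplified sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T       # bounds as an (n, 2) array
    pop = lo + rng.random((N, len(lo))) * (hi - lo)
    fit = np.array([f(p) for p in pop])
    fes = N
    while fes + N <= max_fes:
        for i in range(N):
            r1, r2, r3 = pop[rng.choice([j for j in range(N) if j != i], 3, replace=False)]
            v = np.clip(r1 + F * (r2 - r3), lo, hi)
            mask = rng.random(len(lo)) < CR
            mask[rng.integers(len(lo))] = True
            u = np.where(mask, v, pop[i])
            fu = f(u)
            if fu <= fit[i]:
                pop[i], fit[i] = u, fu
        fes += N
    return pop[fit.argmin()], fit.min()
```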
All predefined combinations are statistically assessed using the non-parametric Friedman test. It provides a global insight into the performance of the classic DE variant under the various control parameter settings. The test was carried out on the medians of the minimal function values at the end of the search. The null hypothesis of equivalent efficiency of the settings was rejected with p < 5 × 10⁻⁶. Each combination of F and CR is evaluated by a mean-rank value which represents its overall performance across all selected problems; lower mean-rank values thus indicate DE settings which provide better results over all problems.
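The statistical assessment can be reproduced along the following lines; `medians` is a hypothetical (problems × settings) array of the recorded median function values, and SciPy's Friedman test and ranking utilities are used in place of whatever statistics package the authors employed.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

def friedman_mean_ranks(medians):
    """medians[p, s]: median best value of setting s on problem p.
    Returns the Friedman p-value and the mean rank of every setting
    (lower mean rank = better overall performance)."""
    stat, p_value = friedmanchisquare(*medians.T)      # one sample per setting
    ranks = np.apply_along_axis(rankdata, 1, medians)  # rank settings within each problem
    return p_value, ranks.mean(axis=0)
```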
The plot representing the efficiency of all 441 combinations of F and CR on all 22 real-world problems is shown in Fig. 1. The most efficient combination (the smallest mean-rank value) is represented by a black square and the least efficient setting (the biggest mean-rank value) by a white square. We can see that there is an interesting continuous dark area in which the best combination, F = 0.45 and CR = 0.95, is located. Conversely, the worst performance is provided by the combination F = 0 and CR = 1. This is caused by a zero-diversity mutation step (F = 0) combined with a high propagation of such solutions (CR = 1). There are two bright regions where the efficiency of the DE setting is rather poor, especially where CR is close to 1 and F is close to zero.
Fig. 1. Mean ranks from the Friedman test for all 441 F, CR settings and all problems.
4 Proposed Variants of CoDE Algorithm

The preliminary experiment with many combinations of the F and CR parameters shows that the three original combinations of DE parameters used in CoDE are not very efficient. The rank of [F = 1, CR = 0.1] is 221, the combination [F = 1, CR = 0.9] is on the 418th position, and the last setting, [F = 0.8, CR = 0.2], has rank 286. The aim is to select another three combinations of F and CR that achieved better overall results. Note that the preliminary results were obtained only with the classical DE and rand/1/bin; CoDE is based on a rand approach (rand/1, rand/2, and current-to-rand/1), so a behaviour similar to classical DE is assumed.
In this paper, two different well-performing settings of the DE parameters are selected to replace the original values in the adaptive CoDE algorithm. For simplicity, the newly proposed algorithms are labelled CoDEFCR1 and CoDEFCR2, and a detailed description of both variants is given in the following paragraphs.

4.1 CoDEFCR1 : Better Average Performing Setting

The original CoDE algorithm uses three combinations of F and CR settings which achieve absolute ranks 221, 286, and 418, i.e. the estimated average mean rank of the CoDE setting is approximately 308.
In the first proposed enhanced variant, CoDEFCR1, three different couples of control parameters with good efficiency are selected to replace the original settings. The best results are provided by [F = 0.45, CR = 0.95] (absolute rank 1), the combination [F = 0.95, CR = 1] achieves absolute rank 8, and the combination [F = 0.05, CR = 0.05] is on the 20th position. These settings achieve an average rank of about 10 in the preliminary experiment. These substantially better-performing combinations of control parameters are used in CoDE to increase its performance. The remaining setting of CoDEFCR1 is the same as the setting of the original CoDE algorithm.

4.2 CoDEFCR2 : Worse Average Performing Setting


The best performing setting used in the previously enhanced CoDE algorithm ([F = 0.45, CR = 0.95]) is replaced by the combination [F = 0.3, CR = 0.8], which achieves the 46th position in the preliminary experiment. The other two settings remain the same: the combination [F = 0.95, CR = 1] is on the 8th position, and [F = 0.05, CR = 0.05] is on the 20th position. The estimated average mean rank of the settings of CoDEFCR2 is 18. All remaining parameters of this algorithm are set according to the original CoDE variant.

5 Experimental Settings
The main aim of this study is to increase the efficiency of CoDE algorithm on
real-world problems. Therefore, the test suite of 22 real-world problems selected
for CEC 2011 competition in Special Session on Real-Parameter Numerical Opti-
mization [5] is used as a benchmark in the experimental comparison. The func-
tions in the benchmark differ in the computational complexity and in the dimen-
sion of the search space which varies from D = 1 to D = 240.
For each algorithm and problem, 25 independent runs were carried out. The
run of the algorithm stops if the prescribed number of function evaluations
MaxFES = 150000 is reached. The partial results of the algorithms after reach-
ing one third and two-thirds of MaxFES were also recorded for further analysis.
The point in the terminal population with the smallest function value is taken as the solution of the problem found in the run. The minimal function values of the problems are unknown; the algorithm providing a lower function value is considered better performing.
The experiments in this paper can be divided into two parts. In the first part, a classic DE algorithm with the rand/1/bin strategy and 441 different combinations of the F and CR settings was studied, as mentioned in Sect. 3. The only remaining control parameter is the population size, which is set to N = 100.
In the second part of the experiment, the original CoDE algorithm and two
newly proposed enhanced CoDE variants (CoDEFCR1 , CoDEFCR2 ) were applied
to the set of 22 real-world problems. The names of the newly proposed variants
are abbreviated to FCR1 (CoDEFCR1 ) and FCR2 (CoDEFCR2 ) in some parts
of the results. The other control parameters are set up according to the recom-
mendation of authors in the original paper. All the algorithms are implemented
in Matlab 2017b, and all computations were carried out on a standard PC with
Windows 7, Intel(R) Core(TM)i7-4790 CPU 3.6 GHz, 16 GB RAM.

6 Results
The original CoDE algorithm and two newly proposed enhanced CoDE variants
are compared on 22 real-world problems. A global insight into the overall performance of the algorithms is provided by the Friedman statistical test applied to the median values for each problem and algorithm. The null hypothesis about the equality of the algorithms' results was rejected in each stage of the run, with the significance level set at 1 × 10⁻³. For better illustration, the mean ranks of the three compared algorithms are shown in Fig. 2.
We can see that the original CoDE algorithm performs substantially worse compared to the newly proposed variants. The performance of CoDEFCR1 is better than that of the original CoDE, and the difference between the algorithms decreases with an increasing number of function evaluations. On the other hand, the second proposed variant, CoDEFCR2, performs the best, and the efficiency of this algorithm is rather invariant with increasing function evaluations. The better average-performing combinations of F and CR used in CoDEFCR1 lead to better results compared to the original CoDE algorithm. It is surprising that the on-average worse performing combinations of F and CR (where the best combination is replaced by a substantially worse performing one) used in CoDEFCR2 achieve the best performance over all real-world problems.

Fig. 2. Mean rank values for three CoDE variants in three stages of the search from
Friedman test.

More detailed results from the comparison of the three CoDE variants are provided by the non-parametric Kruskal-Wallis test with Dunn's multiple comparison method. This test is applied to each problem separately to show which setting of the control parameters in CoDE is more efficient. The null hypothesis about the algorithms' performance was rejected in most of the problems at the significance level 1 × 10⁻⁴.
Table 1. Median values of three CoDE variants and results of Kruskal-Wallis tests.

Fun     D    CoDE best   CoDE median  FCR1 best   FCR1 median  FCR2 best   FCR2 median  Best (K-W)     Worst (K-W)
T01       6  0.57406     2.69825      0           0            0           8.81E-29     FCR1,FCR2      CoDE
T02      30  −15.366     −11.6394     −22.1027    −20.1225     −22.7103    −20.2401     FCR1,FCR2      CoDE
T03       1  1.15E-05    1.15E-05     1.15E-05    1.15E-05     1.15E-05    1.15E-05     No difference
T04       1  0           0            0           0            0           0            No difference
T05      30  −33.5321    −29.8468     −33.2764    −32.2624     −33.5565    −32.3392     FCR2,FCR1      CoDE
T06      30  −26.5649    −21.6885     −28.0248    −25.9769     −28.1222    −26.0169     FCR2,FCR1      CoDE
T07      20  1.20988     1.39838      1.04545     1.40338      1.18516     1.36425      No difference
T08       7  220         220          220         220          220         220          No difference
T09     126  1823.74     2719.55      1323.17     2128.9       1188.79     1746.66      FCR2,FCR1      CoDE
T10      12  −21.8285    −21.5793     −21.8425    −21.6437     −21.8425    −21.6437     FCR1,FCR2      CoDE
T11.1   120  51855.2     53034.7      51439.1     52220.5      51299.7     52179.2      FCR2,FCR1      CoDE
T11.2    96  1070680     1074570      1084660     1130970      1069540     1113210      CoDE           FCR2,FCR1
T11.3   240  15444.2     15444.2      15444.2     15444.2      15444.2     15444.2      No difference
T11.4     6  18263.1     18372.8      18075.3     18085.5      18019.2     18080.9      FCR2,FCR1      CoDE
T11.5    13  32797.9     32847.6      32715.2     32766.3      32729.9     32768.2      FCR1,FCR2      CoDE
T11.6    15  124765      126703       126989      128813       126883      128564       CoDE           FCR2,FCR1
T11.7    40  1868890     1892380      1892840     1908440      1874920     1901360      CoDE,FCR2      FCR1
T11.8   140  934946      937003       935498      940008       934834      939806       CoDE           FCR2,FCR1
T11.9    96  940282      960844       944549      968972       941807      983218       No difference
T11.10   96  934483      936501       933303      939932       935862      938568       CoDE           FCR2,FCR1
T13      26  17.263      20.4819      13.555      18.0592      11.6715     18.2202      FCR2,FCR1      CoDE
T14      22  16.3306     21.14        8.93319     15.5489      8.73351     15.3736      FCR1,FCR2      CoDE
The minimum and median values of the 25 runs for each variant, along with the best and the worst performing variant for each problem, are given in Table 1.
In the 'best' column, the significantly better performing variants are listed, and in the 'worst' column, the variants providing the worst efficiency. It is clear that the most frequently worst performing variant is the original CoDE algorithm, which loses in 12 out of 22 problems. On the other hand, both newly proposed CoDE variants win in 12 problems out of 22; the variant CoDEFCR2 outperforms CoDEFCR1 significantly only in one problem. In six problems out of 22, all CoDE variants perform similarly. On the left side of the table, the least median value of each problem is printed in bold and underlined in the original typeset table. The original CoDE provides the best median value in 7 out of 22 problems, the CoDEFCR1 variant is best performing in 4 out of 22 problems, and CoDEFCR2 achieves the least median in 9 out of 22 problems.

Table 2. Average frequencies of use of three CoDE variants over 22 real-world problems.

CoDE:      (F, CR) pool: (1, 0.1), (1, 0.9), (0.8, 0.2)
           rand/1/bin: 11.8, 10.3, 18.8;  rand/2/bin: 3.4, 2.1, 19.2;  current-to-rand/1: 6.2, 5.4, 22.9
CoDEFCR1:  (F, CR) pool: (0.05, 0.05), (0.45, 0.95), (0.95, 1)
           rand/1/bin: 12.2, 10.1, 26.3;  rand/2/bin: 6.3, 3.9, 18.8;  current-to-rand/1: 5, 3.5, 13.9
CoDEFCR2:  (F, CR) pool: (0.05, 0.05), (0.3, 0.8), (0.95, 1)
           rand/1/bin: 11.9, 9.8, 25.6;  rand/2/bin: 5.1, 3.2, 21.4;  current-to-rand/1: 5.5, 3.9, 13.8

Furthermore, the frequencies of use of all nine settings in each of the three CoDE variants are studied. The three compared CoDE variants use the same three strategies (rand/1/bin, rand/2/bin, and current-to-rand/1), which are combined with three couples of F and CR. The average frequencies over the 25 runs for each problem were computed and are illustrated in Fig. 3. The problems (horizontal axis) are sorted by dimension; problems with the same dimension are ordered by their names (for example, the problem T03 comes first and T04 second). The same types of lines are used for the same strategies. There is no big difference between the plots of CoDEFCR1 and CoDEFCR2. Although different combinations of parameters are used (F = 0.45 and CR = 0.95 versus F = 0.3 and CR = 0.8), both settings are used with similar frequencies. The only visible difference is observed for the strategy rand/2/bin with F = 0.95 and CR = 1, which is used more frequently in CoDEFCR2.
Fig. 3. Frequencies of strategies in CoDE, CoDEFCR1 , and CoDEFCR2 for 22 real-world


problems (rand/1/bin–dot line, rand/2/bin–dash line, current-to-rand/1–solid line).

In the original CoDE variant, the settings with F = 0.8 and CR = 0.2 are used about as frequently as the combinations with F = 0.95 and CR = 1 in the new variants, and the curves for CoDE are more similar to each other.
7 Conclusion
Based on the experimental results, it is obvious that the original CoDE algorithm performs substantially worse compared to the newly proposed variants. The performance of CoDEFCR1 is better than that of the original CoDE, and the difference between the algorithms decreases with an increasing number of function evaluations. The second proposed variant, CoDEFCR2, performs best, and its efficiency is rather invariant with increasing function evaluations. The worse average-performing combinations of F and CR used in CoDEFCR2 achieve the best overall performance on the real-world problems. The most frequently worst-performing variant is the original CoDE algorithm, which loses in 12 out of 22 problems, while both newly proposed CoDE variants win in 12 problems out of 22. CoDEFCR2 outperforms CoDEFCR1 significantly only in one problem. The proposed variants of the CoDE algorithm are able to outperform the original CoDE. It is interesting that the rather worse settings used in CoDEFCR2 achieve slightly better results than CoDEFCR1 with the best combination of F and CR. The proposed methods outperform the winner of the CEC 2011 competition (GA-MPC) in 7 out of 22 problems. More proper combinations of control parameters in adaptive DE variants will be studied in further research.

References
1. Brest, J., Maučec, M.S., Bošković, B.: Single objective real-parameter optimization:
algorithm jSO. In: 2017 IEEE Congress on Evolutionary Computation (CEC), pp.
1311–1318 (2017)
2. Bujok, P.: Migration model of adaptive differential evolution applied to real-world
problems. Artificial Intelligence and Soft Computing–Part I. Lecture Notes in Com-
puter Science, vol. 10841, pp. 313–322. In: 17th International Conference on Arti-
ficial Intelligence and Soft Computing ICAISC. Zakopane, Poland (2018)
3. Bujok, P., Tvrdík, J., Poláková, R.: Differential evolution with exponential crossover revisited. In: Matoušek, R. (ed.) MENDEL, 22nd International Conference on Soft Computing, pp. 17–24. Brno, Czech Republic (2016)
4. Das, S., Mullick, S.S., Suganthan, P.N.: Recent advances in differential evolution-an
updated survey. Swarm Evol. Comput. 27, 1–30 (2016)
5. Das, S., Suganthan, P.N.: Problem Definitions and Evaluation Criteria for CEC
2011 Competition on Testing Evolutionary Algorithms on Real World Optimiza-
tion Problems. Tech. rep. Jadavpur University, India and Nanyang Technological
University, Singapore (2010)
6. Das, S., Suganthan, P.N.: Differential evolution: a survey of the state-of-the-art.
IEEE Trans. Evol. Comput. 15, 27–54 (2011)
7. Feoktistov, V.: Differential Evolution: In Search of Solutions. Springer (2006)
8. Neri, F., Tirronen, V.: Recent advances in differential evolution: a survey and
experimental analysis. Artif. Intell. Rev. 33, 61–106 (2010)
9. Price, K.V., Storn, R., Lampinen, J.: Differential Evolution: A Practical Approach
to Global Optimization. Springer (2005)
10. Storn, R., Price, K.V.: Differential evolution–a simple and efficient heuristic for
global optimization over continuous spaces. J. Glob. Optim. 11, 341–359 (1997)
212 P. Bujok et al.

11. Tang, L., Dong, Y., Liu, J.: Differential evolution with an individual-dependent
mechanism. IEEE Trans. Evol. Comput. 19(4), 560–574 (2015)
12. Tvrdík, J.: Competitive differential evolution. In: Matoušek, R., Ošmera, P. (eds.) MENDEL 2006, 12th International Conference on Soft Computing, pp. 7–12. University of Technology, Brno (2006)
13. Tvrdík, J., Poláková, R., Veselský, J., Bujok, P.: Adaptive variants of differential evolution: towards control-parameter-free optimizers. In: Zelinka, I., Snášel, V., Abraham, A. (eds.) Handbook of Optimization: From Classical to Modern Approach. Intelligent Systems Reference Library, vol. 38, pp. 423–449. Springer, Berlin Heidelberg (2012)
14. Wang, Y., Cai, Z., Zhang, Q.: Differential evolution with composite trial vector
generation strategies and control parameters. IEEE Trans. Evol. Comput. 15, 55–
66 (2011)
15. Wang, Y., Li, H.X., Huang, T., Li, L.: Differential evolution based on covariance
matrix learning and bimodal distribution parameter setting. Appl. Soft Comput.
18, 232–247 (2014)
16. Wolpert, D.H., Macready, W.G.: No free lunch theorems for optimization. IEEE
Trans. Evol. Comput. 1, 67–82 (1997)
Semidefinite Programming Based Convex Relaxation for Nonconvex Quadratically Constrained Quadratic Programming

Rujun Jiang¹ and Duan Li²

¹ School of Data Science, Fudan University, Shanghai, China
rjjiang@fudan.edu.cn
² School of Data Science, City University of Hong Kong, Hong Kong, China
dli226@cityu.edu.hk
Abstract. In this paper, we review recent developments in semidefinite programming (SDP) based convex relaxations for nonconvex quadratically constrained quadratic programming (QCQP) problems. QCQP problems are well known to be NP-hard nonconvex problems. We focus on convex relaxations of QCQP, which form the basis of global algorithms for solving QCQP. We review SDP relaxations, the reformulation-linearization technique, SOC-RLT constraints and various other techniques based on lifting and linearization.
1 Introduction
We consider in this survey paper the following class of quadratically constrained quadratic programming (QCQP) problems:

(P)  min  x^T Q_0 x + c_0^T x
     s.t. x^T Q_i x + c_i^T x + d_i ≤ 0,  i = 1, ..., l,
          a_j^T x ≤ b_j,  j = 1, ..., m,

where Q_i is an n × n symmetric matrix, c_i ∈ R^n, i = 0, ..., l, d_i ∈ R, i = 1, ..., l, and a_j ∈ R^n, b_j ∈ R, j = 1, ..., m.
QCQP is in general NP-hard [14,18], although some special cases of QCQP are polynomially solvable [3–6,17]. A global optimal solution of QCQP is generally hard to obtain. Branch-and-bound methods have been developed in the literature to find exact solutions for QCQP problems [8,13]; their efficiency depends on two major factors: the quality of the relaxation bound and its associated computational cost. This paper focuses on a review of various semidefinite programming (SDP) based convex relaxations strengthened with different valid inequalities for QCQP problems. (This work was supported by the Shanghai Sailing Program 18YF1401700, the Natural Science Foundation of China (NSFC) 11801087, and the Hong Kong Research Grants Council under Grants 14213716 and 14202017.) We will also point out that for several special problems,
SDP relaxations enhanced with valid inequalities may be tight for the original
problems, i.e., there exists a rank one SDP solution and an optimal solution of
the original problem can be recovered from the SDP solution.
The remainder of this paper is organized as follows. In Sect. 2, we review various valid inequalities to strengthen the basic SDP relaxation. We conclude the paper in Sect. 3.
Notations. We use v(·) to denote the optimal value of problem (·). Let ‖x‖ denote the Euclidean norm of x, i.e., ‖x‖ = √(x^T x), and ‖A‖_F denote the Frobenius norm of a matrix A, i.e., ‖A‖_F = √(tr(A^T A)). The notation A ⪰ 0 means that the matrix A is positive semidefinite and symmetric, and the notation A ⪰ B for matrices A and B means that A − B ⪰ 0 and both A and B are symmetric. The inner product of two symmetric matrices is defined by A · B = Σ_{i,j=1,...,n} A_ij B_ij, where A_ij and B_ij are the (i, j) entries of A and B, respectively. We also use A_{i,·} and A_{·,i} to denote the ith row and column of matrix A, respectively. For a positive semidefinite n × n matrix A with spectral decomposition A = U^T D U, where D is an n × n diagonal matrix and U is an n × n orthogonal matrix, we use the notation A^{1/2} to denote U^T D^{1/2} U, where D^{1/2} is a diagonal matrix with √(D_ii) being its ith entry.

2 Convex Relaxations Based only on Constraints

In this section, we review the basic SDP relaxation for problem (P) and its
strengthened variants with RLT, SOC-RLT, GSRT and other valid inequalities.
By lifting x to a matrix X = x x^T and relaxing X = x x^T to X ⪰ x x^T, which is further equivalent to [1, x^T; x, X] ⪰ 0 due to the Schur complement, we have the following basic SDP relaxation for problem (P):

(SDP)  min  Q_0 · X + c_0^T x
       s.t. Q_i · X + c_i^T x + d_i ≤ 0,  i = 1, ..., l,    (1)
            a_j^T x ≤ b_j,  j = 1, ..., m,                  (2)
            [1, x^T; x, X] ⪰ 0,                             (3)

where Q_i · X = trace(Q_i X) is the inner product of the matrices Q_i and X. Direct SDP relaxations are often loose, except for some special cases, e.g., problems with only one quadratic constraint [17] and homogeneous quadratic problems with two quadratic constraints [19]. The optimal solution of the SDP relaxation can also be used to generate a feasible solution for the original QCQP problem. One of the most famous cases is the max cut problem, where a randomized solution is expected to have an objective value that is at least 0.878 times the original optimal objective value [11]. Further investigation shows that the SDP relaxation is the conic dual of
the Lagrangian dual of problem (P),

(L)  max  τ
     s.t. [Q_0, c_0/2; c_0^T/2, −τ] + Σ_{i=1}^{l} λ_i [Q_i, c_i/2; c_i^T/2, d_i] + Σ_{j=1}^{m} μ_j [0, a_j/2; a_j^T/2, −b_j] ⪰ 0,
          λ_i ≥ 0, i = 1, ..., l,  μ_j ≥ 0, j = 1, ..., m,

also known as Shor's relaxation [16]. Strong duality holds between (L) and (SDP) when (SDP) is bounded from below and the Slater condition holds for (SDP).
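For concreteness, the basic relaxation (SDP) can be assembled with CVXPY as in the sketch below. The modelling follows (1)–(3) directly; the function name, the data layout and the choice of the SCS solver are ours, not part of the reviewed papers.

```python
import cvxpy as cp
import numpy as np

def basic_sdp_relaxation(Q, c, d, A, b):
    """Shor/basic SDP relaxation of (P).  Q[0], c[0] define the objective;
    Q[1:], c[1:], d give the quadratic constraints; rows of A and b the linear ones."""
    n = len(c[0])
    Y = cp.Variable((n + 1, n + 1), PSD=True)         # Y = [1 x^T; x X] >= 0, constraint (3)
    x, X = Y[0, 1:], Y[1:, 1:]
    cons = [Y[0, 0] == 1]
    for Qi, ci, di in zip(Q[1:], c[1:], d):            # quadratic constraints (1)
        cons.append(cp.trace(Qi @ X) + ci @ x + di <= 0)
    for aj, bj in zip(A, b):                           # linear constraints (2)
        cons.append(aj @ x <= bj)
    prob = cp.Problem(cp.Minimize(cp.trace(Q[0] @ X) + c[0] @ x), cons)
    prob.solve(solver=cp.SCS)
    return prob.value, x.value, X.value
```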
We next review valid inequalities that have been considered to strengthen (SDP) in the literature. Sherali and Adams [15] first introduced the concept of the "reformulation-linearization technique" (RLT) to formulate a linear programming relaxation for problem (P). The RLT [15] linearizes the product of any pair of linear constraints, i.e.,

  (b_i − a_i^T x)(b_j − a_j^T x) = b_i b_j − (b_j a_i^T + b_i a_j^T) x + a_i^T x x^T a_j ≥ 0.

A tighter relaxation for problem (P) can be obtained by enhancing the (SDP) relaxation with the RLT constraints:

(SDP_RLT)  min  Q_0 · X + c_0^T x
           s.t. (1), (2), (3),
                a_i a_j^T · X + b_i b_j − b_j a_i^T x − b_i a_j^T x ≥ 0,  ∀ 1 ≤ i < j ≤ m.   (4)

Anstreicher in [1] proposed a theoretical analysis for successfully applying RLT


constraints to remove a large portion of the feasible region for the relaxation,
and suggested that a combination of SDP and RLT constraints leads to a tighter
bound.
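Adding the RLT cuts (4) on top of the previous sketch only takes a short loop; note that a_i a_j^T · X equals a_i^T X a_j, which is the form used below. The helper name and the assumption that x, X and the constraint list come from the earlier hypothetical sketch are ours.

```python
def add_rlt_cuts(cons, x, X, A, b):
    """Append the RLT constraints (4) to an existing list of CVXPY constraints;
    x and X are the variables of the relaxation sketched above (an assumption)."""
    m = len(b)
    for i in range(m):
        for j in range(i + 1, m):
            cons.append(A[i] @ X @ A[j] + b[i] * b[j]
                        - b[j] * (A[i] @ x) - b[i] * (A[j] @ x) >= 0)
```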
From now on, we assume that Q_i is not a zero matrix for i = 1, ..., l. We further partition the quadratic constraints into the following two groups:

  C = {i : Q_i is positive semidefinite, i = 1, ..., l},
  N = {i : Q_i is not positive semidefinite, i = 1, ..., l},

and denote by k (k ≤ l) the cardinality of C. Sturm and Zhang [17] showed


that combining the so-called SOC-RLT constraints and basic SDP relaxation
can solve the problem of minimizing a quadratic objective function subject to
a convex quadratic constraint and a linear constraint exactly. More specifically,
they rewrote a convex quadratic constraint as a second order cone (SOC) con-
straint and linearized the product of the SOC and linear constraints. It has been
shown in [7] and [17] that SOC-RLT constraints can be used to strengthen the
convex relaxation (SDPRLT ) for general QCQP problems. To obtain the SOC-
RLT valid inequality, we need first decompose a positive semidefinite matrix Qi
as Qi = BiT Bi , i ∈ C and rewrite the convex quadratic constraint in an SOC
form, i.e.,

  x^T Q_i x + c_i^T x + d_i ≤ 0  ⟺  ‖B_i x‖² ≤ −d_i − c_i^T x  (with −d_i − c_i^T x ≥ 0)
                                 ⟺  ‖ [B_i x ; ½(−d_i − c_i^T x − 1)] ‖ ≤ ½(−d_i − c_i^T x + 1).   (5)

Multiplying both sides of the above SOC constraint by the nonnegative linear term b_j − a_j^T x ≥ 0 yields the following valid inequality,

  ‖ [B_i x ; ½(1 + d_i + c_i^T x)] ‖ (b_j − a_j^T x) ≤ ½ (b_j − a_j^T x)(1 − d_i − c_i^T x).

Linearization of the above inequality gives the following SOC-RLT constraint,

  ‖ [B_i (b_j x − X a_j) ; ½(−c_i^T X a_j + (b_j c_i^T − d_i a_j^T − a_j^T) x + (1 + d_i) b_j)] ‖
      ≤ ½ (c_i^T X a_j + (d_i a_j^T − a_j^T − b_j c_i^T) x + (1 − d_i) b_j),   i ∈ C, j = 1, ..., m.   (6)

Enhancing (SDP_RLT) with the SOC-RLT constraints gives rise to a tighter relaxation:

(SDP_SOC-RLT)  min  Q_0 · X + c_0^T x
               s.t. (1), (2), (3), (4), (6).
Recently, Burer and Yang [9] demonstrated that the SDP+RLT+(SOC-RLT)
relaxation has no gap in an extended trust region problem of minimizing a
quadratic function subject to a unit ball and multiple linear constraints, where
the linear constraints do not intersect with each other in the interior of the ball.
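The SOC-RLT cuts can be generated in the same setting: the sketch below forms, for one convex quadratic constraint (Q_i ⪰ 0) and every linear constraint, the second-order-cone constraint (6) derived above. The symmetric square root is one admissible choice of B_i, and the helper name and reuse of the earlier variables x, X and the constraint list are our assumptions.

```python
import numpy as np
import cvxpy as cp
from scipy.linalg import sqrtm

def add_soc_rlt_cuts(cons, x, X, Qi, ci, di, A, b):
    """Append the SOC-RLT constraints (6) for one constraint with Qi PSD."""
    Bi = np.real(sqrtm(Qi))                       # any Bi with Bi^T Bi = Qi will do
    for aj, bj in zip(A, b):
        head = Bi @ (bj * x - X @ aj)
        tail = 0.5 * (-ci @ X @ aj + (bj * ci - di * aj - aj) @ x + (1 + di) * bj)
        rhs = 0.5 * (ci @ X @ aj + (di * aj - aj - bj * ci) @ x + (1 - di) * bj)
        cons.append(cp.SOC(rhs, cp.hstack([head, cp.reshape(tail, (1,))])))
```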
Stimulated by the construction of the SOC-RLT constraints, the authors in [12] derived the GSRT constraints from nonconvex quadratic constraints and linear constraints. They first decompose each indefinite matrix in the quadratic constraints according to the signs of its eigenvalues, i.e., Q_i = L_i^T L_i − M_i^T M_i, i ∈ N, where L_i corresponds to the positive eigenvalues and M_i corresponds to the negative eigenvalues. One such decomposition is the spectral decomposition, Q_i = Σ_j λ_{i_j} v_{i_j} v_{i_j}^T (summing over the nonzero eigenvalues), where λ_{i_1} ≥ λ_{i_2} ≥ ··· ≥ λ_{i_r} > 0 > λ_{i_{p+1}} ≥ ··· ≥ λ_{i_n}, 0 ≤ r ≤ p < n, and correspondingly L_i = (√λ_{i_1} v_{i_1}, ..., √λ_{i_r} v_{i_r})^T, M_i = (√(−λ_{i_{p+1}}) v_{i_{p+1}}, ..., √(−λ_{i_n}) v_{i_n})^T. They introduced augmented variables z_i to reformulate problem (P) as follows,

(RP)  min  x^T Q_0 x + c_0^T x
      s.t. x^T Q_i x + c_i^T x + d_i ≤ 0,  i = 1, ..., l,
           ‖ [L_i x ; ½(c_i^T x + d_i + 1)] ‖ ≤ z_i,  i ∈ N,        (7)
           ‖ [M_i x ; ½(c_i^T x + d_i − 1)] ‖ = z_i,  i ∈ N,        (8)
           a_j^T x ≤ b_j,  j = 1, ..., m.
Denote [X, S; S^T, Z] = [x; z][x^T z^T]. We then relax the intractable nonconvex constraint [X, S; S^T, Z] = [x; z][x^T z^T] to [X, S; S^T, Z] ⪰ [x; z][x^T z^T], which is equivalent to the following LMI by the Schur complement,

  [1, x^T, z^T; x, X, S; z, S^T, Z] ⪰ 0.

Multiplying $b_j - a_j^T x \ge 0$ and $\left\|\begin{pmatrix} L_i x \\ \tfrac{1}{2}(c_i^T x + d_i + 1)\end{pmatrix}\right\| \le z_i$, we further get
$$\left\|\begin{pmatrix} L_i x\,(b_j - a_j^T x) \\ \tfrac{1}{2}(c_i^T x + d_i + 1)(b_j - a_j^T x)\end{pmatrix}\right\| \le z_i (b_j - a_j^T x),$$
i.e.,
$$\left\|\begin{pmatrix} L_i b_j x - L_i x x^T a_j \\ \tfrac{1}{2}\big(c_i^T (b_j x - x x^T a_j) + (d_i + 1)(b_j - a_j^T x)\big)\end{pmatrix}\right\| \le z_i b_j - z_i x^T a_j.$$
Then the linearization of the above formula gives rise to
$$\left\|\begin{pmatrix} L_i b_j x - L_i X a_j \\ \tfrac{1}{2}\big(c_i^T (b_j x - X a_j) + (d_i + 1)(b_j - a_j^T x)\big)\end{pmatrix}\right\| \le z_i b_j - S_{\cdot,i}^T a_j. \qquad (9)$$

Since the equality constraint (8) is nonconvex and intractable, relaxing (8) to an inequality yields the following tractable SOC constraint,
$$\left\|\begin{pmatrix} M_i x \\ \tfrac{1}{2}(c_i^T x + d_i - 1)\end{pmatrix}\right\| \le z_i. \qquad (10)$$

Similarly, linearizing the product of (10) and $b_j - a_j^T x$ gives rise to the following valid inequalities
$$\left\|\begin{pmatrix} M_i b_j x - M_i X a_j \\ \tfrac{1}{2}\big(c_i^T (b_j x - X a_j) + (d_i - 1)(b_j - a_j^T x)\big)\end{pmatrix}\right\| \le z_i b_j - S_{\cdot,i}^T a_j. \qquad (11)$$

We also linearize the quadratic form of (8),
$$\left\|\begin{pmatrix} M_i x \\ \tfrac{1}{2}(c_i^T x + d_i - 1)\end{pmatrix}\right\|^2 = z_i^2,$$
to a tractable linearization,
$$Z_{i-k,i-k} = X \cdot M_i^T M_i + \tfrac{1}{4}\big(c_i c_i^T \cdot X + (d_i - 1)^2 + 2\, c_i^T x\, (d_i - 1)\big). \qquad (12)$$
Finally, (7), (9), (10), (11) and (12) together make up the GSRT constraints. With the GSRT constraints, we strengthen $(\mathrm{SDP_{RLT}})$ to the following tighter relaxation:
$$(\mathrm{SDP_{GSRT\text{-}A}})\qquad \begin{array}{ll} \min & Q_0 \cdot X + c_0^T x\\ \text{s.t.} & (1),\ (2),\ (4),\ (6),\ (7),\ (9),\ (10),\ (11),\ (12),\\[4pt] & \begin{pmatrix} 1 & x^T & z^T \\ x & X & S \\ z & S^T & Z \end{pmatrix} \succeq 0.\end{array}$$
The following theorem, which shows the relationship among all the above convex relaxations, is obvious due to the nested inclusion relationship of the feasible regions for this sequence of relaxations.

Theorem 1. $v(\mathrm{P}) \ge v(\mathrm{SDP_{GSRT\text{-}A}}) \ge v(\mathrm{SDP_{SOC\text{-}RLT}}) \ge v(\mathrm{SDP_{RLT}}) \ge v(\mathrm{SDP})$.
A natural extension of GSRT is to apply a similar idea to linearize the product of a pair of SOC constraints. From the above paragraph, we see that SOC constraints can be generated by both convex and nonconvex (by adding an augmented variable $z_i$) constraints, denoted by
$$\|C^i x + \xi^i\| \le l_i(x, z), \qquad (13)$$

where $l_i(x, z)$ is a linear function of $x$ and $z$. Multiplying two SOC constraints yields the valid inequality,
$$\left\| C^s x x^T (C^t)^T + C^s x (\xi^t)^T + \xi^s x^T (C^t)^T + \xi^s (\xi^t)^T \right\|_F \le l_s l_t. \qquad (14)$$

Linearizing (14) yields the following constraint, termed the SOC-SOC-RLT (SST) constraint,
$$\left\| C^s X (C^t)^T + C^s x (\xi^t)^T + \xi^s x^T (C^t)^T + \xi^s (\xi^t)^T \right\|_F \le \beta_{s,t},$$
where $\beta_{s,t}(X, S, Z) = (\zeta^s)^T X \zeta^t + (\zeta^s)^T S \eta^t + (\zeta^t)^T S \eta^s + (\eta^s)^T Z \eta^t + (\theta^s \zeta^t + \theta^t \zeta^s)^T x + (\theta^s \eta^t + \theta^t \eta^s)^T z + \theta^s \theta^t$ is a linear function of the variables $x$, $z$, $X$, $S$, $Z$, which is the linearization of $l_s(x, z)\, l_t(x, z)$.
We next review a recently proposed valid inequality, the KSOC valid inequality, obtained by linearizing the Kronecker products of semidefinite matrices derived from valid SOC constraints, which was first proposed in the recent work [2]. Anstreicher [2] introduced a new kind of constraint with an RLT-like technique for the well-known CDT problem [10],
$$\begin{array}{ll}\min & x^T B x + b^T x\\ \text{s.t.} & \|x\| \le 1,\\ & \|Ax + c\| \le 1,\end{array}$$
where $B$ is an $n \times n$ symmetric matrix and $A$ is an $m \times n$ matrix with full row rank. Using the Schur complement, the two quadratic constraints can be reformulated to the following LMIs,
$$\begin{pmatrix} I & x \\ x^T & 1 \end{pmatrix} \succeq 0 \quad\text{and}\quad \begin{pmatrix} I & Ax + c \\ (Ax + c)^T & 1 \end{pmatrix} \succeq 0. \qquad (15)$$

Anstreicher [2] proposed a valid LMI by linearizing the Kronecker product of the above two matrices, because the Kronecker product of any two positive semidefinite matrices is positive semidefinite. A drawback of directly linearizing the Kronecker product is that the matrix dimension, $n^2 \times n^2$, is too high for
computation. To reduce the large dimension of the Kronecker matrix, he further proposed KSOC cuts to handle the problem of dimensionality. This idea can in fact be extended to the following two semidefinite matrices,
$$\begin{pmatrix} l_s(x, z) I_p & h_s(x) \\ (h_s(x))^T & l_s(x, z) \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} l_t(x, z) I_q & h_t(x) \\ (h_t(x))^T & l_t(x, z) \end{pmatrix},$$
which are derived from (and equivalent to) the GSOC constraints in (13) by the Schur complement, where $h_j(x) = C^j x + \xi^j$, $j = s, t$. Linearizing the above Kronecker product yields the KSOC valid inequalities.
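As a quick, purely illustrative NumPy check of the fact this construction rests on (the Kronecker product of two positive semidefinite matrices is positive semidefinite):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(k):
    G = rng.standard_normal((k, k))
    return G @ G.T                                 # G G^T is PSD by construction

A, B = random_psd(3), random_psd(4)
K = np.kron(A, B)                                  # 12 x 12 Kronecker product
print(np.linalg.eigvalsh(K).min() >= -1e-9)        # smallest eigenvalue is (numerically) nonnegative
```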

3 Concluding Remark
In this survey paper, we have reviewed various valid inequalities to tighten the SDP relaxations for nonconvex QCQP problems. In fact, we can further rewrite the objective function as $\min \tau$ and add a new constraint $x^T Q_0 x + c_0^T x \le \tau$, with a new variable $\tau$. The original problem is then equivalent to minimizing $\tau$, and all the techniques developed in this paper can be applied to the new constraint $x^T Q_0 x + c_0^T x \le \tau$ to achieve a tighter lower bound. A drawback of the convex relaxations in this paper is their large number of valid inequalities, which prevents efficient computation. A future direction is to investigate how to identify the valid inequalities that are violated most and add them dynamically when solving the original problem.

References
1. Anstreicher, K.: Semidefinite programming versus the reformulation-linearization
technique for nonconvex quadratically constrained quadratic programming. J.
Global Optim. 43(2–3), 471–484 (2009)
2. Anstreicher, K.: Kronecker product constraints with an application to the two-
trust-region subproblem. SIAM J. Optim. 27(1), 368–378 (2017)
3. Anstreicher, K., Chen, X., Wolkowicz, H., Yuan, Y.X.: Strong duality for a trust-
region type relaxation of the quadratic assignment problem. Linear Algebr. Its
Appl. 301(1–3), 121–136 (1999)
4. Anstreicher, K., Wolkowicz, H.: On lagrangian relaxation of quadratic matrix con-
straints. SIAM J. Matrix Anal. Appl. 22(1), 41–55 (2000)
5. Beck, A., Eldar, Y.C.: Strong duality in nonconvex quadratic optimization with
two quadratic constraints. SIAM J. Optim. 17(3), 844–860 (2006)
6. Burer, S., Anstreicher, K.: Second-order-cone constraints for extended trust-region
subproblems. SIAM J. Optim. 23(1), 432–451 (2013)
7. Burer, S., Saxena, A.: The MILP road to MIQCP. In: Mixed Integer Nonlinear
Programming, pp. 373–405. Springer (2012)
8. Burer, S., Vandenbussche, D.: A finite branch-and-bound algorithm for nonconvex
quadratic programming via semidefinite relaxations. Math. Program. 113(2), 259–
282 (2008)
9. Burer, S., Yang, B.: The trust region subproblem with non-intersecting linear con-
straints. Math. Program. 149(1–2), 253–264 (2013)
10. Celis, M., Dennis, J., Tapia, R.: A trust region strategy for nonlinear equality
constrained optimization. Numer. Optim. 1984, 71–82 (1985)
11. Goemans, M.X., Williamson, D.P.: Improved approximation algorithms for max-
imum cut and satisfiability problems using semidefinite programming. J. ACM
(JACM) 42(6), 1115–1145 (1995)
12. Jiang, R., Li, D.: Convex relaxations with second order cone constraints for non-
convex quadratically constrained quadratic programming (2016)
13. Linderoth, J.: A simplicial branch-and-bound algorithm for solving quadratically
constrained quadratic programs. Math. Program. 103(2), 251–282 (2005)
14. Pardalos, P.M., Vavasis, S.A.: Quadratic programming with one negative eigen-
value is NP-hard. J. Global Optim. 1(1), 15–22 (1991)
15. Sherali, H.D., Adams, W.P.: A reformulation-linearization technique for solving
discrete and continuous nonconvex problems, vol. 31. Springer Science & Business
Media (2013)
16. Shor, N.Z.: Quadratic optimization problems. Sov. J. Comput. Syst. Sci. 25(6),
1–11 (1987)
17. Sturm, J.F., Zhang, S.: On cones of nonnegative quadratic functions. Math. Oper.
Res 28(2), 246–267 (2003)
18. Vavasis, S.A.: Quadratic programming is in NP. Inf. Process. Lett. 36(2), 73–77
(1990)
19. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14(1),
245–267 (2003)
Solving a Type of the Tikhonov
Regularization of the Total Least Squares
by a New S-Lemma

Huu-Quang Nguyen1,2 , Ruey-Lin Sheu2(B) , and Yong Xia3


1 Department of Mathematics, Vinh University, Vinh, Vietnam
quangdhv@gmail.com
2 Department of Mathematics, National Cheng Kung University, Tainan, Taiwan
rsheu@mail.ncku.edu.tw
3 State Key Laboratory of Software Development Environment, School of Mathematics and System Sciences, Beihang University, Beijing, China
yxia@buaa.edu.cn

Abstract. We present a new S-lemma with two quadratic equalities and


use it to minimize a special type of polynomials of degree 4. As a result,
by the Dinkelbach approach with 2 SDP’s (semidefinite programming),
the minimum value and the minimum solution to the Tikhonov regu-
larization of the total least squares problem with L = I can be nicely
obtained.

Keywords: S-lemma with equality · Tikhonov regularization · Total


least squares · Dinkelbach method

1 Introduction

The well-known S-lemma due to Yakubovich [15] is a fundamental tool in control


theory, optimization and robust analysis. Given two quadratic functions f (x) =
xT P x+2pT x+p0 and g(x) = xT Qx+2q T x+q0 having symmetric matrices P and
Q, the S-lemma asserts that, if g(x) ≤ 0 satisfies Slater’s condition (i.e., g(x̄) < 0
for some x̄), the following two statements are always equivalent (S1 ) ∼ (S2 ) :

(S1 ) (∀x ∈ Rn ) g(x) ≤ 0 =⇒ f (x) ≥ 0.


(S2 ) There exists a λ ≥ 0 such that f (x) + λg(x) ≥ 0, ∀x ∈ Rn .

The S-lemma can be extended to deal with the equality $g(x) = 0$ along a series of approaches; for example, see [2,5,6,14]. They try to answer, for what pairs of $(f(x), g(x))$, the following two statements become equivalent $(E_1) \sim (E_2)$:
(E1 ) ∼ (E2 ) :

(E1 ) (∀x ∈ Rn ) g(x) = 0 =⇒ f (x) ≥ 0.


(E2 ) There exists a λ ∈ R such that f (x) + λg(x) ≥ 0, ∀x ∈ Rn .

The complete necessary and sufficient conditions for the pair of quadratic
functions (f(x), g(x)) under which (E1) ∼ (E2) were established by Xia et al.
[13] with new applications to both quadratic optimization and the convexity of
the joint numerical range. As a further extension, Wang and Xia [12] established
the so-called S-lemma with interval bounds:

(I1 ) (∀x ∈ Rn ) (u0 ≤ g(x) ≤ v0 =⇒ f (x) ≥ 0);


(I2 ) There exists a λ ∈ R such that f (x) + λ+ (g(x) − v0 ) + λ− (g(x) − u0 ) ≥
0, ∀x ∈ Rn .

It has direct applications in the extended trust-region subproblem [7,11].


More importantly, it helps to guarantee the strong Lagrangian duality under the
most mild assumptions [12]. Other extensions, such as the one by Polyak [8],
focused on the system of three homogeneous quadratic forms. More discussions
can be found in the survey paper [3].
In this paper, we obtain a new variant of the S-lemma. Given $a, b, c \in \mathbb{R}^m$ and $\Theta = \Theta^T = \begin{pmatrix} \theta_1 & \theta_2 \\ \theta_2 & \theta_3 \end{pmatrix} \in \mathbb{R}^{2\times 2}$, $\theta \in \mathbb{R}^2$, $\gamma \in \mathbb{R}$, this new version asks when the following two statements become equivalent $(G_1) \sim (G_2)$:

$(G_1)$ $(\forall x \in \mathbb{R}^n,\ z = (z_1, z_2)^T \in \mathbb{R}^2)$
$\quad f(x) - z_1 = 0,\ g(x) - z_2 = 0,\ z_1 a + z_2 b \le c \ \Longrightarrow\ z^T \Theta z + \theta^T z - \gamma \ge 0;$
$(G_2)$ There exist $\alpha, \beta \in \mathbb{R}$ and $\mu \in \mathbb{R}^m_+$ such that, $\forall (z, x) \in \mathbb{R}^{n+2}$,
$\quad z^T \Theta z + \theta^T z - \gamma + \alpha(f(x) - z_1) + \beta(g(x) - z_2) + \mu^T(z_1 a + z_2 b - c) \ge 0.$

Our main result is a sufficient condition for $(G_1) \sim (G_2)$ as follows:

Theorem 1. Under the following assumptions
$$\exists\, \zeta, \eta \in \mathbb{R}:\ \zeta P + \eta Q \succ 0, \qquad (1)$$
$$\Theta = \begin{pmatrix} \theta_1 & \theta_2 \\ \theta_2 & \theta_3 \end{pmatrix} \succeq 0, \qquad (2)$$
we have $(G_1) \sim (G_2)$.


Our main interest in this paper is to apply Theorem 1 to optimize a special class of polynomials of degree 4,
$$\min_{x \in \mathbb{R}^n}\ G(x) = \theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + \theta_4 f(x) + \theta_5 g(x) \qquad (\mathrm{PoD4})$$

under condition (2). Then, we use the result from minimizing (PoD4) to solve a type of the Tikhonov regularization of the total least squares (TRTLS) proposed by Beck and Ben-Tal in [1]. The purpose of resolving (TRTLS) is to stabilize, via the Tikhonov regularization, the total least squares solution for fitting an overdetermined linear system $Ax = b$. It was formulated in [1] as follows. Given
the regularization matrix $L \in \mathbb{R}^{k \times n}$ and a penalty parameter $\rho > 0$, consider the following problem
$$\min_{E, r, x}\ \{\, \|E\|^2 + \|r\|^2 + \rho \|Lx\|^2 :\ (A + E)x = b + r \,\} \qquad (\mathrm{TRTLS})$$
where $E \in \mathbb{R}^{m \times n}$, $r \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$. Then, (TRTLS) can be transformed to the following sum-of-ratios problem:
$$\begin{aligned} \min_{E, r, x}\ \{\, \|E\|^2 + \|r\|^2 + \rho \|Lx\|^2 :\ (A + E)x = b + r \,\} &= \min_x \Big( \min_{E, r}\ \{\, \|E\|^2 + \|r\|^2 + \rho \|Lx\|^2 :\ (A + E)x = b + r \,\} \Big)\\ &= \min_{x \in \mathbb{R}^n}\ \frac{\|Ax - b\|^2}{\|x\|^2 + 1} + \rho \|Lx\|^2 \qquad (3)\end{aligned}$$
For $L = I$, Beck and Ben-Tal [1] used the Dinkelbach method [4] combined with a bisection search to solve (3). We show, in Sect. 3, that (3) can be resolved by solving two SDPs, with one SDP to obtain its optimal value and the other one for the optimal solution. There is no need for any bisection method.
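For reference, a small NumPy sketch of the reduced objective in (3), obtained after eliminating (E, r), is given below; it is an illustration only, with A, b, L and rho as placeholder data.

```python
import numpy as np

def trtls_objective(x, A, b, L, rho):
    """||Ax - b||^2 / (||x||^2 + 1) + rho * ||Lx||^2, cf. (3)."""
    r = A @ x - b
    return (r @ r) / (x @ x + 1.0) + rho * ((L @ x) @ (L @ x))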
The remainder of this study is organized as follows: In Sect. 2, we provide the
proof for Theorem 1 and solve Problem (PoD4). In Sect. 3, we use the Dinkelbach
method incorporating two SDP’s to solve (TRTLS) for the case L = I. Finally,
we have a short discussion in Sect. 4 for future extensions.

2 Proof for the New Version of the S-Lemma


The proof was done by using an important result by Polyak [8] that, under
Condition (1), the joint numerical range (f (x), g(x)) is a convex subset in R2 .
Proof. $(G_1) \Longrightarrow (G_2)$: By a result in [8, Theorem 2.2], the set
$$D_1 = \{(z_1, z_2) \mid f(x) - z_1 = 0,\ g(x) - z_2 = 0,\ x \in \mathbb{R}^n\} \subset \mathbb{R}^2 \qquad (4)$$
is convex. Let
$$D_2 = \{(z_1, z_2) \mid z_1 a + z_2 b \le c\}; \qquad (5)$$
it is easy to see that $D_2 \subset \mathbb{R}^2$ is also convex. Then, the statement $(G_1)$ can be recast as

$$(z_1, z_2) \in D_1 \cap D_2 \ \Rightarrow\ F(z) - \gamma = z^T \Theta z + \theta^T z - \gamma \ge 0.$$
Equivalently, it means that $(D_1 \cap D_2) \cap \{(z_1, z_2) \mid F(z_1, z_2) - \gamma < 0\} = \emptyset$. Due to Condition (2), namely $\Theta \succeq 0$, the set $\{(z_1, z_2) \mid F(z_1, z_2) - \gamma < 0\}$ is convex. Therefore, there exist $\bar\alpha, \bar\beta, \bar\gamma$ such that $\{(z_1, z_2) \mid \bar\alpha z_1 + \bar\beta z_2 + \bar\gamma = 0\}$ separates $D_1 \cap D_2$ and $\{(z_1, z_2) \mid F(z_1, z_2) - \gamma < 0\}$. Without loss of generality, we assume that
$$\bar\alpha z_1 + \bar\beta z_2 + \bar\gamma \ge 0,\quad \forall\, (z_1, z_2) \in D_1 \cap D_2, \qquad (6)$$
$$\bar\alpha z_1 + \bar\beta z_2 + \bar\gamma < 0,\quad \forall\, (z_1, z_2) \in \{(z_1, z_2) \mid F(z_1, z_2) - \gamma < 0\}. \qquad (7)$$
From (7), it follows that
$$\bar\alpha z_1 + \bar\beta z_2 + \bar\gamma \ge 0 \ \Rightarrow\ F(z_1, z_2) - \gamma \ge 0.$$
By the S-lemma, there exists $t \ge 0$ such that
$$F(z_1, z_2) - \gamma - t(\bar\alpha z_1 + \bar\beta z_2 + \bar\gamma) \ge 0,\quad \forall\, (z_1, z_2) \in \mathbb{R}^2. \qquad (8)$$
If $t = 0$, then with $\alpha = \beta = 0$, $\mu = 0$, $(G_2)$ holds. If $t > 0$, by (6), the system
$$t\bar\alpha z_1 + t\bar\beta z_2 + t\bar\gamma < 0,\quad z_1 a + z_2 b - c \le 0,\quad (z_1, z_2) \in D_1$$
is not solvable. By the Farkas theorem (see [9, Theorem 21.1], [10, Sect. 6.10], [6, Theorem 2.1]), there exists $\mu \in \mathbb{R}^m_+$ such that

$$t\bar\alpha z_1 + t\bar\beta z_2 + t\bar\gamma + \mu^T(z_1 a + z_2 b - c) \ge 0,\quad \forall\, (z_1, z_2) \in D_1.$$
Therefore, we have
$$t\bar\alpha f(x) + t\bar\beta g(x) + t\bar\gamma + \mu^T(f(x)\, a + g(x)\, b - c) \ge 0,\quad \forall\, x \in \mathbb{R}^n. \qquad (9)$$
Let $\alpha = \mu^T a + t\bar\alpha$, $\beta = \mu^T b + t\bar\beta$. Then,
$$\begin{aligned} (9) &\Leftrightarrow (\mu^T a + t\bar\alpha) f(x) + (\mu^T b + t\bar\beta) g(x) + t\bar\gamma - \mu^T c \ge 0\\ &\Leftrightarrow \alpha f(x) + \beta g(x) + (\mu^T a + t\bar\alpha - \alpha) z_1 + (\mu^T b + t\bar\beta - \beta) z_2 + t\bar\gamma - \mu^T c \ge 0\\ &\Leftrightarrow \alpha(f(x) - z_1) + \beta(g(x) - z_2) + \mu^T(z_1 a + z_2 b - c) \ge -t\bar\alpha z_1 - t\bar\beta z_2 - t\bar\gamma. \qquad (10)\end{aligned}$$

Combining (8) and (10), we get (G2 ).


(G2 ) =⇒ (G1 ): It is trivial. 

2.1 Optimizing a Class of Polynomials of Degree 4 (PoD4)


Applying Theorem 1, we can now solve the problem (PoD4) by solving the
SDP (11) below under the assumption that f, g satisfy Condition (1) whereas
θ1 , θ2 , θ3 ∈ R satisfy condition (2).
$$\begin{aligned} \min_{x \in \mathbb{R}^n} G(x) &= \theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + \theta_4 f(x) + \theta_5 g(x)\\ &= \min_{\{f(x) = z_1,\; g(x) = z_2\}} F(z_1, z_2)\\ &= \max\ \big\{\gamma : \{(z_1, z_2, x) \mid f(x) = z_1,\ g(x) = z_2,\ F(z_1, z_2) - \gamma < 0\} = \emptyset\big\}\\ &= \max\ \big\{\gamma : \{f(x) = z_1,\ g(x) = z_2\} \Rightarrow \{F(z_1, z_2) - \gamma \ge 0\}\big\}\\ &= \max_{\gamma, \alpha, \beta \in \mathbb{R}}\ \big\{\gamma : F(z_1, z_2) - \gamma + \alpha(f(x) - z_1) + \beta(g(x) - z_2) \ge 0\big\}\\ &= \max_{\gamma, \alpha, \beta \in \mathbb{R}}\ \left\{\gamma : \begin{pmatrix} \theta_1 & \theta_2 & [0] & \frac{\theta_4 - \alpha}{2}\\ \theta_2 & \theta_3 & [0] & \frac{\theta_5 - \beta}{2}\\ {[0]^T} & [0]^T & \alpha P + \beta Q & \alpha p + \beta q\\ \frac{\theta_4 - \alpha}{2} & \frac{\theta_5 - \beta}{2} & \alpha p^T + \beta q^T & \alpha p_0 + \beta q_0 - \gamma \end{pmatrix} \succeq 0 \right\} \qquad (11)\end{aligned}$$
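A hedged CVXPY sketch of the SDP (11) follows. It assumes $f(x) = x^T P x + 2p^T x + p_0$ and $g(x) = x^T Q x + 2q^T x + q_0$ with (P, p, p0, Q, q, q0) given as NumPy data and th = (theta1, ..., theta5) the coefficients of (PoD4); the LMI is assembled as a constant matrix plus scalar multiples of constant matrices. All names are illustrative, not the authors' implementation.

```python
import cvxpy as cp
import numpy as np

def pod4_sdp(P, p, p0, Q, q, q0, th):
    th1, th2, th3, th4, th5 = th
    n = P.shape[0]
    N = n + 3                                    # block order: z1, z2, x (size n), 1

    M0 = np.zeros((N, N))                        # constant part of the LMI in (11)
    M0[0, 0], M0[1, 1] = th1, th3
    M0[0, 1] = M0[1, 0] = th2
    M0[0, -1] = M0[-1, 0] = th4 / 2.0
    M0[1, -1] = M0[-1, 1] = th5 / 2.0

    Ma, Mb, Mg = np.zeros((N, N)), np.zeros((N, N)), np.zeros((N, N))
    Ma[0, -1] = Ma[-1, 0] = -0.5                 # the -alpha/2 entries
    Ma[2:-1, 2:-1], Ma[2:-1, -1], Ma[-1, 2:-1], Ma[-1, -1] = P, p, p, p0
    Mb[1, -1] = Mb[-1, 1] = -0.5                 # the -beta/2 entries
    Mb[2:-1, 2:-1], Mb[2:-1, -1], Mb[-1, 2:-1], Mb[-1, -1] = Q, q, q, q0
    Mg[-1, -1] = -1.0                            # the -gamma entry

    gamma, alpha, beta = cp.Variable(), cp.Variable(), cp.Variable()
    W = cp.Variable((N, N), symmetric=True)      # explicit symmetric container for the LMI
    cons = [W == alpha * Ma + beta * Mb + gamma * Mg + M0, W >> 0]
    prob = cp.Problem(cp.Maximize(gamma), cons)
    prob.solve()
    return prob.value                            # optimal value of (PoD4)
```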
3 Dinkelbach Method for Solving (TRTLSI)


It is interesting to see that problem (PoD4) allows us to solve the total least squares with Tikhonov identical regularization problem (see [1,16]) via solving two SDPs. Let us consider the following sum-of-quadratic-ratios problem.
$$\begin{aligned} \min_{x \in \mathbb{R}^n}\ \frac{\theta_1 f(x)^2 + \theta_4 f(x) + \theta}{g(x) + \gamma} + \theta_3 g(x) + 2\theta_2 f(x) &= \min_{x \in \mathbb{R}^n}\ \frac{\theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + (\theta_4 + 2\gamma\theta_2) f(x) + \gamma\theta_3 g(x) + \theta}{g(x) + \gamma}\\ &= \min_{x \in \mathbb{R}^n}\ \frac{h(x)}{l(x)} \qquad (12)\end{aligned}$$

where $f, g$ are quadratic functions and $\theta_1, \theta_2, \theta_3 \in \mathbb{R}$ satisfy the following condition:
$$\text{Matrix } \begin{pmatrix} \theta_1 & \theta_2 \\ \theta_2 & \theta_3 \end{pmatrix} \succeq 0,\quad Q \succ 0 \ \text{ and } \ \gamma > 0. \qquad (\mathrm{C2})$$
In fact, problem (12) covers the problem (TRTLSI) in [1,16] as a special case: with $\gamma = 1$, $\theta = 0$, $\theta_1 = 0$, $\theta_2 = 0$, $\theta_3 = \rho$, $\theta_4 = 1$, $f(x) = \|Ax - b\|^2$, $g(x) = \|x\|^2$, (12) reduces to (TRTLSI).
Notice that the form (12) is a single-ratio $h(x)/l(x)$ fractional programming problem. It can be solved by the well-known Dinkelbach method [4]. To this end, define
$$\pi(t) = \min_{x \in \mathbb{R}^n}\ \{h(x) - t\, l(x)\} = \min_{x \in \mathbb{R}^n}\ \{\theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + (\theta_4 + 2\gamma\theta_2) f(x) + (\gamma\theta_3 - t) g(x) + \theta - t\gamma\}.$$
It has been proved in [4] that $\pi(t)$ is strictly decreasing and
$$\min_{x \in \mathbb{R}^n}\ \frac{h(x)}{l(x)} = t^* \quad\text{if and only if}\quad \min_{x \in \mathbb{R}^n}\ \{h(x) - t^* l(x)\} = \pi(t^*) = 0. \qquad (13)$$

Since $\pi(t)$ is strictly decreasing, we conclude that $t^*$ is the maximum of all $t$ such that $\pi(t) \ge 0$. Then, we can recast (12) as
$$\begin{aligned} t^* &= \max_{t \in \mathbb{R}}\ \{t : \pi(t) \ge 0\} = \max_{t \in \mathbb{R}}\ \{t : \min_{x \in \mathbb{R}^n}\ (h(x) - t\, l(x)) \ge 0\}\\ &= \max_{t \in \mathbb{R}}\ \{t : h(x) - t\, l(x) \ge 0,\ \forall x \in \mathbb{R}^n\}\\ &= \max_{t \in \mathbb{R}}\ \{t : \theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + (\theta_4 + 2\gamma\theta_2) f(x) + (\gamma\theta_3 - t) g(x) + \theta - t\gamma \ge 0,\ \forall x \in \mathbb{R}^n\}\\ &= \max_{t \in \mathbb{R}}\ \{t : \theta_1 z_1^2 + 2\theta_2 z_1 z_2 + \theta_3 z_2^2 + (\theta_4 + 2\gamma\theta_2) z_1 + (\gamma\theta_3 - t) z_2 + \theta - t\gamma \ge 0,\ (z_1 = f(x),\ z_2 = g(x))\}\\ &= \max_{t, \alpha, \beta \in \mathbb{R}}\ \{t : \theta_1 z_1^2 + 2\theta_2 z_1 z_2 + \theta_3 z_2^2 + (\theta_4 + 2\gamma\theta_2) z_1 + (\gamma\theta_3 - t) z_2 + \theta - t\gamma + \alpha(f(x) - z_1) + \beta(g(x) - z_2) \ge 0\} \qquad (14)\end{aligned}$$
where the last equation (14) is due to Theorem 1 by re-defining the notations as $\theta_4 + 2\gamma\theta_2 := \theta_4$, $\gamma\theta_3 - t := \theta_5$, $\theta - t\gamma := -\gamma$. Moreover, we can write (14) as the following SDP:
$$t^* = \max_{t, \alpha, \beta \in \mathbb{R}}\ \left\{ t : \begin{pmatrix} \theta_1 & \theta_2 & [0] & \frac{\theta_4 + 2\gamma\theta_2 - \alpha}{2}\\ \theta_2 & \theta_3 & [0] & \frac{\gamma\theta_3 - t - \beta}{2}\\ {[0]^T} & [0]^T & \alpha P + \beta Q & \alpha p + \beta q\\ \frac{\theta_4 + 2\gamma\theta_2 - \alpha}{2} & \frac{\gamma\theta_3 - t - \beta}{2} & \alpha p^T + \beta q^T & \xi \end{pmatrix} \succeq 0 \right\}, \qquad (15)$$
where $\xi = \alpha p_0 + \beta q_0 + \theta - t\gamma$. In other words, the optimal value $t^*$ of (12), and thus the optimal value of the problem (TRTLSI), can be computed through solving the SDP (15).
After getting the optimal value $t^*$ of (12) from (15), by (13), we can find the corresponding optimal solution $x^*$ by solving the following problem
$$\min_{x \in \mathbb{R}^n}\ \{h(x) - t^* l(x)\} \qquad (16)$$
where $h(x) - t^* l(x) = \theta_1 f(x)^2 + 2\theta_2 f(x) g(x) + \theta_3 g(x)^2 + (\theta_4 + 2\gamma\theta_2) f(x) + (\gamma\theta_3 - t^*) g(x) + \theta - t^*\gamma$. Since (16) is a special form of (PoD4), we are able to get $x^*$ by solving another SDP similar to (11).

4 Discussion
In this paper, we propose a set of sufficient conditions (1)-(2) under which $(G_1) \sim (G_2)$. It can be easily verified that, when $m = 1$, $a = 1$, $b = c = \theta_1 = \dots = \theta_4 = \gamma = 0$, $\theta_5 = 1$, $(G_1) \sim (G_2)$ reduces to $(S_1) \sim (S_2)$ and we recover the classical S-lemma. Similarly, $(G_1) \sim (G_2)$ covers $(I_1) \sim (I_2)$ with $m = 2$, $a = (1, -1)^T$, $b = (0, 0)^T$, $c = (v_0, -u_0)^T$, $\theta_1 = \theta_2 = \theta_3 = \theta_4 = \gamma = 0$ and $\theta_5 = 1$. Moreover, if we further have $u_0 = v_0 = 0$, $(G_1) \sim (G_2)$ becomes $(E_1) \sim (E_2)$. In other words, if the sufficient conditions (1)-(2) can be removed, $(G_1) \sim (G_2)$ would be the most general result, summarizing all previous results on the S-lemma so far.

References
1. Beck, A., Ben-Tal, A.: On the solution of the Tikhonov regularization of the total
least squares problem. SIAM J. Optim. 17(1), 98–118 (2006)
2. Beck, A., Eldar, Y.C.: Strong duality in nonconvex quadratic optimization with
two quadratic constraint. SIAM J. Optim. 17(3), 844–860 (2006)
3. Derinkuyu, K., Pınar, M.Ç.: On the S-procedure and some variants. Math. Methods
Oper. Res. 64(1), 55–77 (2006)
4. Dinkelbach, W.: On nonlinear fractional programming. Manag. Sci. 13, 492–498
(1967)
5. Nguyen, V.B., Sheu, R.L., Xia, Y.: An SDP approach for quadratic fractional
problems with a two-sided quadratic constraint. Optim. Methods Softw. 31(4),
701–719 (2016)
6. Polik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49(3), 371–418 (2007)
7. Pong, T.K., Wolkowicz, H.: The generalized trust region subproblem. Comput.
Optim. Appl. 58, 273–322 (2014)
8. Polyak, B.T.: Convexity of quadratic transformations and its use in control and
optimization. J. Optim. Theory Appl. 99(3), 553–583 (1998)
9. Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970)
10. Stoer, J., Witzgall, C.: Convexity and Optimization in Finite Dimensions, vol. I.
Springer-Verlag, Heidelberg (1970)
11. Stern, R., Wolkowicz, H.: Indefinite trust region subproblems and nonsymmetric
eigenvalue perturbations. SIAM J. Optim. 5(2), 286–313 (1995)
12. Wang, S., Xia, Y.: Strong duality for generalized trust region subproblem: S-lemma
with interval bounds. Optim. Lett. 9(6), 1063–1073 (2015)
13. Xia, Y., Wang, S., Sheu, R.L.: S-lemma with equality and its applications. Math.
Program. Ser. A. 156(1), 513–547 (2016)
14. Tuy, H., Tuan, H.D.: Generalized S-lemma and strong duality in nonconvex
quadratic programming. J. Global Optim. 56, 1045–1072 (2013)
15. Yakubovich, V.A.: S-procedure in nonlinear control theory. Vestn. Leningr. Univ.
1, 62–77 (1971). (in Russian)
16. Yang, M., Yong, X., Wang, J., Peng, J.: Efficiently solving total least squares with
Tikhonov identical regularization. Comput. Optim. Appl. 70(2), 571–592 (2018)
Solving Mathematical Programs with
Complementarity Constraints with a
Penalization Approach

Lina Abdallah1(B) , Tangi Migot2 , and Mounir Haddou3


1 Lebanese University, Tripoli, Lebanon
lina abdallah@hotmail.fr
2 University of Guelph, Guelph, ON, Canada
tmigot@uogelph.ca
3 INSA-IRMAR, UMR-CNRS 6625, Rennes, France
mounir.haddou@insa-rennes.fr

Abstract. In this paper, we consider mathematical problems with complementarity constraints. To solve them, we propose a penalization approach based on concave and nondecreasing functions. We give the link between the penalized problem and our original problem. This approach was already used in [3]; the main difference is that we do not use any constraint qualification assumption. Some numerical results are presented to show the validity of this approach.

Keywords: Constrained optimization · Nonlinear programming · MPCC · Penalty function

1 Introduction
The Mathematical Program with Equilibrium Constraints (MPEC) is a con-
strained optimization problem in which the constraints include equilibrium con-
straints, such as variational inequalities or complementarity conditions. In this
paper, we consider a special case of MPEC, the Mathematical Program with
Complementarity Constraints (MPCC) in which the equilibrium constraints are
complementarity constraints. MPCC is an important class of problems since they
arise frequently in applications in engineering design, in economic equilibrium
and in multilevel games [18]. One main source of MPCC comes from bilevel
programming problems, which have numerous applications in practice [27].
A way to solve a standard nonlinear programming problem is to solve its Karush-Kuhn-Tucker (KKT) system by using numerical methods such as Newton-type methods. However, the classical MFCQ that is very often used to guarantee convergence of algorithms is violated at any feasible point when the MPCC is treated as a standard nonlinear programming problem; hence, a local minimizer of the MPCC may not be a solution of the classical KKT system. This is
partly due to the geometry of the complementarity constraint that always has
an empty relative interior.
A wide range of numerical methods have been proposed to solve this prob-
lem such as relaxation methods [5,6], interior-point methods [16,20,25], penalty
methods [10,18,24], SQP methods [8], dc methods [23], filter methods [15] and
Levenberg-Marquardt methods [14].
In this study, following [3], we study a penalization method to solve the
MPCC. We regularize the complementarity constraints by using concave and
nondecreasing functions introduced in [9], and then penalize the constraints.
This approach allows us to consider the regularization parameter as a variable of
the problem. We prove that every cluster point of the KKT points of the penalty
problem gives a local minimum for the MPCC. We improve the result from [3]
by proving a convergence theorem without any constraint qualification, thus
removing a restrictive assumption. Numerical tests on some randomly generated
problems are studied to show the efficiency and robustness of this approach.
This paper is organized as follows. In Sect. 2, we present some preliminaries on the smoothing functions and our problem formulation. We present our penalty method and give the link between the penalized problem and the original problem in Sect. 3. The last section presents a set of numerical experiments concerning a simple number partitioning problem.

2 Preliminaries
In this section, we present some preliminaries concerning the regularization and approximation process. We consider the following problem
$$(P)\qquad \begin{array}{rl} f^* = \min & f(x, y)\\ & \langle x, y\rangle = 0,\\ & (x, y) \in D,\end{array}$$
where $f : \mathbb{R}^{2n} \to \mathbb{R}$ is continuously differentiable, $D = [0, v]^{2n}$, and $\langle\cdot,\cdot\rangle$ denotes the inner product on $\mathbb{R}^n$. We made this choice of $D$ only to simplify the exposition; we can consider any bounded set $D$.
Now, we reformulate this problem by using a smoothing technique. This technique has been studied in the context of complementarity problems [1,2] and uses a family of non-decreasing, continuously differentiable and concave functions $\theta$ that satisfy
$$\theta(0) = 0 \quad\text{and}\quad \lim_{t \to +\infty} \theta(t) = 1.$$
One generic way to build such functions is to consider non-increasing probability density functions $f : \mathbb{R}_+ \to \mathbb{R}_+$ and then take the corresponding cumulative distribution functions
$$\forall t \ge 0,\quad \theta(t) = \int_0^t f(x)\, dx.$$
By definition of $f$,
$$\lim_{t \to +\infty} \theta(t) = \int_0^{+\infty} f(x)\, dx = 1 \quad\text{and}\quad \theta(0) = \int_0^0 f(x)\, dx = 0.$$
The hypothesis on $f$ gives the concavity of $\theta$. We introduce $\theta_\varepsilon(t) := \theta(\frac{t}{\varepsilon})$ for $\varepsilon > 0$. This definition is similar to the perspective functions in convex analysis. These functions satisfy
$$\theta_\varepsilon(0) = 0, \quad\text{and}\quad \forall t > 0,\ \lim_{\varepsilon \searrow 0} \theta_\varepsilon(t) = 1.$$

Some interesting examples of this family for $t \ge 0$ are: $\theta_\varepsilon^1(t) = \dfrac{t}{t + \varepsilon}$, $\theta_\varepsilon^2(t) = 1 - e^{-t/\varepsilon}$, or $\theta_\varepsilon^3(t) = \dfrac{\log(1 + t)}{\log(1 + t + \varepsilon)}$.
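A small Python sketch of these three smoothing functions and of the regularized complementarity constraint is given below; it is purely illustrative, with `eps` playing the role of ε.

```python
import numpy as np

def theta1(t, eps):                 # t / (t + eps)
    return t / (t + eps)

def theta2(t, eps):                 # 1 - exp(-t / eps)
    return 1.0 - np.exp(-t / eps)

def theta3(t, eps):                 # log(1 + t) / log(1 + t + eps)
    return np.log1p(t) / np.log1p(t + eps)

eps = 1e-2
for th in (theta1, theta2, theta3):
    print(th(0.0, eps), th(1e3, eps))               # theta_eps(0) = 0, theta_eps(t) close to 1 for large t

x, y = 0.0, 3.7                                     # complementary pair: x * y = 0
print(theta1(x, eps) + theta1(y, eps) <= 1.0)       # the pair satisfies the regularized constraint
```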
Lemma 1. $\forall x \in [0, v]$, $\forall \varepsilon \in (0, \varepsilon_0]$, there exists $m > 0$ such that $|\partial_\varepsilon \theta_\varepsilon(x)| \le \dfrac{m}{\varepsilon^2}$.

Proof. Since $\theta_\varepsilon(x) := \theta(\frac{x}{\varepsilon})$, we have $\partial_\varepsilon \theta_\varepsilon(x) = -\frac{x}{\varepsilon^2}\, \theta'(\frac{x}{\varepsilon})$. Now, by the concavity of $\theta$, for $x \ge 0$ we have $0 \le \theta'(\frac{x}{\varepsilon}) \le \theta'(0)$. Then $-\frac{m}{\varepsilon^2} \le -\frac{x}{\varepsilon^2}\, \theta'(\frac{x}{\varepsilon}) \le 0$ with $m = x\, \theta'(0)$. $\square$

Using the $\theta_\varepsilon$ functions, we regularize each complementarity constraint $x_i y_i = 0$ by
$$\theta_\varepsilon(x_i) + \theta_\varepsilon(y_i) \le 1 \quad (i = 1, \dots, n).$$
We then transform these inequality constraints into equality constraints by introducing slack variables $e$:
$$(G_\varepsilon(x, y, e))_i := \theta_\varepsilon(x_i) + \theta_\varepsilon(y_i) + e_i - 1 = 0.$$

The formulation of our approach can be written as follows
$$(P_\varepsilon)\qquad \begin{array}{rl} \min & f(x, y)\\ & G_\varepsilon(x, y, e) = 0,\\ & (x, y, e) \in D',\end{array}$$
where $D' = [0, v]^{2n} \times [0, 1]^n$. The limit problem of $(P_\varepsilon)$ for $\varepsilon = 0$ is denoted $(P_0)$. Moreover, $(P_0)$ is equivalent to $(P)$, see [2, Lemma 2.1].

3 A Penalization Approach

In this section, we consider a penalization of $(P_\varepsilon)$ through the following penalization function $f_\sigma : D' \times [0, \bar\varepsilon] \to \mathbb{R}$,
$$f_\sigma(z, \varepsilon) := \begin{cases} f(x, y) & \text{if } \varepsilon = \Delta(z, \varepsilon) = 0,\\[2pt] f(x, y) + \dfrac{1}{2\varepsilon}\, \Delta(z, \varepsilon) + \sigma \beta(\varepsilon) & \text{if } \varepsilon > 0,\\[2pt] +\infty & \text{if } \varepsilon = 0 \text{ and } \Delta(z, \varepsilon) \ne 0,\end{cases}$$
where $z := (x, y, e)$ and the feasibility violation $\Delta$ is defined by
$$\Delta(z, \varepsilon) := \|G_\varepsilon(z)\|^2.$$
The function $\beta : [0, \bar\varepsilon] \to [0, \infty)$ is continuously differentiable on $(0, \bar\varepsilon]$ and $\beta(0) = 0$ ($\bar\varepsilon$ is fixed). This function was introduced in [11] in the smooth case.

Remark 1. $\forall z \in D'$, $\Delta(z, 0) = 0 \Leftrightarrow z$ is feasible for $(P_0) \Leftrightarrow z$ is feasible for $(P)$.




Then, we consider the following penalized problem:
$$(P_\sigma)\qquad \begin{array}{rl} \min & f_\sigma(z, \varepsilon)\\ & (z, \varepsilon) \in D' \times [0, \bar\varepsilon].\end{array}$$

The term $\sigma\beta(\varepsilon)$ allows us to consider $\varepsilon$ as a new optimization variable and to minimize simultaneously over $z$ and $\varepsilon$. Let us now recall the definition of the Mangasarian-Fromovitz constraint qualification (MFCQ).
Definition 1. [21] We say that the Mangasarian-Fromovitz condition (MFCQ) for $(P_\varepsilon)$ holds at $z \in D'$ if $G'_\varepsilon(z)$ has full rank and there exists a vector $p \in \mathbb{R}^{3n}$ such that $G'_\varepsilon(z)\, p = 0$, where
$$p_i\ \begin{cases} > 0 & \text{if } z_i = 0,\\ < 0 & \text{if } z_i = w_i,\end{cases} \qquad (1)$$
with
$$w_i = \begin{cases} v & \text{if } i \in \{1, \dots, 2n\},\\ 1 & \text{if } i \in \{2n + 1, \dots, 3n\}.\end{cases}$$
The following lemma proves that MFCQ is satisfied whenever $\varepsilon > 0$. This is a great improvement with respect to [3], where this was a crucial assumption.

Lemma 2. Let $\varepsilon > 0$. Any $z$ feasible for $(P_\varepsilon)$ verifies MFCQ.

Proof. (i) Let $z$ be a feasible point of $(P_\varepsilon)$. The matrix $G'_\varepsilon(z) \in \mathbb{R}^{n \times 3n}$, defined as
$$G'_\varepsilon(z) = \begin{pmatrix} \theta'_\varepsilon(x_1) & 0 & \cdots & 0 & \theta'_\varepsilon(y_1) & 0 & \cdots & 0 & 1 & 0 & \cdots & 0\\ 0 & \theta'_\varepsilon(x_2) & \cdots & 0 & 0 & \theta'_\varepsilon(y_2) & \cdots & 0 & 0 & 1 & \cdots & 0\\ \vdots & & \ddots & & \vdots & & \ddots & & \vdots & & \ddots & \\ 0 & \cdots & 0 & \theta'_\varepsilon(x_n) & 0 & \cdots & 0 & \theta'_\varepsilon(y_n) & 0 & \cdots & 0 & 1\end{pmatrix},$$
is of full rank.

(ii) We have to prove that there exists $p \in \mathbb{R}^{3n}$ such that $G'_\varepsilon(z)\, p = 0$ and $p_i$ verifies (1). Let $p = (p_1, \dots, p_{3n})$. $G'_\varepsilon(z)\, p = 0$ implies that
$$\theta'_\varepsilon(x_i)\, p_i + \theta'_\varepsilon(y_i)\, p_{n+i} + p_{2n+i} = 0, \quad \text{for } i = 1, \dots, n. \qquad (2)$$
The equality constraints $G_\varepsilon(z) = 0$ give $\theta_\varepsilon(x_j) + \theta_\varepsilon(y_j) + e_j - 1 = 0$, for $j = 1, \dots, n$. So, we consider three cases:
1. $e_j = 1$,
2. $e_j = 0$,
3. $0 < e_j < 1$.

(1) For the first case, $e_j = 1$, so $x_j = y_j = 0$, since $\theta_\varepsilon(0) = 0$. Replacing in (2), we obtain
$$\theta'_\varepsilon(0)\,(p_j + p_{n+j}) = -p_{2n+j}.$$
We can take $p_j = p_{n+j} = 1 > 0$ and $p_{2n+j} = -2\theta'_\varepsilon(0) < 0$ (since $\theta'_\varepsilon(0) > 0$). So, MFCQ is verified in this case.
(2) For the second case, $e_j = 0$, we have to consider $x_j \ne 0, y_j \ne 0$; $x_j = 0, y_j \ne 0$; or $x_j \ne 0, y_j = 0$, since $\theta_\varepsilon(x_j) \ne 0$ when $x_j \ne 0$. In Table 1, "no" means that we do not have any constraint on $p_j$ or $p_{n+j}$.
(i) Taking $p_j = -1$, $p_{n+j} = -1$ in (2), we obtain $p_{2n+j} = \theta'_\varepsilon(v) + \theta'_\varepsilon(v) > 0$ (since $\theta'_\varepsilon(v) > 0$), and MFCQ is verified.
(2i) There is no constraint on $p_j$, $p_{n+j}$; only $p_{2n+j}$ should be positive. So, MFCQ is verified.
(3i) Taking $p_j = -1$, $p_{2n+j} = 1$ in (2), we get $p_{n+j} = \dfrac{\theta'_\varepsilon(v) - 1}{\theta'_\varepsilon(y_j)}$; since $p_{n+j}$ is unconstrained in this case, MFCQ is verified.
(4i) Taking $p_j = 1$, $p_{2n+j} = 1$ in (2), we get $p_{n+j} = -\dfrac{\theta'_\varepsilon(0) + 1}{\theta'_\varepsilon(v)} < 0$. So, MFCQ is verified.
(5i, 6i, 7i, 8i) As above, it is easy to see that MFCQ is verified.
(3) In the third case, $0 < e_j < 1$, we can consider the same cases as in (2), but additionally here there is no constraint on $p_{2n+j}$.
For all these cases, the condition MFCQ is verified. $\square$


The following theorem yields a condition to find a solution for (Pσ ). It also proves
a direct link to (P ) (Table 1).
Theorem 1. We suppose that
$$\beta'(\varepsilon) \ge \beta_1 > 0 \quad \text{for } 0 < \varepsilon < \bar\varepsilon.$$
Let $(z^k, \varepsilon_k)$ be a KKT point of $P_{\sigma_k}$ corresponding to $\sigma = \sigma_k$ with $\sigma_k \uparrow \infty$ as $k \to \infty$, and let $(z^*, \varepsilon^*)$ be a cluster point of $\{(z^k, \varepsilon_k)\}$ with finite $f_\sigma(z^*, \varepsilon^*)$. Then $\varepsilon^* = 0$ and $(x^*, y^*)$ is a local minimum of the MPCC problem.
Proof. Let $(z, \varepsilon)$ be a Kuhn-Tucker point of $P_\sigma$ with $\varepsilon > 0$; then there exist $\lambda$ and $\mu \in \mathbb{R}^{3n+1}$ such that:
$$\begin{array}{ll} \text{(i)} & \nabla f_\sigma(z, \varepsilon) = \lambda - \mu,\\ \text{(ii)} & \min(\lambda_i, z_i) = \min(\mu_i, w_i - z_i) = 0, \quad i = 1, \dots, 3n, \qquad (3)\\ \text{(iii)} & \lambda_{3n+1} = \min(\mu_{3n+1}, \bar\varepsilon - \varepsilon) = 0,\end{array}$$
Table 1. Case e_j = 0

        x_j           y_j           p_j   p_{n+j}   p_{2n+j}
 (i)    v             v             < 0   < 0       > 0
 (ii)   0 < x_j < v   0 < y_j < v   no    no        > 0
 (3i)   v             0 < y_j < v   < 0   no        > 0
 (4i)   0             v             > 0   < 0       > 0
 (5i)   0 < x_j < v   v             no    < 0       > 0
 (6i)   v             0             < 0   > 0       > 0
 (7i)   0 < x_j < v   0             no    > 0       > 0
 (8i)   0             0 < y_j < v   > 0   no        > 0

where $\nabla f_\sigma$ is the gradient of $f_\sigma$ with respect to $(z, \varepsilon)$. Let $(z^k, \varepsilon_k)$ be a sequence of KKT points of $P_{\sigma_k}$ with $\varepsilon_k \ne 0$, $\forall k$, and $\lim_{k \to +\infty} \sigma_k = +\infty$. Since $D'$ is compact, it holds (up to a subsequence) that
$$\lim_{k \to +\infty} \varepsilon_k = \varepsilon^* \quad\text{and}\quad \lim_{k \to +\infty} z^k = z^*.$$
Conditions (3.i) and (3.iii) yield $\partial_\varepsilon f_{\sigma_k}(z^k, \varepsilon_k) = -\mu_{3n+1} \le 0$. Then, if we denote $\Delta_k = \Delta(z^k, \varepsilon_k)$, we have
$$\partial_\varepsilon f_{\sigma_k} = -\frac{1}{2\varepsilon_k^2}\, \Delta_k + \frac{1}{2\varepsilon_k}\, \partial_\varepsilon \Delta_k + \sigma_k \beta'(\varepsilon_k).$$
Multiplying by $2\varepsilon_k^3$, and using $\partial_\varepsilon f_{\sigma_k} \le 0$, we obtain
$$\varepsilon_k^2\, \partial_\varepsilon \Delta_k + 2\varepsilon_k^3 \sigma_k \beta'(\varepsilon_k) \le \varepsilon_k \Delta_k.$$
Then $\beta'(\varepsilon) \ge \beta_1 > 0$ yields
$$\varepsilon_k^2\, \partial_\varepsilon \Delta_k + 2\varepsilon_k^3 \sigma_k \beta_1 \le \varepsilon_k \Delta_k.$$
Since $\Delta_k$, $\theta_\varepsilon$ and $\varepsilon^2 \partial_\varepsilon \theta_\varepsilon$ are bounded (by Lemma 1) and $\sigma_k \to \infty$ as $k \to \infty$, we have $\varepsilon^* = 0$.

Let $V$ be a neighborhood of $(z^*, 0)$. For any $z$ feasible for $(P_0)$ such that $(z, 0) \in V$, we have
$$f_\sigma(z^*, 0) \le f_\sigma(z, 0) = f(x, y) < +\infty \qquad (4)$$
as $\Delta(z, 0) = 0$. Since $f_\sigma(z^*, 0)$ is finite, it follows that $\Delta(z^*, 0) = 0$. So, $\langle x^*, y^*\rangle = 0$, and therefore $(x^*, y^*)$ is a feasible point of $(P)$. Thus, (4) gives $f(x^*, y^*) = f_\sigma(z^*, 0) \le f_\sigma(z, 0) = f(x, y)$. Therefore, $(x^*, y^*)$ is a local minimum of the MPCC problem. $\square$
4 Numerical Results
Thanks to Theorem 1 and to driving $\sigma$ to infinity in a smart way, we also improved the numerical results with respect to [3]. We consider some generated partitioning problems that can be cast as an MPCC. These simulations have been done using the AMPL language [4] with the SNOPT solver [13]. In all our tests, we use the same function $\beta$, defined by $\beta(\varepsilon) := \sqrt{\varepsilon}$ [11].
Partitioning problem. We now describe the formulation of this partitioning problem. We consider a set of numbers $S = \{s_1, s_2, s_3, \dots, s_n\}$. The goal is to divide $S$ into two subsets such that the subset sums are as close to each other as possible. Let $x_j = 1$ if $s_j$ is assigned to subset 1, and $0$ otherwise. Let $\mathrm{sum}_1 = \sum_{j=1}^n s_j x_j$ and $\mathrm{sum}_2 = \sum_{j=1}^n s_j - \sum_{j=1}^n s_j x_j$. The difference of the sums is
$$\mathrm{diff} = \mathrm{sum}_2 - \mathrm{sum}_1 = c - 2\sum_{j=1}^n s_j x_j, \qquad \Big(c = \sum_{j=1}^n s_j\Big).$$

We minimize the square of diff,
$$\mathrm{diff}^2 = \Big(c - 2\sum_{j=1}^n s_j x_j\Big)^2.$$
Note that $\mathrm{diff}^2$ can be written as follows
$$\mathrm{diff}^2 = c^2 + 4 x^T Q x,$$
where
$$q_{ii} = s_i (s_i - c), \qquad q_{ij} = s_i s_j.$$
Dropping the additive and multiplicative constants, we obtain the following optimization problem
$$(\mathrm{UQP})\qquad \begin{array}{rl} \min & x^T Q x\\ & x \in \{0, 1\}^n.\end{array}$$
This formulation can be written as an MPCC problem
$$(\mathrm{UQP})\qquad \begin{array}{rl} \min & x^T Q x\\ & x \cdot (1 - x) = 0.\end{array}$$
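The following NumPy snippet is a purely illustrative check of this construction: it builds Q for a random instance and verifies diff² = c² + 4xᵀQx for a binary assignment x (the identity holds because x_i² = x_i when x is binary).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 25
s = rng.uniform(50, 100, n)                  # the numbers to be partitioned
c = s.sum()

Q = np.outer(s, s)                           # q_ij = s_i * s_j for i != j
np.fill_diagonal(Q, s * (s - c))             # q_ii = s_i * (s_i - c)

x = rng.integers(0, 2, n).astype(float)      # a binary assignment
diff = c - 2.0 * s @ x
print(np.isclose(diff ** 2, c ** 2 + 4.0 * x @ Q @ x))   # True for binary x
```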

To get some local solutions of (UQP), we use the approach described in the previous section. We generated various random problems of size
$$n = 25,\ 50,\ 100,\ 150,\ 200,\ 250,\ 300,$$
with the elements drawn randomly from the interval (50, 100).


Table 2 summarizes the information concerning the computational effort of the SNOPT solver, using the $\theta^1$, $\theta^2$ and $\theta^3$ functions respectively; we fix $\varepsilon = 1$. For each problem, we used 100 different initial points generated randomly from the interval $[0, 1]$.

– Best Sum Diff: the best value of $\big|\sum_{i=1}^{100} (Q * \mathrm{round}(x[i])) - 0.5\, c\big|$,
– Integrality measure: $\max_i |\mathrm{round}(x_i) - x_i|$,
– nb: the number of tests such that the best sum is attained,
– nb$_{10}$: the number of tests such that the sum satisfies
$$\Big|\sum_{i=1}^{100} (Q * \mathrm{round}(x[i])) - 0.5\, c\Big| \le 10.$$

Table 2. Results on the partitioning problem using (θ1, θ2, θ3)

  n   | Best Sum Diff   | nb        | nb10         | Integrality measure
  25  | (0, 0, 0)       | (6, 6, 4) | (25, 28, 22) | (0, 0, 0)
  50  | (0.5, 0.5, 0.5) | (0, 0, 0) | (18, 23, 13) | (0, 9.190e-23, 1.810e-10)
  100 | (0.5, 0.5, 0.5) | (0, 0, 0) | (19, 23, 23) | (0, 0, 0)
  150 | (0, 0, 0)       | (1, 1, 2) | (13, 9, 10)  | (0, 0, 0)
  200 | (0.5, 0.5, 0.5) | (0, 0, 0) | (21, 23, 17) | (1.640e-16, 4.070e-11, 0)
  250 | (0, 0, 0.5)     | (1, 4, 0) | (22, 26, 22) | (1.122e-7, 2.220e-16, 0)
  300 | (0, 0, 0)       | (4, 4, 2) | (17, 19, 22) | (0, 0, 1.312e-9)

With our formulation, we obtain local solutions of (UQP). We remark that the three functions θ1, θ2 and θ3 give the same result in almost all the considered test problems. The results obtained for random problems validate our approach.

5 Conclusion
In this paper, we present a penalty approach to solve mathematical programs with complementarity constraints. Under some mild hypotheses and without any constraint qualification assumption, we prove the link between the penalized problem and the MPCC. Tests on some randomly generated partitioning problems give very promising results and validate the approach.
References
1. Abdallah, L., Haddou, M., Migot, T.: A sub-additive dc approach to the comple-
mentarity problem. Comput. Optim. Appl. 1–26 (2019)
2. Abdallah, L., Haddou, M., Migot, T.: Solving absolute value equation using comple-
mentarity and smoothing functions. J. Comput. Appl. Math. 327, 196–207 (2018)
3. Abdallah, L., Haddou, M.: An exact penalty approach for mathematical programs
with equilibrium constraints. J. Adv. Math. 9, 2946–2955 (2014)
4. AMPL. http://www.ampl.com
5. Dussault, J.P., Haddou, M., Kadrani, A., Migot, T.: How to Compute a M-
Stationary Point of the MPCC. Optimization online.org (2017)
6. Dussault, J.P., Haddou, M., Migot, T.: The New Butterfly Relaxation Method
for Mathematical Programs with Complementarity Constraints. Optimization
online.org (2016)
7. Facchinei, F., Pang, J.-S.: Finite-Dimensional Variational Inequalities and Com-
plementarity Problems, vol. I and II. Springer, New York (2003)
8. Fletcher, R., Leyffer, S., Ralph, D., Scholtes, S.: Local convergence of SQP methods
for mathematical programs with equilibrium constraints. SIAM J. Optim. 17(1),
259–286 (2006)
9. Haddou, M.: A new class of smoothing methods for mathematical programs with
equilibrium constraints. Pac. J. Optim. 5, 87–95 (2009)
10. Hu, X.M., Ralph, D.: Convergence of a penalty method for mathematical program-
ming with complementarity constraints. J. Optim. Theory Appl. 123, 365–390
(2004)
11. Huyer, W., Neumaier, A.: A new exact penalty function. SIAM J. Optim. 13(4),
1141–1158 (2003)
12. Facchinei, F., Jiang, H., Qi, L.: A smoothing method for mathematical programs
with equilibrium constraints. Math. Program. 85, 81–106 (1995)
13. Gill, P., Murray, W., Saunders, M.: SNOPT, a large-scale smooth optimization
problems having linear or nonlinear objectives and constraints. http://www-neos.
mcs.anl.gov/neos/solvers
14. Guo, L., Lin, G.H., Ye, J.J.: Solving mathematical programs with equilibrium
constraints. J. Optim. Theory Appl. 166, 234–256 (2015)
15. Leyffer, S., Munson, T.S.: A globally convergent filter method for MPECs, April
2009. ANL/MCS-P1457-0907
16. Leyffer, S., López-Calva, G., Nocedal, J.: Interior methods for mathematical pro-
grams with complementarity constraints. SIAM J. Optim. 17(1), 52–77 (2006)
17. Lin, G.H., Fukushima, M.: Some exact penalty results for nonlinear programs and
mathematical programs with equilibrium constraints. J. Optim. Theory Appl. 118,
67–80 (2003)
18. Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Con-
straints. Cambridge University Press, Cambridge, UK (1996)
19. Liu, G., Ye, J., Zhu, J.: Partial exact penalty for mathematical programs with
equilibrium constraints. J. Set-Valued Anal. 16, 785–804 (2008)
20. Liu, X., Sun, J.: Generalized stationary points and an interior-point method for
mathematical programs with equilibrium constraints. Math. Program. 101(1),
231–261 (2004)
21. Mangasarian, O.L., Fromovitz, S.: The Fritz John necessary optimality conditions
in the presence of equality and inequality constraints. J. Math. Anal. Appl. 17,
37–47 (1967)
22. Mangasarian, O.L., Pang, J.S.: Exact penalty functions for mathematical programs
with linear complementary constraints. J. Glob. Optim. 5 (1994)
23. Marechal, M., Correa, R.: A DC (Difference of Convex functions) Approach of the
MPECs. Optimization online.org (2014)
24. Monteiro, M.T.T., Meira, J.F.P.: A penalty method and a regularization strategy
to solve MPCC. Int. J. Comput. Math. 88(1), 145–149 (2011)
25. Raghunathan, A.U., Biegler, L.T.: An interior point method for mathematical
programs with complementarity constraints (MPCCs). SIAM J. Optim. 15(3),
720–750 (2005)
26. Ralph, D., Wright, S.J.: Some Properties of Regularization and Penalization
Schemes for MPECs. Springer, New York (2000)
27. Ye, J.J., Zhu, D.L., Zhu, Q.J.: Exact penalization and necessary optimality condi-
tions for generalized bilevel programming problems. SIAM J. Optim. 7, 481–507
(1997)
Stochastic Tunneling for Improving the
Efficiency of Stochastic Efficient Global
Optimization

Fábio Nascentes1,2, Rafael Holdorf Lopez1(B), Rubens Sampaio3, and Eduardo Souza de Cursi4
1 Center for Optimization and Reliability in Engineering (CORE), Universidade Federal de Santa Catarina, Florianópolis 88037-000, Brazil
rafael.holdorf@ufsc.br
2 Departamento de Áreas Acadêmicas, Instituto Federal de Educação, Ciência e Tecnologia de Goiás-IFG, Jataí 75804-714, Brazil
fabiotrabmat@gmail.com
3 Departamento de Engenharia Mecânica, PUC-Rio, Rio de Janeiro 22453-900, Brazil
4 Department Mecanique, Institut National des Sciences Appliquees (INSA) de Rouen, Saint Etienne du Rouvray Cedex 76801, France

Abstract. This paper proposes the use of a normalization scheme for


increasing the performance of the recently developed Adaptive Target
Variance Stochastic Efficient Global Optimization (sEGO) method. Such
a method is designed for the minimization of functions that depend on
expensive to evaluate and high dimensional integrals. The results showed
that the use of the normalization in the sEGO method yielded very
promising results for the minimization of integrals. Indeed, it was able
to obtain more precise results, while requiring only a fraction of the
computational budget of the original version of the algorithm.

Keywords: Stochastic efficient global optimization ·


Stochastic tunneling · Global optimization · Robust design

1 Introduction
The optimization of a variety of engineering problems may require the minimization (or maximization) of expensive to evaluate and high dimensional integrals. These problems become more challenging if the resulting objective function turns out to be nonconvex and multimodal. Examples of this kind may arise, for example, from the maximization of the expected performance of a mechanical system, vastly applied in robust design [10], the multidimensional integral of Performance Based Design Optimization [2], or the double integral of Optimal Design of Experiment problems [3].
A powerful approach to handle these issues is the Efficient Global Optimiza-
tion (EGO) [9], which exploits the information provided by the Kriging meta-
model to iteratively add new points, improving the surrogate accuracy and at
the same time seeking its global minimum. For problems presenting variability
(or uncertainty), the Stochastic Kriging (SK) [1] was developed. The use of SK
within the EGO framework, or stochastic Efficient Global Optimization (sEGO),
is relatively recent. For example, [11] benchmarked different infill criteria for the
noisy case, while [8] compared Kriging-based methods in heterogeneous noise
situations.
Recently, an Adaptive Variance Target sEGO [4] approach was proposed for the minimization of integrals. It employs Monte Carlo Integration (MCI) to approximate the objective function and includes the variance of the error in the integration into the SK framework. This variance of the error is adaptively managed by the method, providing an efficient optimization process by rationally spending the available computational budget. This method reached promising results, especially in high dimensional problems [4].
This paper, thus, aims at enhancing the performance of the Adaptive Variance Target sEGO [4] by proposing the use of a normalization scheme during the optimization process. This normalization is the result of the so-called stochastic tunneling approach, applied together with Simulated Annealing (SA) for the global minimization of complex potential energy landscapes [12]. In the SA context, the physical idea behind the stochastic tunneling method is to allow the particle to "tunnel" through high-energy regions of the domain, once it is realized that they are not relevant for the low-energy properties of the problem. In the sEGO context, it is expected that this normalization reduces the variability level of the regions of the design domain that have high values of the objective function, as well as the dependency of the quality of the search on the parameters of the SK.
The rest of the paper is organized as follows: Sect. 2 presents the problem
statement. The Adaptive Variance Target sEGO is presented in Sect. 3, together
with the proposed normalization scheme. Numerical examples are studied in
Sect. 6 to show the efficiency and robustness of the normalization. Finally, the
main conclusions are listed in Sect. 7.

2 Problem Statement

The goal of this paper is to solve the problem of minimizing a function $y$ which depends on an integral, as in
$$\min_{d \in S}\ y(d) = \int_\Omega \phi(d, x)\, w(x)\, dx, \qquad (1)$$
where $d \in \mathbb{R}^n$ is the design vector, $x \in \mathbb{R}^{n_x}$ is the parameter vector, $\phi : \mathbb{R}^n \times \mathbb{R}^{n_x} \to \mathbb{R}$ is a known function, $S$ is the design domain, $w(x)$ is some known weight function (e.g. a probability distribution) and $\Omega \subseteq \mathbb{R}^{n_x}$ is the integration domain (e.g. the support of the probability distribution). We also assume here that the design domain $S$ considers only box constraints. Here, we are interested in situations in which $\phi$ is a black-box function and is computationally demanding,
while the resulting objective function $y$ is not convex and multimodal. Applying MCI to estimate $y$, we have
$$y(d) \approx \bar y(d) = \frac{1}{n_r} \sum_{i=1}^{n_r} \phi(d, x^{(i)}), \qquad (2)$$
where $n_r$ is the sample size and the $x^{(i)}$ are sample points randomly drawn from the distribution $w(x)$. One of the advantages of MCI is that we are able to estimate the variance of the error of the approximation as
$$\sigma_\epsilon^2(d) = \frac{1}{n_r (n_r - 1)} \sum_{i=1}^{n_r} \big(\phi_i - \bar y(d)\big)^2, \qquad (3)$$
where $\phi_i = \phi(d, x^{(i)})$. Thus, by increasing the sample size $n_r$ (i.e. the number of replications), the variance estimate decreases and the approximation in Eq. (2) gets closer to the exact value of Eq. (1).
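A minimal NumPy sketch of the estimates (2)-(3) is given below; it is illustrative only, with `phi` and `sample_x` as placeholders for the black-box function and the sampler of w(x).

```python
import numpy as np

def mci_estimate(phi, d, sample_x, n_r, rng):
    """Monte Carlo estimate of y(d) and of the variance of its error, cf. (2)-(3)."""
    xs = sample_x(n_r, rng)                                        # draw x^(1), ..., x^(n_r) from w(x)
    vals = np.array([phi(d, x) for x in xs])
    y_bar = vals.mean()                                            # Eq. (2)
    var_eps = ((vals - y_bar) ** 2).sum() / (n_r * (n_r - 1.0))    # Eq. (3)
    return y_bar, var_eps
```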

3 The Adaptive Variance Target sEGO Approach


sEGO methods generally follow these steps:

1. Construction of the initial sampling plan;


2. Construction of the SK metamodel;
3. Addition of a new infill point to the sampling plan and return to step 2.

For the construction of the initial sampling plan, we employ here the Latin
hypercube scheme detailed in [5]. Then, Steps 2 and 3 are repeated until a stop criterion is met, e.g., a maximum number of function evaluations. The manner in which the infill points are added in each iteration is what distinguishes the different sEGO approaches. In this paper, we employ the AEI criterion as the infill point strategy, since it provided promising results [11]. Step 2 constructs a prediction model, which is given by the SK in this paper, and its formulation is given in the next subsection.

3.1 Stochastic Kriging (SK)

The work of [1] proposed a SK accounting for the sampling variability that is
inherent to a stochastic simulation. To accomplish this, they characterized both
the intrinsic error inherent in a stochastic simulation and the extrinsic error that
comes from the metamodel approximation. Then, the SK prediction can be seen
as:
$$\hat y(d_i) = \underbrace{M(d_i)}_{\text{Trend}} + \underbrace{Z(d_i)}_{\text{Extrinsic}} + \underbrace{\epsilon(d_i)}_{\text{Intrinsic}}, \qquad (4)$$
where $M(d)$ is the usual average trend and $Z(d)$ accounts for the model uncertainty and is now referred to as the extrinsic noise. The additional term $\epsilon$, called the intrinsic noise, accounts for the simulation uncertainty or variability. In this paper, the
variability is due to the error in the approximation of the integral from Eq. (1)
caused by MCI. It is worth to recall here that MCI provides an estimation of
the variance of this error. That is, we are able to estimate the intrinsic noise,
and consequently, introduce this information into the metamodel framework. To
accomplish this, we construct the covariance matrix of the intrinsic noise - among
the current sampling plan points. Since the intrinsic error is assumed to be i.i.d.
and normal, the covariance matrix is a diagonal matrix with components

(Σ )ii = σ 2 (di ), i = 1, 2, ..., ns , (5)

where $\sigma_\epsilon^2$ is given by (3). Then, considering the Best Linear Unbiased Predictor (BLUP) shown by [1], the prediction of the SK at a given point $d_u$ is
$$\hat y(d_u) = \hat\mu + r^T (\Psi + \Sigma_\epsilon)^{-1} (y - \mathbf{1}\hat\mu), \qquad (6)$$

which is the usual Kriging prediction with the added diagonal correlation matrix from the intrinsic noise. Similarly, the predicted error takes the form
$$s_n^2(d) = \sigma^2 \left[ 1 + \lambda(d) - r^T (\Psi + \Sigma_\epsilon)^{-1} r + \frac{\big(1 - \mathbf{1}^T (\Psi + \Sigma_\epsilon)^{-1} r\big)^2}{\mathbf{1}^T (\Psi + \Sigma_\epsilon)^{-1} \mathbf{1}} \right], \qquad (7)$$
where $\lambda(d)$ corresponds to the regression term.
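For illustration, a NumPy sketch of the predictor (6) and the predicted error (7) follows; it assumes the usual generalized-least-squares estimate of μ̂ and takes Ψ, Σ_ε, r, y, σ² and λ(d) as given inputs (all names are placeholders, not the authors' implementation).

```python
import numpy as np

def sk_predict(Psi, Sigma_eps, r, y, sigma2, lam):
    """Stochastic Kriging prediction (6) and predicted error (7), sketch only."""
    A = Psi + Sigma_eps
    one = np.ones_like(y)
    Ainv_1 = np.linalg.solve(A, one)
    Ainv_r = np.linalg.solve(A, r)
    mu_hat = (one @ np.linalg.solve(A, y)) / (one @ Ainv_1)          # GLS trend estimate
    y_hat = mu_hat + r @ np.linalg.solve(A, y - mu_hat * one)        # Eq. (6)
    s2 = sigma2 * (1.0 + lam - r @ Ainv_r
                   + (1.0 - one @ Ainv_r) ** 2 / (one @ Ainv_1))     # Eq. (7)
    return y_hat, s2
```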

4 Adaptive Target Selection

The adaptive target selection is briefly introduced in this section. For a more detailed description, the reader is referred to [4]. With the framework presented so far, we are able to incorporate error estimates from MCI within the sEGO scheme. It is important to notice that the number of samples of the MCI is an input parameter, i.e. the designer has to set $n_r$ in Eq. (3). Consequently, the designer is able to control the magnitude of $\Sigma_\epsilon$ and $\lambda$ by changing the sample size $n_r$. However, in practice a target variance ($\sigma_0^2$) is first chosen and the sample size is iteratively increased until the evaluated variance is close to the target value. Thus, for a constant target variance, the regression parameter is enforced by the MCI procedure to be
$$\lambda(d) = \sigma_0^2. \qquad (8)$$
The choice of the target variance must consider two facts: (a) if the target
variance is too high, the associated error may lead to a poor and deceiving
approximation of the integral, and, (b) if the target tends to zero, so does the
error and we retrieve the deterministic case, however, at the expense of a huge
computational effort.
The advantage here is that the Adaptive Variance Target selection automat-
ically defines the variance target for a new infill point in the sEGO search. That
is, the adaptive approach starts exploring the design domain by evaluating the
objective function value of each design point using MCI with a high target vari-
ance - so that each evaluation requires only a few samples. Then, it gradually
reduces the target variance for the evaluation of additional infill points in already
visited regions.
A flowchart of the proposed stochastic EGO algorithm, including the pro-
posed adaptive target selection, is shown in Fig. 1. In the next paragraphs, each
of its steps is detailed.

Fig. 1. Flowchart of the algorithm

After the construction of the SK metamodel for the initial sampling plan,
the infill stage begins. The AEI method is employed for this purpose. Here, an
initial target variance σ 20 is set and the first infill point is added to the model
being simulated up to this corresponding target variance.
From the second infill point on, the adaptive target selection scheme starts to
take place. We propose the use of an exponential decay equation parameterized
Stochastic Tunneling for Improving the Efficiency 243

by problem dimension (n) and the number of points already sampled near the
new infill point (nclose ), which is defined by the number of points in the model
located at a given distance (rhc ) of the infill point. Here, we consider a hypercube
around the infill point selected with half-sides rhc to evaluate nclose .
Then, when the infill is located within an unsampled region, its target vari-
ance is set as the initial target variance. On the other hand, when the infill is
located in a region with existing sampled points, a lower target variance (σ 2adapt )
is employed for the approximation of its objective function value. This is done to
allocate more computational effort on regions that need to be exploited. When
they start to group up, the focus changes to landscape exploitation. In this sit-
uation, the target MCI variance is set to a lower value, increasing the model
accuracy.
The expression proposed to calculate the adaptive target value at each iteration of the sEGO algorithm is
$$\sigma_{\mathrm{adapt}}^2 = \frac{\sigma_0^2}{\exp(a_1 + a_2\, n + a_3\, n_{\mathrm{close}} - a_4\, n_{\mathrm{close}}\, n)}, \qquad (9)$$
where the $a_i$ are given constants. We also set a minimum and a maximum value for the adaptive target, in order to avoid a computationally intractable number of samples. We thus enforce
$$\sigma_{\min}^2 \le \sigma_{\mathrm{adapt}}^2 \le \sigma_0^2, \qquad (10)$$
where $\sigma_{\min}^2$ is a lower bound on the target.
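A one-function Python sketch of the rule (9)-(10) is given below; the default constants are the values used later in Sect. 6, and everything else is illustrative.

```python
import math

def adaptive_target(sigma2_0, sigma2_min, n, n_close,
                    a1=0.5, a2=0.5, a3=0.5, a4=0.01):
    """Adaptive MCI target variance, cf. Eqs. (9)-(10)."""
    target = sigma2_0 / math.exp(a1 + a2 * n + a3 * n_close - a4 * n_close * n)   # Eq. (9)
    return min(max(target, sigma2_min), sigma2_0)                                  # Eq. (10)
```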

5 The Proposed Normalization Scheme

The normalization employed in this paper is based on the stochastic tunneling approach, which consists in allowing the particle to "tunnel" through high-energy regions of the domain, avoiding getting trapped in local minima, as usually employed in the SA algorithm. According to [12], it may be done by applying the following nonlinear transformation to $\bar y$:
$$J(d) = 1 - \exp\big(-\gamma\, (y(d) - y_0)\big), \qquad (11)$$
where $\gamma$ and $y_0$ are given parameters. Then, the sEGO approach minimizes $J$ instead of the original approximated function $\bar y$. In the sEGO context, it is expected that this normalization reduces the variability level of the regions of the design domain that have high values of the objective function, as well as the dependency of the quality of the search on the parameters of the SK and the adaptive method.
the dependency of the quality of the search on the parameters of the SK and
adaptive methods.
6 Numerical Examples
In this section, we analyze the minimization of two multimodal problems taken from [4]. The first problem is a stochastic version of the 2D Multimodal Branin function:
$$\phi(d, X) = p_1 (d_2 - p_2 d_1^2 + p_3 d_1 - p_4)^2 X_1 + p_5 (1 - p_6) \cos(d_1) X_2 + p_5 + 5 d_1, \qquad (12)$$
with parameters $p_1 = 1$, $p_2 = 5.1/(4\pi^2)$, $p_3 = 5/\pi$, $p_4 = 6$, $p_5 = 10$, $p_6 = 1/(8\pi)$. The design domain is $S = \{d_1 \in [-5, 10],\ d_2 \in [0, 15]\}$. $X_1$ and $X_2$ are Normal random variables given by $(X_1, X_2) \sim N(1, 0.05)$. The second is a stochastic version of the 10D Multimodal Levy function:

$$\phi(d, X) = \sin^2(\pi p_1) + \sum_{i=1}^{n-1} (p_i - 1)^2 \big[1 + 10 \sin^2(\pi p_i + 1)\big] + (p_n - 1)^2 \big[1 + \sin^2(2\pi p_n)\big], \qquad (13)$$
where $p_i = 1 + \frac{d_i X_i}{4}$ for $i = 1, 2, \dots, n$. Here we take $n = 10$ and a design domain $S = \{d_i \in [-10, 10],\ i = 1, 2, \dots, 10\}$. The random variables $X_i$ follow a Normal distribution with $\sigma_X = 0.01$, i.e., $X_i \sim N(1, 0.01)$. In both problems, the weight function $w$ from Eq. (1) is taken as the probability density function (PDF) $f_X(x)$ of the random vector $X$.
We employ the framework described in Fig. 1 with and without the proposed normalization approach. The following parameters are kept constant: initial sampling plans comprised of $n_s = 4n$ points, $r_{hc} = 0.1$, $a_1 = a_2 = a_3 = 1/2$ and $a_4 = 1/100$ (see (9)), $\sigma_{\min}^2 = 10^{-6}$ and $\sigma_0^2 = 10^{-2}$. For the proposed normalization, we employed $\gamma = 0.01$, and $y_0$ as the deterministic optimum of each problem, $y_0 = -16.644022$ and $0$ for the 2D and 10D problems, respectively. For the AEI infill criterion, $\alpha = 1.00$ was used, as suggested by [7].
The efficiency of each algorithm is measured by the number of function eval-
uations (NFE), i.e. the number of times the function φ is evaluated, which also
is employed as the stopping criterion. It is important to point out that the
optimization procedure presented depends on random quantities. Therefore, the
results obtained are not deterministic and may change when the algorithm is
run several times. For this reason, when dealing with stochastic algorithms, it
is appropriate to present statistical results over a number of algorithm runs [6].
Thus, for each problem, the average as well as the 5 and 95 percentiles of the
results found over the set of 20 independent runs are presented as box plots.
In order to highlight the increase in performance when the proposed nor-
malization scheme is applied, we assign different computational budgets (NFE
values as stopping criterion) for the case with and without normalization. That
is, for the 2D Multimodal problem, we set NFE = 100 as stopping criterion when
using the normalization, while NFE = 1000 for the case not employing such a
normalization. For the 10D Multimodal problem, we employed NFE = 200 and
1000 for the cases with and without the normalization, respectively (Fig. 2).
Fig. 2. Comparison between normalized and non-normalized solutions for $\sigma_0^2 = 0.01$: (a) Multimodal 2D (Branin), $\sigma_X = 0.05$; (b) Multimodal 10D (Levy), $\sigma_X = 0.01$.

7 Conclusion

This paper proposed the use of a normalization scheme in order to increase the
performance of the recently developed Adaptive Target Variance sEGO method
for the minimization of functions that depend on expensive to evaluate and high
dimensional integrals. As in the original version of the method, the integral to
be minimized was approximated by MCI and the variance of the error in this
approximation was included into the SK framework. The AEI infill criterion was
employed to guide the addition of new points in the metamodel. The modification
proposed here was to minimize a normalized version of ȳ, by employing the
nonlinear transformation proposed by [12].
Overall, the use of the normalization in the sEGO method yielded very
promising results for the minimization of integrals. Indeed, it was able to obtain
more precise results, while requiring only a fraction of the computational bud-
get of the original version of the algorithm. However, since the results presented
here are preliminary, the use of the normalization deserves further investigation
in order to better assess its impact on the sEGO search in different situations
and problems.

Acknowledgements. The authors acknowledge the financial support and thank the
Brazilian research funding agencies CNPq and CAPES.

References
1. Ankenman, B., Nelson, B.L., Staum, J.: Stochastic kriging for simulation meta-
modeling. Oper. Res. 58(2), 371–382 (2010)
2. Beck, A.T., Kougioumtzoglou, I.A., dos Santos, K.R.M.: Optimal performance-
based design of non-linear stochastic dynamical RC structures subject to stationary
wind excitation. Eng. Struct. 78, 145–153 (2014)
3. Beck, J., Dia, B.M., Espath, L.F.R., Long, Q., Tempone, R.: Fast Bayesian exper-
imental design: Laplace-based importance sampling for the expected information
gain. Comput. Methods Appl. Mech. Eng. 334, 523–553 (2018). https://doi.org/
10.1016/j.cma.2018.01.053
4. Carraro, F., Lopez, R.H., Miguel, L.F.F., Andre, J.T.: Optimum design of planar
steel frames using the search group algorithm. Struct. Multidiscip. Optim. (2019,
to appear)
5. Forrester, A., Sobester, A., Keane, A.: Engineering Design Via Surrogate Mod-
elling: A Practical Guide. Wiley, Chichester (2008)
6. Gomes, W.J., Beck, A.T., Lopez, R.H., Miguel, L.F.: A probabilistic metric for
comparing metaheuristic optimization algorithms. Struct. Saf. 70, 59–70 (2018)
7. Huang, D., Allen, T.T., Notz, W.I., Zeng, N.: Global optimization of stochastic
black-box systems via sequential kriging meta-models. J. Glob. Optim. 34(3), 441–
466 (2006)
8. Jalali, H., Nieuwenhuyse, I.V., Picheny, V.: Comparison of kriging-based algorithms
for simulation optimization with heterogeneous noise. Eur. J. Oper. Res. 261(1),
279–301 (2017). https://doi.org/10.1016/j.ejor.2017.01.035
9. Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive
black-box functions. J. Glob. Optim. 13, 455–492 (1998). https://doi.org/10.1023/
a:1008306431147
10. Lopez, R., Ritto, T., Sampaio, R., de Cursi, J.S.: A new algorithm for the robust
optimization of rotor-bearing systems. Eng. Optim. 46(8), 1123–1138 (2014).
https://doi.org/10.1080/0305215X.2013.819095
11. Picheny, V., Wagner, T., Ginsbourger, D.: A benchmark of kriging-based infill
criteria for noisy optimization. Struct. Multidiscip. Optim. 48(3), 607–626 (2013)
12. Wenzel, W., Hamacher, K.: Stochastic tunneling approach for global mini-
mization of complex potential energy landscapes. Phys. Rev. Lett. 82, 3003–
3007 (1999). https://doi.org/10.1103/PhysRevLett.82.3003, https://link.aps.org/
doi/10.1103/PhysRevLett.82.3003
The Bernstein Polynomials Based
Globally Optimal Nonlinear Model
Predictive Control

Bhagyesh V. Patil¹, Ashok Krishnan¹,²(B), Foo Y. S. Eddy², and Ahmed Zidna³

¹ Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore
bhagyesh.patil@gmail.com, ashok004@e.ntu.edu.sg
² School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang, Singapore
eddyfoo@ntu.edu.sg
³ LGIPM, Université de Lorraine, Lorraine, France
ahmed.zidna@univ-lorraine.fr

Abstract. Nonlinear model predictive control (NMPC) has shown con-


siderable success in the control of nonlinear systems due to its ability to
deal directly with nonlinear models. However, the inclusion of a nonlinear
model in the NMPC framework potentially results in a highly nonlinear
(usually ‘nonconvex’) optimization problem. This paper proposes a solu-
tion technique for such optimization problems. Specifically, this paper pro-
poses an improved Bernstein global optimization algorithm. The proposed
algorithm contains a Newton-based box trim operator which extends the
classical Newton method using the geometrical properties associated with
the Bernstein polynomial. This operator accelerates the convergence of
the Bernstein global optimization algorithm by discarding those regions
of the solution search space which do not contain any solution. The util-
ity of this improved Bernstein algorithm is demonstrated by simulating
an NMPC problem for tracking multiple setpoint changes in the reactor
temperature of a continuous stirred-tank reactor (CSTR) system. Fur-
thermore, the performance of the proposed algorithm is compared with
those of the previously reported Bernstein global optimization algorithm
and a conventional sequential-quadratic programming based sub-optimal
NMPC scheme implemented in MATLAB.

1 Introduction
The design of efficient controllers to extract the desired performance from phys-
ical engineering systems has been a well studied problem in control engineer-
ing [3]. This problem can be broadly split into the following two stages: (i)
1
Now with the John Deere Technology Centre, Magarpatta City, Pune, India (email:
PatilBhagyesh@JohnDeere.com). 2,3 The authors acknowledge funding support from
the NTU Start-Up Grant and the MOE Academic Research Fund Tier 1 Grant.

development of a mathematical model for the physical system under study; and
(ii) controller design based on the mathematical model developed in (i). In the
last decade, tremendous advancements have been made in the development of
optimization algorithms and computing platforms used to solve optimization
problems. Consequently, several controllers have been designed which utilize
advanced computational algorithms to solve complex optimization problems (see,
for instance [7,8,10], and the references therein).
In recent years, nonlinear model predictive control (NMPC) has emerged as
a promising advanced control methodology. In principle, MPC performs a cost
optimization subject to specific constraints on the system. The cost optimization
is performed repeatedly over a moving horizon window [13]. We note that the
following two issues need to be carefully considered while designing any NMPC
scheme:
(i) Can the nonlinear optimization procedure be completed until a convergence
criterion is satisfied to guarantee the optimality of the solution obtained?
(ii) Can (i) be achieved within the prescribed sampling time?
This work primarily addresses (i) which necessitates the development of an
advanced optimization procedure. In the literature, (i) has been addressed by
many researchers using various global optimization solution approaches. For
instance, the particle swarm optimization (PSO) approach was used in [4]. A
branch-and-bound approach was adopted to solve the NMPC optimization prob-
lem in [2]. Reference [9] extended the traditional branch-and-bound approach
with bound tightening techniques to locate the correct global optimum for the
NMPC optimization problem. Apart from these works, Patil et al. advocated the
use of Bernstein global optimization procedures for NMPC applications (see, for
instance, [1,11]). These optimization procedures are based on the Bernstein form
of polynomials [12] and use several attractive ‘geometrical’ properties associated
with the Bernstein form of polynomials.
This work is a sequential improvement of the previous work reported in [11].
Specifically, [11] introduced a Bernstein branch-and-bound algorithm to solve the
nonlinear optimization problems encountered in an NMPC scheme. We note that
the algorithm presented in [11] is computationally expensive due to the numerous
branchings involved in a typical branch-and-bound framework. This motivates
the main contribution of this work wherein a tool is developed to accelerate
the solution search procedure for the Bernstein global optimization algorithm.
The developed tool speeds up the Bernstein global optimization algorithm by
trimming (discarding) those regions from the solution search space which cer-
tainly do not contain any solution. Due to the nature of its main function, the
developed tool is called a ‘box trim operator’.

2 NMPC Formulation
Consider a class of time-invariant continuous-time systems described using the
following nonlinear model:
ẋ = f (x, u), x(t0 ) = x0 (1)

where x ∈ Rnx and u ∈ Rnu represent the vectors of the system states and the
control inputs respectively while f describes the nonlinear dynamic behavior
of the system. The NMPC of a discrete-time system involves the solution of a
nonlinear optimization problem at each sampling instant. Mathematically, the
NMPC problem formulation can be summarized as follows:
$$\begin{aligned}
\min_{x_k,\,u_k}\quad & J = \sum_{k=0}^{N-1} L(x_k, u_k) && (2a)\\
\text{subject to}\quad & x_0 = \hat{x}_0 && (2b)\\
& x_{k+1} = x_k + \Delta t\, f(x_k, u_k) && (2c)\\
& c(x_k, u_k) \le 0 && (2d)\\
& x_k^{\min} \le x_k \le x_k^{\max} && (2e)\\
& u_k^{\min} \le u_k \le u_k^{\max} && (2f)\\
& \text{for } k = 0, 1, \ldots, N-1 && (2g)
\end{aligned}$$

where N represents the prediction horizon; x̂0 ∈ Rnx represents the initial states
of the system and xk ∈ Rnx and uk ∈ Rnu represent the system states and the
control inputs respectively at the kth sampling instant. The objective function
(J) in (2a) is defined by the stage cost L (.). The discretized nonlinear dynamics
in (2c) are formulated as a set of equality constraints and c(xk , uk ) represents
the nonlinear constraints arising from the operational requirements of the system
in (1). The system is subjected to the state and input constraints described by
(2e)–(2f).
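As an illustration of how formulation (2a)–(2g) can be assembled in practice, the sketch below (ours, not the authors' implementation) builds the discretized NMPC problem as a generic nonlinear program over the stacked decision vector (x_0,…,x_N, u_0,…,u_{N-1}) and hands it to a local solver; such a local solution corresponds to the "suboptimal" schemes the paper compares against, whereas the authors' contribution is the global Bernstein solver of Sect. 3. The dynamics f, stage cost L, path constraint c, initial state and the scalar bounds are placeholders.

```python
# A generic sketch (ours) of the discretized NMPC problem (2a)-(2g).
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def solve_nmpc_step(f, L, c, x_hat0, N, dt, nx, nu, x_bounds, u_bounds):
    def split(z):
        x = z[: (N + 1) * nx].reshape(N + 1, nx)
        u = z[(N + 1) * nx:].reshape(N, nu)
        return x, u

    def cost(z):                       # (2a)
        x, u = split(z)
        return sum(L(x[k], u[k]) for k in range(N))

    def dynamics(z):                   # (2b)-(2c) as equality constraints
        x, u = split(z)
        res = [x[0] - x_hat0]
        res += [x[k + 1] - (x[k] + dt * f(x[k], u[k])) for k in range(N)]
        return np.concatenate(res)

    def path(z):                       # (2d)
        x, u = split(z)
        return np.concatenate([np.atleast_1d(c(x[k], u[k])) for k in range(N)])

    # identical scalar bounds for every state/input entry, for brevity (2e)-(2f)
    bounds = [x_bounds] * ((N + 1) * nx) + [u_bounds] * (N * nu)
    cons = [NonlinearConstraint(dynamics, 0.0, 0.0),
            NonlinearConstraint(path, -np.inf, 0.0)]
    z0 = np.zeros((N + 1) * nx + N * nu)
    return minimize(cost, z0, bounds=bounds, constraints=cons, method="trust-constr")

# toy demo: scalar integrator x' = u, drive x from 1 toward 0 with |u| <= 1
res = solve_nmpc_step(f=lambda x, u: u, L=lambda x, u: x @ x + 0.01 * (u @ u),
                      c=lambda x, u: np.array([-1.0]),   # inactive path constraint
                      x_hat0=np.array([1.0]), N=5, dt=0.5, nx=1, nu=1,
                      x_bounds=(-10.0, 10.0), u_bounds=(-1.0, 1.0))
print(res.fun)
```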

3 Bernstein Polynomial Approach for Global Optimization
This section briefly introduces some important notions and properties about the
Bernstein form of polynomials. Subsequently, these properties are used to present
a Newton-based box trim operator. Finally, the Bernstein form of polynomials
and the Newton-based box trim operator are used in a suitable branch-and-
bound framework to determine the global solutions of the nonlinear optimization
problems encountered in the NMPC scheme described in Sect. 2.
We can write an l-variate polynomial pf with real coefficients aI as follows:

$$p_f(x) = \sum_{I \le N} a_I x^I, \quad x \in \mathbb{R}^l, \qquad (3)$$

where N is the degree of pf . We transform (3) into the following Bernstein form
of polynomials to obtain the bounds for its range over an l-dimensional box x.

$$p_b(x) = \sum_{I \le N} b_I(\mathbf{x})\, B_I^N(x), \qquad (4)$$

where $B_I^N(x)$ is the $I$th Bernstein basis polynomial of degree $N$, defined as follows:
$$B_I^N(x) = B_{i_1}^{n_1}(x_1)\cdots B_{i_l}^{n_l}(x_l), \quad x \in \mathbb{R}^l. \qquad (5)$$
For $i_j = 0, 1, \ldots, n_j$, $j = 1, 2, \ldots, l$,
$$B_{i_j}^{n_j}(x_j) = \binom{n_j}{i_j}\, \frac{(x_j - \underline{x}_j)^{i_j}\,(\overline{x}_j - x_j)^{n_j - i_j}}{(\overline{x}_j - \underline{x}_j)^{n_j}}, \qquad (6)$$

and bI (x) is an array of Bernstein coefficients which is computed as the weighted


sum of the coefficients aI in (3) on the box x.
 
$$b_I(\mathbf{x}) = \sum_{J \le I} \frac{\binom{I}{J}}{\binom{N}{J}} \sum_{K \ge J} \binom{K}{J}\, w(\mathbf{x})^{J}\, (\inf \mathbf{x})^{K-J} a_K, \qquad I \le N. \qquad (7)$$

Note that all the Bernstein coefficients bI (x)I∈S form an array, wherein S =
{I : I ≤ N }. Furthermore, we define S0 as a special set comprising only the
vertex indices from S as shown below:

S0 = {(0, 0, . . . , 0), (n1 , 0, . . . , 0), (0, n2 , 0, . . . , 0), . . . , (n1 , n2 , . . . , nl )}.

Remark 3.1. The partial derivative of the polynomial pf in (3) with respect to
xr (1 ≤ r ≤ l) can be obtained from its Bernstein coefficients bI (x) using the
following relation [12]:
$$p_{f,r}(x) = \frac{n_r}{w(\mathbf{x})} \sum_{I \le N_{r,-1}} \bigl[\, b_{I_{r,1}}(\mathbf{x}) - b_I(\mathbf{x}) \,\bigr]\, B_{N_{r,-1},\,I}(x), \quad 1 \le r \le l,\; x \in \mathbf{x}, \qquad (8)$$

where pf,r (x) contains an enclosure of the range of the derivative of the polyno-
mial pf on the box x.
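A univariate sketch (ours) may help to fix ideas: it computes the Bernstein coefficients of a polynomial over an interval, the l = 1 case of the transformation that (7) performs on a box, and uses their minimum and maximum as an enclosure of the polynomial's range, which is the property the algorithm below relies on. All function and variable names are illustrative.

```python
# Univariate sketch (ours) of the Bernstein form: coefficients of
# p(x) = sum_k a_k x^k over [lo, hi] and the resulting range enclosure.
import numpy as np
from math import comb

def bernstein_coefficients(a, lo, hi):
    """Bernstein coefficients of the degree-n polynomial with power
    coefficients a[0..n] over the interval [lo, hi]."""
    n = len(a) - 1
    w = hi - lo
    # power coefficients of p(lo + w*t) in t (the polynomial shifted to [0, 1])
    c = np.zeros(n + 1)
    for k, ak in enumerate(a):
        for j in range(k + 1):
            c[j] += ak * comb(k, j) * (w ** j) * (lo ** (k - j))
    # b_i = sum_{j<=i} C(i,j)/C(n,j) * c_j
    b = np.array([sum(comb(i, j) / comb(n, j) * c[j] for j in range(i + 1))
                  for i in range(n + 1)])
    return b

a = [-2.0, 0.0, 1.0]                     # p(x) = x^2 - 2
b = bernstein_coefficients(a, 1.0, 3.0)
print(b, (b.min(), b.max()))             # enclosure of p([1, 3]) = [-1, 7]
```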

3.1 Newton-Based Box Trim Operator


An NMPC scheme for the system described by (1) involves solving an opti-
mization problem of the form (2a)–(2g). Such an optimization problem has a
set of nonlinear constraints described by (2c)–(2d) and a bounded search space
described by (2e)–(2f). Apart from a feasible solution, the search space also con-
tains a set of values that do not satisfy (2c)–(2d) (i.e. inconsistent values). As
such, it is imperative to remove these inconsistent values. A box trim operator
can help in removing the set of inconsistent values which do not satisfy (2c)–(2d)
from the search space described by (2e)–(2f). This is achieved by applying the
Newton method to remove the leftmost and rightmost ‘quasi-zero’ points from
the search-space described by (2e)–(2f). It is worth noting that such a box trim
operator helps in narrowing the search-space, thereby speeding up the solution
search procedure in an optimization algorithm.

In principle, the classical Newton method is used to find successively better


approximations of the zeros of a real-valued function. Based on this, the Newton
method to find the zero(s) of a function of the form (2c) for the set of variables
y = (x, u) is described below:

$$y^{k+1} = y^k - \frac{F(y^k)}{F'(y^k)},$$

where F is the evaluation of (2c) at y^k, and F′ is the Jacobian of F. The interval


Newton method generalizes the aforementioned procedure to intervals [6, p. 169].
In this context, we note that the Bernstein range enclosure (Remark 3.2) is an
interval form [14]. Consequently, we extend the interval Newton method based
on the Bernstein range enclosure property to obtain the Newton-like box trim
operator shown below:
 
$$\mathbf{y}_r^{k+1} = \mathbf{y}_r^k \,\cap\, \left( I_r^k - \frac{\mathbf{F}(I_r^k)}{\mathbf{F}'_{I_r^k}} \right), \qquad (9)$$

where y = (x, u) is an interval vector or box and subscript r indicates the


rth variable from the box y on which the box trim operator is to be applied.
Furthermore, $I_r^k := \underline{y}_r^k$ or $\overline{y}_r^k$, depending on the endpoint of the r-th variable from
which the 'quasi-zero' points of (2c) need to be removed. $\mathbf{F}(I_r^k)$ represents an
interval encompassing the minimum and maximum Bernstein coefficients from
the array b_I(y) (computed using (7) at $I_r^k$), $\mathbf{F}'_{I_r^k}$ denotes an interval enclosure for
the derivative of (2c) on $\mathbf{y}_r^k$ obtained using Remark 3.1, and $\mathbf{y}_r^{k+1}$ represents
the trimmed interval for the r-th variable.
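The following one-dimensional sketch (ours) illustrates the trimming idea behind (9) with ordinary floating-point arithmetic: the left endpoint of an interval is pushed towards the leftmost quasi-zero of a constraint using a Newton step over a user-supplied enclosure of the derivative. In the multivariate operator of this paper, that derivative enclosure would instead come from the Bernstein enclosure of Remark 3.1; all names and tolerances below are illustrative.

```python
# 1D sketch (ours) of the box trim idea in (9): trim the left endpoint of an
# interval [lo, hi] on which a constraint h(y) = 0 must hold, using a Newton
# step over an enclosure [dlo, dhi] of h' on [lo, hi].
def trim_left(h, dlo, dhi, lo, hi, tol=1e-10, max_iter=50):
    """Shrink [lo, hi] from the left until a quasi-zero of h is reached."""
    for _ in range(max_iter):
        v = h(lo)                          # h evaluated at the left endpoint
        if abs(v) <= tol:                  # quasi-zero reached: stop trimming
            break
        if dlo <= 0.0 <= dhi:              # derivative enclosure contains 0:
            break                          # no safe trim, give up (split instead)
        # Newton candidates lo - v/d over the enclosure; any zero of h in
        # [lo, hi] lies in this candidate interval, so intersect as in (9).
        cands = sorted([lo - v / dlo, lo - v / dhi])
        new_lo = max(lo, cands[0])
        if new_lo >= hi or new_lo <= lo + tol:
            break
        lo = new_lo
    return lo, hi

# Example: h(y) = y^2 - 2; h'([0,4]) = [0, 8] contains 0, so start on [1, 4]
# where h'([1,4]) = [2, 8]. The left endpoint is pushed to ~sqrt(2).
print(trim_left(lambda y: y * y - 2.0, 2.0, 8.0, 1.0, 4.0))
```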

3.2 Bernstein Bound-trim-branch Global Optimization Algorithm


This algorithm exhaustively explores the search space of the nonlinear optimiza-
tion problem in (2a)–(2g) with the objective of finding the best solution among
all the available solutions. This is performed at each NMPC step. Specifically,
this algorithm involves the three main steps listed below:
– Bounding (2a), (2c) and (2d) using the Bernstein range enclosure property.
– Box trimming operation using (9) for (2e)–(2f) using the relations (2c)–(2d).
– Branching by subdividing the domains of the variables in (2e)–(2f).
These three steps give rise to the name Bernstein bound-trim-branch (BBTB)
global optimization algorithm. A pseudo-code listing the steps of the BBTB
algorithm is provided below.

Algorithm Bernstein bound-trim-branch: [f*, y*] = BBTB(f, g_i, h_j, y, ε_f, ε_h)

Inputs: The cost function in (2a) as f, the equality constraints in (2c) as h_j, the
inequality constraints in (2d) as g_i, the initial search box comprising the system
states (x_k) and the control inputs (u_k) as y, the tolerance parameter ε_f on the
global minimum of (2a), and the tolerance parameter ε_h to which the equality
constraints in (2c) need to be satisfied.
Outputs: The global minimum cost value of (2a) as f ∗ and the global minimizers
for the states (xk ) and the control input profile (uk ) as y∗ .
BEGIN Algorithm
Relaxation step

• Compute the Bernstein coefficient matrices for f, g_i, and h_j on y as b_{I,f}, b_{I,g_i},
and b_{I,h_j}, respectively. Set $\underline{f}$ to the minimum Bernstein coefficient from b_{I,f}.
Set the global minimum estimate as $\tilde{f} = \underline{f}$.
• Construct L ← {($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y)}, L_sol ← {}.

Box Trimming step


• If L is empty, then go to the termination step. Else, sort the items of L in
ascending order of $\underline{f}$ (each item in L is of the form ($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y)).
• Pick the first item from L by removing its entry.
• Apply the box trim operator (see Sect. 3.1) to y using the constraints in (2c)–
(2d) and obtain the trimmed box y′. Compute the new Bernstein coefficient
matrices for f, g_i, and h_j on y′ as b_{I,f}, b_{I,g_i}, and b_{I,h_j}, respectively.
• Set $\underline{f}$ to the minimum Bernstein coefficient from b_{I,f} and update the global
minimum estimate as $\tilde{f} = \underline{f}$. Construct the item as ($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y′).

Constraint Feasibility and Vertex step

• For the item ($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y′), check the constraint feasibility as detailed
in Remark 3.2. If it is found to be quasi-feasible, go to the branching step.
• Check if the item ($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y′) satisfies the vertex property. If 'true',
then update $\tilde{f} = b^{vi}_{I,f}$ and add this item to L_sol. Go to the box trimming step.
(Note that $b^{vi}_{I,f}$ are the vertex Bernstein coefficients obtained from b_{I,f} using
the special set S₀.)
Branching step

• For the item ($\underline{f}$, b_{I,f}, b_{I,g_i}, b_{I,h_j}, y′), subdivide the box y′ into two subboxes
y₁ and y₂.
• Compute the Bernstein coefficient matrices for f, g_i, and h_j on y₁ and y₂.
Construct the two items as ($\underline{f}_k$, b_{I,f,k}, b_{I,g_i,k}, b_{I,h_j,k}, y_k), k = 1, 2.
• Discard y_k, k = 1, 2, for which $\min(b_{I,f,k}) > \tilde{f}$. Enter ($\underline{f}_k$, b_{I,f,k}, b_{I,g_i,k},
b_{I,h_j,k}, y_k) into L (here $\underline{f}_k := \min(b_{I,f,k})$). Go to the box trimming step.

Termination step

• Find the item in L_sol for which the first entry is equal to $\tilde{f}$. Denote that item
by I_f.
• In If , the first entry is the global minimum f ∗ while the last entry is the
global minimizer y∗ .
• Return the global solution (f ∗ , y∗ ).

END Algorithm

3.3 Simulation Results

In this section, the performance of the BBTB algorithm (Sect. 3.2) based NMPC
scheme is first compared with that of the conventional sequential-quadratic pro-
gramming based suboptimal NMPC scheme implemented in MATLAB [5]. These
two schemes are compared in terms of the optimality of the solutions obtained
while solving the nonlinear optimization problems encountered in the respective
schemes. Subsequently, the benefits derived from the use of the box trim operator
in the BBTB algorithm are assessed. Specifically, the computation time required
and the number of boxes processed by the BBTB algorithm based NMPC scheme
are compared with the computation time required and the number of boxes pro-
cessed by the previously reported Bernstein algorithm (BBBC) based NMPC
scheme from [11].
The following parameter values were chosen for the simulation studies in this
paper:

• Sampling time, Δt = 0.5 seconds, and prediction horizon, N = 7.


• Q = diag(1, 0.01) and R = 0.01 as weighting matrices.
• Initial conditions: CA = 0.2 mol/l, T = 370 K, and Tc = 300 K.
• Tolerances: ε_f = ε_zero = 0.001 in the BBTB algorithm.

Figure 1 shows the evolution of the system states from their initial values
(CA = 0.2, T = 370) for a series of setpoint changes. The closed-loop perfor-
mances of the BBTB based NMPC scheme and the suboptimal NMPC scheme
are compared. We observed that both the system states transitioned smoothly
to their new values for multiple setpoint changes when the CSTR system was
controlled using the BBTB algorithm based NMPC scheme. On the other hand,
some undershoot and overshoot (≈2–5%) was observed when the suboptimal
NMPC scheme was used to control the CSTR system. The settling time was sim-
ilar for both the NMPC schemes. Figure 2a illustrates the control action observed
when the CSTR system was controlled using the BBTB algorithm based NMPC
scheme and the suboptimal NMPC scheme. It is apparent that except for the first
few samples (≈ 0−20), the BBTB algorithm based NMPC scheme demonstrates
smooth control performance. The suboptimal NMPC scheme demonstrates a
slightly oscillating control action, particularly when the setpoint changes are
applied.

Figure 2b presents the computational times required per sample to solve


the nonlinear optimization problems encountered in the BBTB algorithm based
NMPC scheme and the BBBC algorithm based NMPC scheme. It is observed
that the BBTB algorithm takes 56% less time than the BBBC algorithm for
computation. This is because the box trim operator discards some regions from
the solution search space which do not contain the global solution during the
branch-and-bound process of the Bernstein algorithm. Overall, it aids in decreas-
ing the time required to locate the correct global solution. This is also evident
from Fig. 3a, which shows the number of boxes processed by the BBBC and
BBTB algorithms during the branch-and-bound process. The presence of the box
trim operator in the BBTB algorithm reduces the number of boxes processed
by an average of 98% when compared with the BBBC algorithm. It is worth
mentioning that the nonlinear optimization problem structure at each sampling
instant of an NMPC scheme for a CSTR system remains the same. This fact
is well-exploited by the BBTB algorithm which consistently processes a similar
number of boxes (i.e. 6) at each NMPC iteration. This clearly demonstrates
the efficacy of the box trim operator used in the BBTB algorithm. Figure 3b
plots the cost function values of the nonlinear optimization problems solved at
each sampling instant of the BBTB algorithm based NMPC scheme and the
suboptimal NMPC scheme. It is observed that the cost function values obtained
are nearly identical for both the NMPC schemes. However, it is observed that
at each setpoint change (introduced at samples 0, 50, 100, 150, and 200), the
BBTB algorithm based NMPC scheme returned a lower cost function value
when compared with the suboptimal NMPC scheme. This can be attributed to
the smoother transitions (visible in Figs. 1a, b and 2a) observed while using the
BBTB algorithm based NMPC scheme.

[Figure 1: two panels plotting (a) Concentration (C_A) and (b) Reactor Temperature (T) against Samples, each comparing Global NMPC (Bernstein algorithm) with Sub-optimal NMPC (fmincon).]

Fig. 1. Evolution of the states C_A (a) and T (b) when the CSTR system is controlled using the BBTB algorithm based NMPC scheme and the sequential-quadratic programming based suboptimal NMPC scheme.

[Figure 2: panel (a) plots the Coolant stream temperature (Tc) against Samples for Global NMPC (Bernstein algorithm) and Sub-optimal NMPC (fmincon); panel (b) plots the computation time per sample (sec) against Samples for BBBC (without the box trim operator) and BBTB (with the box trim operator).]

Fig. 2. (a) Control input (Tc) profile for the CSTR system controlled using the BBTB algorithm based NMPC scheme and the sequential-quadratic programming based suboptimal NMPC scheme. (b) Comparison of the computation times needed for solving a nonlinear optimization problem of the form (2a)–(2g) at each sampling instant using the BBTB algorithm and BBBC algorithm based NMPC schemes. The sampling time is 0.5 s.

[Figure 3: panel (a) plots the number of boxes processed per NMPC iteration against Samples for BBBC (without the box trim operator) and BBTB (with the box trim operator); panel (b) plots the NLP cost values at each NMPC iteration against Samples for Global NMPC (Bernstein) and Sub-optimal NMPC (fmincon), with the setpoint changes marked.]

Fig. 3. (a) Number of boxes processed during the branch-and-bound process of the BBBC and BBTB algorithms. (b) Cost function values of the nonlinear optimization problems of the form (2a)–(2g) solved at each sampling instant when the CSTR system is controlled using the BBTB algorithm based NMPC scheme and the sequential-quadratic programming based suboptimal NMPC scheme. SP1, SP2, SP3, and SP4 show the samples at which the setpoint changes are implemented.

4 Conclusions
This work presented a global optimization algorithm based NMPC scheme for
nonlinear systems. We first discussed the necessity of using a global optimization
algorithm based NMPC scheme. Subsequently, we proposed an improvement in
the Bernstein global optimization algorithm. The proposed improvement was a
Newton method based box trim operator which utilized some nice geometrical
properties associated with the Bernstein form of polynomials. In practice, this
operator reduced the computation times for the online nonlinear optimization
problems encountered in an NMPC scheme. The BBTB algorithm based
NMPC scheme was tested on a CSTR system to demonstrate its efficacy. The
results of the case studies performed on the CSTR system demonstrated the
superior control performance of the BBTB algorithm based NMPC scheme when
compared with a conventional sequential-quadratic programming based subop-
timal NMPC scheme. The case studies also showed that the performance of the
Bernstein global optimization algorithm based NMPC scheme can be improved

significantly in terms of the computation time by including the Newton based


box trim operator described in this paper. This was found to be particularly true
when compared against the previously reported Bernstein algorithm based NMPC
scheme from the literature.

References
1. Patil, B.V., Bhartiya, S., Nataraj, P.S.V., Nandola, N.N.: Multiple-model based
predictive control of nonlinear hybrid systems based on global optimization using
the Bernstein polynomial approach. J. Process Control 22(2), 423–435 (2012)
2. Cizniar, M., Fikar, M., Latifi, M.A.: Design of constrained nonlinear model pre-
dictive control based on global optimisation. In: 18th European Symposium on
Computer Aided Process Engineering-ESCAPE 18, pp. 1–6 (2008)
3. Doyle, J.C., Francis, B.A., Tannenbaum, A.R.: Feedback Control Theory. Dover
Publications, USA (2009)
4. Germin Nisha, M., Pillai, G.N.: Nonlinear model predictive control with relevance
vector regression and particle swarm optimization. J. Control. Theory Appl. 11(4),
563–569 (2013)
5. Grüne, L., Pannek, J.: Nonlinear Model Predictive Control, pp. 43–66. Springer,
London (2011)
6. Hansen, E.R., Walster, G.W.: Global Optimization Using Interval Analysis, 2nd
edn. Marcel Dekker, New York (2005)
7. Wolf, I.J., Marquardt, W.: Fast NMPC schemes for regulatory and economic
NMPC – a review. J. Process Control 44, 162–183 (2016)
8. Lenhart, S., Workman, J.T.: Optimal Control Applied to Biological Models. CRC
Press, USA (2007)
9. Long, C., Polisetty, P., Gatzke, E.: Nonlinear model predictive control using deter-
ministic global optimization. J. Process Control 16(6), 635–643 (2006)
10. Åström, K.J., Wittenmark, B.: Computer-Controlled Systems: Theory and Design,
3rd edn. Dover Publications, USA (2011)
11. Patil, B.V., Maciejowski, J., Ling, K.V.: Nonlinear model predictive control based
on Bernstein global optimization with application to a nonlinear CSTR. In: IEEE
Proceedings of 15th Annual European Control Conference, pp. 471–476. Aalborg,
Denmark (2016)
12. Ratschek, H., Rokne, J.: New Computer Methods for Global Optimization. Ellis
Horwood Publishers, Chichester, England (1988)
13. Rawlings, J.B., Mayne, D.Q., Diehl, M.M.: Model Predictive Control: Theory,
Computation, and Design, 2nd edn. Nob Hill Publishing, USA (2017)
14. Stahl, V.: Interval methods for bounding the range of polynomials and solving
systems of nonlinear equations. Ph.D. thesis, Johannes Kepler University, Linz
(1995)
Towards the Biconjugate of Bivariate
Piecewise Quadratic Functions

Deepak Kumar and Yves Lucet(B)

University of British Columbia Okanagan, 3187, University Way,


Kelowna, BC V1V 1V7, Canada
yves.lucet@ubc.ca
https://people.ok.ubc.ca/ylucet/

Abstract. Computing the closed convex envelope or biconjugate is the


core operation that bridges the domain of nonconvex with convex anal-
ysis. We focus here on computing the conjugate of a bivariate piecewise
quadratic function defined over a polytope. First, we compute the convex
envelope of each piece, which is characterized by a polyhedral subdivision
such that over each member of the subdivision, it has a rational form
(square of a linear function over a linear function). Then we compute the
conjugate of all such rational functions. It is observed that the conjugate
has a parabolic subdivision such that over each member of its subdivi-
sion, it has a fractional form (linear function over square root of a linear
function). This computation of the conjugate is performed with a worst-
case linear time complexity algorithm. Our results are an important step
toward computing the conjugate of a piecewise quadratic function, and
further in obtaining explicit formulas for the convex envelope of piecewise
rational functions.

Keywords: Conjugate · Convex envelope ·


Piecewise quadratic function

1 Introduction

Computational convex analysis (CCA) focuses on creating efficient algorithms


to compute fundamental transforms arising in the field of convex analysis. Com-
puting the convex envelope or biconjugate is the core operation that bridges the
domain of nonconvex analysis with convex analysis. Development of most of the
algorithms in CCA began with the Fast Legendre Transform (FLT) in [5], which
was further developed in [6,18], and improved to the optimal linear worst-case
time complexity in [19] and then [10,20]. More complex operators were then
considered [3,4,22] (see [21] for a survey including a list of applications).
Piecewise Linear Quadratic (PLQ) functions (piecewise quadratic functions
over a polyhedral partition) are well-known in the field of convex analysis [24]

Supported by NSERC, CFI.



with the existence of linear time algorithms for various convex transforms [4,22].
Computing the full graph of the convex hull of univariate PLQ functions is
possible in optimal linear worst-case time complexity [9].
For a function f defined over a region P , the pointwise supremum of
all its convex underestimators is called the convex envelope and is denoted
convfP (x, y). Computing the convex envelope of a multilinear function over
a unit hypercube is NP-Hard [7]. However, the convex envelope of functions
defined over a polytope P and restricted by the vertices of P can be computed
in finite time using a linear program [26,27]. A method to reduce the computa-
tion of convex envelope of functions that are one lower dimension(Rn−1 ) convex
and have indefinite Hessian to optimization problems in lower dimensions is
discussed in [14].
Any general bivariate nonconvex quadratic function can be linearly trans-
formed to the sum of bilinear and a linear function. Convex envelopes for bilinear
functions over rectangles have been discussed in [23] and validated in [1]. The
convex envelope over special polytopes (not containing edges with finite positive
slope) was derived in [25] while [15] deals with bilinear functions over a triangle
containing exactly one edge with finite positive slope. The convex envelope over
general triangles and triangulation of the polytopes through doubly nonnegative
matrices (both semidefinite and nonnegative) is presented in [2].
In [16], it is shown that the analytical form of the convex envelope of some
bivariate functions defined over polytopes can be computed by solving a con-
tinuously differentiable convex problem. In that case, the convex envelope is
characterized by a polyhedral subdivision.
The Fenchel conjugate $f^*(s) = \sup_{x\in\mathbb{R}^n}[\langle s, x\rangle - f(x)]$ (we note $\langle s, x\rangle = s^T x$)
of a function f : Rⁿ → R ∪ {+∞} is also known as the Legendre-Fenchel transform
or convex conjugate or simply conjugate. It plays a significant role in duality
and computing it is a key step in solving the dual optimization problem [24].
Most notably, the biconjugate is also the closed convex envelope.
A method to compute the conjugate known as the fast Legendre transform
was introduced in [5] and studied in [6,18]. A linear time algorithm was later
introduced by Lucet to compute the discrete Legendre transform [19]. Those
algorithms are numeric and do not provide symbolic expressions.
Computation of the conjugate of convex univariate PLQ functions have been
well studied in the literature and linear time algorithms have been developed
in [8,11]. Recently, a linear time algorithm to compute the conjugate of convex
bivariate PLQ functions was proposed in [12].
Let f : Rn → R ∪ {+∞} be a piecewise function, i.e. f (x) = fi (x) if x ∈ Pi
for i = 1, . . . , N . From [13, Theorem 2.4.1], we have (inf i fi )∗ = supi fi∗ , and
from [13, Proposition 2.6.1], conv(inf i (fi + IPi )) = conv(inf i [conv(fi + IPi )])
where IPi is the indicator function for Pi . Hence, conv(inf i (fi + IPi )) =
(supi [conv(fi + IPi )]∗ )∗ . This provides an algorithm to compute the closed con-
vex envelope: (1) compute the convex envelope of each piece, (2) compute the
conjugate of the convex envelope of each piece, (3) compute the maximum of all

the conjugates, and (4) compute the conjugate of the function obtained in (3)
to obtain the biconjugate. The present work focuses on Step (2).
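The identity behind this four-step algorithm can be illustrated numerically in one dimension by brute force: sample a nonconvex piecewise function on a grid and apply the discrete Legendre-Fenchel transform twice. The sketch below (ours) is only a discretized illustration of the biconjugate being the closed convex envelope, not the symbolic linear-time algorithms discussed in this paper; the test function and grid sizes are arbitrary.

```python
# Brute-force 1D illustration (ours): the biconjugate equals the closed
# convex envelope, computed here on grids via two discrete conjugations.
import numpy as np

def conjugate(x, fx, s):
    """Discrete conjugate f*(s) = max_x <s,x> - f(x) over the grid x."""
    return np.max(np.outer(s, x) - fx[None, :], axis=1)

x = np.linspace(-2.0, 2.0, 401)
f = np.where(x < 0.0, (x + 1.0) ** 2, (x - 1.0) ** 2)   # two quadratic pieces (nonconvex)
s = np.linspace(-6.0, 6.0, 801)                          # dual grid
f_conj = conjugate(x, f, s)                               # first conjugate
f_biconj = conjugate(s, f_conj, x)                        # conjugate again
# f_biconj discretizes the convex envelope: it flattens the nonconvex bump
# around x = 0 to 0 while matching f near x = -1 and x = 1.
print(float(f_biconj[200]), float(f[200]))                # ~0.0 vs 1.0 at x = 0
```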
Recall that given a quadratic function over a polytope, the eigenvalues of
its symmetric matrix determine how difficult its convex envelope is to compute
(for computational purposes, we can ignore the affine part of the function). If
the matrix is semi-definite (positive or negative), the convex envelope is easily
computed. When it is indefinite, a change of coordinate reduces the problem
to finding the convex envelope of the function (x, y) → xy over a polytope, for
which step (1) is known [17].
The paper is organized as follows. Section 3 focuses on the domain of the
conjugate while Sect. 4 determines the symbolic expressions. Section 5 concludes
the paper with future work.

2 Preliminaries and Notations


The subdifferential ∂f(x) of a function f : Rⁿ → R ∪ {+∞} at any x ∈ dom(f) =
{x : f(x) < ∞} is ∂f(x) = {s : f(y) ≥ f(x) + ⟨s, y − x⟩, ∀y ∈ dom(f)}
(∂f(x) = {∇f(x)} when f is differentiable at x). We note I_P the indicator
function of the set P, i.e. I_P(x) = 0 when x ∈ P and I_P(x) = +∞ when x ∉ P.
A parabola is a two-dimensional planar curve whose points (x, y) satisfy the
equation ax² + bxy + cy² + dx + ey + f = 0 with b² − 4ac = 0. A parabolic
region is formed by the intersection of a finite number of parabolic inequalities,
i.e. P_r = {x ∈ R² : C_{p_i}(x) ≤ 0, i ∈ {1, · · · , k}}, where C_{p_i}(x) = a_i x₁² + b_i x₁x₂ +
c_i x₂² + d_i x₁ + e_i x₂ + f_i and b_i² − 4a_i c_i = 0. The set P_{r_i} = {x ∈ R² : C_{p_i}(x) ≤ 0}
is convex, but P_{s_i} = {x ∈ R² : C_{p_i}(x) ≥ 0} is not.
A convex set $R = \bigcup_{i\in\{1,\ldots,m\}} R_i$, R ⊆ R², defined as the union of a finite
number of parabolic regions is said to have a parabolic subdivision if, for any
j, k ∈ {1, · · · , m} with j ≠ k, $R_j \cap R_k$ is either empty or is contained in a parabola.

3 The Domain of the Conjugate

Given a nonconvex PLQ function, we first compute the closed convex envelope
of each piece and obtain a piecewise rational function [17]. We now compute
the conjugate of such a rational function over a polytope by first computing its
domain, which will turn out to be a parabolic subdivision. Recall that for PLQ
functions, dom f ∗ = ∂f (dom f ). We decompose the polytope dom f = P into
its interior, its vertexes, and its edges.
Following [17], we write a rational function as

$$r(x, y) = \frac{(\xi_1(x, y))^2}{\xi_2(x, y)} + \xi_0(x, y), \qquad (1)$$

where ξi (x, y) are linear functions in x and y.



Proposition 1 (Interior). Consider r defined by (1); there exist α_ij such
that $\bigcup_{x\in \mathrm{dom}(r)} \partial r(x) = \{s : C_r(s) = 0\}$, where $C_r(s) = \alpha_{11}s_1^2 + \alpha_{12}s_1s_2 +
\alpha_{22}s_2^2 + \alpha_{10}s_1 + \alpha_{02}s_2 + \alpha_{00}$ and {s : C_r(s) = 0} is a parabolic curve.

Proof. Note ξ₁(x) = ξ₁₁x₁ + ξ₁₂x₂ + ξ₁₀, ξ₂(x) = ξ₂₁x₁ + ξ₂₂x₂ + ξ₂₀ and ξ₀(x) =
ξ₀₁x₁ + ξ₀₂x₂ + ξ₀₀. Since r is differentiable everywhere in dom(r) = R² \ {z :
ξ₂(z) = 0}, for any x ∈ dom(r) we compute s = ∇r(x) as sᵢ = 2ξ₁ᵢt − ξ₂ᵢt² + ξ₀ᵢ
for i = 1, 2, where t = (ξ₁₁x₁ + ξ₁₂x₂ + ξ₁₀)/(ξ₂₁x₁ + ξ₂₂x₂ + ξ₂₀). Hence, s =
∇r(x) represents the parametric equation of a conic section, and by eliminating
t, we get C_r(s) = 0 where

$$C_r(s) = \alpha_{11}s_1^2 + \alpha_{12}s_1s_2 + \alpha_{22}s_2^2 + \alpha_{10}s_1 + \alpha_{02}s_2 + \alpha_{00},$$

with $\alpha_{11} = \xi_{21}^2\xi_{22}^2$, $\alpha_{12} = -2\xi_{21}^3\xi_{22}$, $\alpha_{22} = \xi_{21}^4$, and the other α_ij are functions of the
coefficients of r. We check that $\alpha_{12}^2 - 4\alpha_{11}\alpha_{22} = (-2\xi_{21}^3\xi_{22})^2 - 4\xi_{21}^6\xi_{22}^2 = 0$, so the
conic section is a parabola. Consequently, for all x ∈ dom(r), ∂r(x) is contained
in the parabolic curve C_r(s) = 0, i.e.

$$\bigcup_{x\in \mathrm{dom}(r)} \partial r(x) \subset \{s : C_r(s) = 0\}.$$

Conversely, any point sr that satisfies Cr (sr ) = 0, satisfies the parametric


equation as well, so the converse inclusion is true.

Corollary 1 (Interior). For a bivariate rational function r and a polytope
P, define f(x) = r(x) + I_P(x); then the set $\bigcup_{x\in \mathrm{int}(P)} \partial f(x)$
is contained inside a parabolic arc.

Proof. We have $\bigcup_{x\in \mathrm{int}(P)} \partial f(x) \subseteq \bigcup_{x\in \mathrm{int}(P)} \partial r(x)$ and $\bigcup_{x\in \mathrm{int}(P)} \partial r(x) \subset \mathcal{P}$,
where $\mathcal{P} \subset \mathbb{R}^2$ is a parabolic curve (from Proposition 1). Since P is connected,
we obtain that $\bigcup_{x\in \mathrm{int}(P)} \partial r(x)$ is contained in a parabolic arc.

Next we compute the subdifferential at any vertex in the smooth case (the
proof involves a straightforward computation of the normal cone).
Lemma 1 (Vertices). For g ∈ C 1 , P a polytope, and v vertex. Let f (x) =
g(x) + IP (x). Then ∂f (v) is an unbounded polyhedral set.
There is one vertex at which both numerator and denominator equal zero
although the rational function can be extended by continuity over the polytope;
we conjecture the result based on numerous observations.

Conjecture 1 (Vertex). Let r as in (1), f (x) = r(x) + IP (x) and v be a vertex


of P with ξ1 (v) = ξ2 (v) = 0. Then ∂f (v) is a parabolic region.

Lemma 2 (Edges). For g ∈ C¹, a polytope P, and an edge E = {x : x₂ =
mx₁ + c, x₁ˡ ≤ x₁ ≤ x₁ᵘ} between vertices xˡ and xᵘ, let f(x) = g(x) + I_P(x);
then $\bigcup_{x\in \mathrm{ri}(E)} \partial f(x) = \bigcup_{x\in \mathrm{ri}(E)} \{s + \nabla g(x) : s_1 + ms_2 = 0,\ s_2 \ge 0\}$.

Proof. For all x ∈ ri(E), ∂f(x) = ∂g(x) + N_P(x). Let L(x) = x₂ − mx₁ − c be
the expression of the line joining xˡ and xᵘ such that P ⊂ {x : L(x) ≤ 0}. (The
case P ⊂ {x : L(x) ≥ 0} is analogous.)
Since P ⊂ R² is a polytope, for all x ∈ ri(E), N_P(x) = {s : s = λ∇L(x), λ ≥
0} is the normal cone of P at x and can be written N_P(x) = {s : s₁ + ms₂ =
0, s₂ ≥ 0}. In the special case when E = {x : x₁ = d, x₂ˡ ≤ x₂ ≤ x₂ᵘ},
L(x) = x₁ − d and N_P(x) = {s : s₂ = 0, s₁ ≥ 0}. Now for any x ∈ ri(E),
∂f(x) = ∂g(x) + N_P(x) = {s + ∇g(x) : s₁ + ms₂ = 0, s₂ ≥ 0}, so

$$\bigcup_{x\in \mathrm{ri}(E)} \partial f(x) = \bigcup_{x\in \mathrm{ri}(E)} \{s + \nabla g(x) : s_1 + ms_2 = 0,\ s_2 \ge 0\}.$$

Proposition 2 (Edges). For r as in (1), a polytope P and an edge E =
{x : x₂ = mx₁ + c, v₁⁻ ≤ x₁ ≤ v₁⁺} between vertices v⁻ and v⁺, let f(x) =
r(x) + I_P(x); then $\bigcup_{x\in \mathrm{ri}(E)} \partial f(x)$ is either a parabolic region or a ray.



Proof. From Corollary 1, there exist l, u ∈ R² such that $\bigcup_{x\in \mathrm{ri}(E)} \partial r(x) =
\bigcup_{x\in \mathrm{ri}(E)} \{s : C_r(s) = 0,\ l_1 \le s_1 \le u_1\}$. So computing $\bigcup_{x\in \mathrm{ri}(E)} \partial f(x)$ leads to
the following two cases:
Case 1 (l = u): Same case as when r is quadratic (known result).
Case 2 (l ≠ u): By setting g = r in Lemma 2, for any x ∈ ri(E), ∂f(x) = {s +
∇r(x) : s₁ + ms₂ = 0, s₂ ≥ 0}. Similar to the quadratic case, when ∇r(x) = l, ∂f(x) =
{s : s₁ + ms₂ − (l₁ + ml₂) = 0, s₂ ≥ l₂}, and when ∇r(x) = u, ∂f(x) = {s :
s₁ + ms₂ − (u₁ + mu₂) = 0, s₂ ≥ u₂}. Assume ∂f(x) ⊂ {s : C_r(s) ≤ 0} (the case
∂f(x) ⊂ {s : C_r(s) ≥ 0} is analogous). Then

$$\bigcup_{x\in \mathrm{ri}(E)} \partial f(x) = \bigcup_{x\in \mathrm{ri}(E)} \{s + \nabla r(x) : s_1 + ms_2 = 0,\ s_2 \ge 0\}
= \{s : l_1 + ml_2 \le s_1 + ms_2 \le u_1 + mu_2,\ C_r(s) \le 0\}$$

is a parabolic region.

By gathering Lemma 1, Proposition 2, and Corollary 1, we obtain:

Theorem 1 (Parabolic domain). Assume Conjecture 1 holds, r is as in
(1), P is a polytope, and f(x) = r(x) + I_P(x). Then $\bigcup_{x\in P} \partial f(x)$ has a parabolic
subdivision.

Example 1. For $r = \dfrac{36x_1^2 + 21x_1x_2 + 36x_2^2 - 81x_1 + 24x_2 - 252}{-12x_1 + 9x_2 + 75}$ and the polytope
P formed by vertices v₁ = (−1, 1), v₂ = (−3, −3) and v₃ = (−4, −3), let f(x) =
r(x) + I_P(x). We have $\bigcup_{x\in \mathrm{dom}(r)} \partial r(x) = \{s : C(s) = 0\}$, where C(s) = 9s₁² +
24s₁s₂ − 234s₁ + 16s₂² + 200s₂ − 527. The parabolic subdivision for this example
is shown in Fig. 1.

[Figure 1: the parabolic subdivision in the dual plane, with regions labelled ∂f(v₁), ∂f(v₂), ∂f(v₃), ⋃_{x∈ri(E₁₂)} ∂f(x) and ⋃_{x∈ri(E₁₃)} ∂f(x).]

Fig. 1. Parabolic subdivision for r and P from Example 1

4 Conjugate Expressions
Now that we know dom f ∗ as a parabolic subdivision, we turn to the computa-
tion of its expression on each piece. We note
$$g_f(s_1, s_2) = \frac{\psi_1(s_1, s_2)}{\zeta_{00}\sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2) \qquad (2)$$
$$g_q(s_1, s_2) = \zeta_{11}s_1^2 + \zeta_{12}s_1s_2 + \zeta_{22}s_2^2 + \zeta_{10}s_1 + \zeta_{01}s_2 + \zeta_{00} \qquad (3)$$
$$g_l(s_1, s_2) = \zeta_{10}s_1 + \zeta_{01}s_2 + \zeta_{00} \qquad (4)$$

where ψ0 , ψ1/2 and ψ1 are linear functions in s, and ζij ∈ R.


Theorem 2. Assume Conjecture 1 holds. For r as in (1), a polytope P , and
f (x) = r(x) + IP (x), the conjugate f ∗ (s) has a parabolic subdivision such that
over each member of its subdivision it has one of the forms in (2)–(4)

Proof. We compute the critical points for the optimization problem defining f ∗ .
Case 1 (Vertices) For any vertex v, f ∗ (s) = s1 v1 + s2 v2 − r(v) is a linear
function of form (4) defined over an unbounded polyhedral set (from Lemma 1).
In the special case, when ∂f (v) is a parabolic region (Conjecture 1), the conjugate
would again be a linear function but defined over a parabolic region.

Case 2 (Edges). Let ℱ be the set of all the edges, and E = {x : x₂ =
mx₁ + c, l₁ ≤ x₁ ≤ u₁} ∈ ℱ be an edge between vertices l and u; then f*(s) =
sup_{x∈ri(E)} {⟨s, x⟩ − (r(x) + I_P(x))}. By computing the critical points, we have
s − (∇r(x) + N_P(x)) = 0, where N_P(x) = {s : s = λ(−m, 1), λ ≥ 0} with m the
slope of the edge. So

$$s_1 = -\xi_{21}t^2 + 2\xi_{11}t + \xi_{01} - m\lambda, \qquad s_2 = -\xi_{22}t^2 + 2\xi_{12}t + \xi_{02} + \lambda, \qquad (5)$$

where $t = \dfrac{\xi_{11}x_1 + \xi_{12}x_2 + \xi_{10}}{\xi_{21}x_1 + \xi_{22}x_2 + \xi_{20}}$. Since x ∈ ri(E), we have

$$x_2 = mx_1 + c, \qquad (6)$$

which with (5) gives

$$x_1 = \begin{cases} \gamma_{10}s_1 + \gamma_{01}s_2 + \gamma_{00} & \text{when } \xi_{21} + m\xi_{22} = 0,\\[4pt] \gamma_{00} \pm \gamma_{1/2}\sqrt{\gamma_{10/2}s_1 + \gamma_{01/2}s_2 + \gamma_{00/2}} \pm \dfrac{\gamma_{-1/2}}{\sqrt{\gamma_{10/2}s_1 + \gamma_{01/2}s_2 + \gamma_{00/2}}} & \text{otherwise,} \end{cases} \qquad (7)$$

where all γ_ij and γ_{ij/k} are defined in the coefficients of r and the parameters m and
c. When ξ₂₁ + mξ₂₂ ≠ 0, solving (5) and (6) leads to a quadratic equation in t
with coefficients that are linear functions in s.
By substituting (7) and (6) in f*(s), when ξ₂₁ + mξ₂₂ ≠ 0 we have

$$f^*(s) = \frac{\psi_1(s_1, s_2)}{\zeta_{00}\sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2),$$

and when ξ₂₁ + mξ₂₂ = 0,

$$f^*(s) = \zeta_{11}s_1^2 + \zeta_{12}s_1s_2 + \zeta_{22}s_2^2 + \zeta_{10}s_1 + \zeta_{01}s_2 + \zeta_{00},$$

where all ζ_ij, ψ_i and ψ_{i/j} are defined in the coefficients of r and the parameters m
and c, with ψ_i(s) and ψ_{i/j}(s) linear functions in s.
From Proposition 2, $\bigcup_{x\in \mathrm{ri}(E)} \partial f(x)$ is either a parabolic region or a ray.
So for any E, the conjugate is a fractional function of form (2) defined over a
parabolic region. When $\bigcup_{x\in \mathrm{ri}(E)} \partial f(x)$ is a ray, the computation of the conjugate
is deduced from its neighbours by continuity.
Case 3 (Interior). Since $\bigcup_{x\in \mathrm{int}(P)} \partial f(x)$ is contained in a parabolic arc
(from Corollary 1), the computation of the conjugate is deduced by continuity.

Example 2. For the bivariate rational function $r(x) = \dfrac{x_2^2}{x_2 - x_1 + 1}$ defined over
the polytope P with vertices v₁ = (1, 1), v₂ = (1, 0) and v₃ = (0, 0), let f(x) =
r(x) + I_P(x).

The conjugate (shown in Fig. 2) can be written

$$f_P^*(s) = \begin{cases} s_1 + s_2 - 1 & s \in R_1\\ s_1 & s \in R_2\\ s_1 & s \in R_3\\ 0 & s \in R_4\\ \tfrac{1}{4}(s_1 + s_2)^2 & s \in R_5 \end{cases}$$

where

R₁ = {s : s₂ ≥ −s₁ + 2, s₂ ≥ 1}
R₂ = {s : s₂ ≥ s₁, s₁² + 2s₁s₂ − 4s₁ + s₂² ≤ 0}
R₃ = {s : s₂ ≤ s₁, s₂ ≤ 1, s₁ ≥ 0}
R₄ = {s : s₁ ≤ 0, s₂ ≤ −s₁}
R₅ = {s : s₂ ≥ −s₁, s₂ ≤ −s₁ + 2, s₂ ≥ s₁, s₁² + 2s₁s₂ − 4s₁ + s₂² ≥ 0}.

Fig. 2. Conjugate for Example 2
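The closed form above can be checked numerically by approximating f*(s) = sup_{x∈P} ⟨s, x⟩ − r(x) with a dense sampling of the triangle; the short script below (ours) does so at one dual point per region, with r extended by continuity at the problematic vertex (1, 0). The grid resolution and test points are arbitrary.

```python
# Numerical spot-check (ours) of the piecewise conjugate of Example 2.
import numpy as np

def f_star_sampled(s1, s2, m=400):
    best = -np.inf
    for i in range(m + 1):
        for j in range(i + 1):               # triangle (0,0),(1,0),(1,1): 0 <= x2 <= x1 <= 1
            x1, x2 = i / m, j / m
            denom = x2 - x1 + 1.0
            rx = 0.0 if denom == 0.0 else x2 ** 2 / denom   # r extended by continuity at (1,0)
            best = max(best, s1 * x1 + s2 * x2 - rx)
    return best

for s, closed in [((-1.0, 4.0), -1.0 + 4.0 - 1.0),      # s in R1: s1 + s2 - 1
                  ((0.8, 0.9), 0.8),                     # s in R2: s1
                  ((-2.0, 3.0), 0.25 * (-2.0 + 3.0) ** 2),  # s in R5: (s1+s2)^2 / 4
                  ((-1.0, -1.0), 0.0)]:                  # s in R4: 0
    print(s, round(f_star_sampled(*s), 3), closed)
```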

5 Conclusion and Future Work


Figure 3 summarizes the strategy. Given a PLQ function, for each piece, its
convex envelope is computed as the convex envelope of a quadratic function over
a polytope using [16]. This is the most time consuming operation since the known
algorithms are at least exponential. For each piece, we obtain a piecewise rational
function. Then we take each of those pieces, and compute its conjugate to obtain

a fractional function over a parabolic subdivision. That computation is complete


except for Conjecture 1. Note that there is only a single problematic vertex v
and, since the conjugate has full domain, we can deduce ∂f(v) by elimination.
Future work will focus on Step 3, which will give the conjugate of the original
PLQ function. This will involve solving repeatedly the map overlay problem and
is likely to take exponential time. From the hundreds of examples we ran, we expect
the result to be a fractional function of unknown kind over a parabolic subdivision;
see Fig. 3, bottom row, middle figure. The final step will be to compute
the biconjugate (bottom-left in Fig. 3). We know it is a piecewise function over a
polyhedral subdivision but do not know the formulas.

[Figure 3: flow diagram of the strategy — Step 1: convex envelope of each quadratic piece (Q_i, P_i) → (r_i, P_i) via [Loc16]; Step 2: conjugate of each (r_i, P_i) → (f_rj, P_rj) in the dual domain; Step 3: maximum of the conjugates; Step 4: conjugate of that maximum, giving the biconjugate (?, P_i).]

Fig. 3. Summary

References
1. Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math.
Oper. Res. 8(2), 273–286 (1983)
2. Anstreicher, K.M.: On convex relaxations for quadratically constrained quadratic
programming. Math. Program. 136(2), 233–251 (2012)
3. Bauschke, H.H., Goebel, R., Lucet, Y., Wang, X.: The proximal average: basic
theory. SIAM J. Optim. 19(2), 766–785 (2008)
4. Bauschke, H.H., Lucet, Y., Trienis, M.: How to transform one convex function
continuously into another. SIAM Rev. 50(1), 115–132 (2008)
5. Brenier, Y.: Un algorithme rapide pour le calcul de transformées de Legendre-
Fenchel discretes. Comptes rendus de l’Académie des sciences. Série 1,
Mathématique 308(20), 587–589 (1989)
6. Corrias, L.: Fast Legendre-Fenchel transform and applications to Hamilton-Jacobi
equations and conservation laws. SIAM J. Numer. Anal. 33(4), 1534–1558 (1996)
7. Crama, Y.: Recognition problems for special classes of polynomials in 0–1 variables.
Math. Program. 44(1–3), 139–155 (1989)

8. Gardiner, B., Jakee, K., Lucet, Y.: Computing the partial conjugate of convex
piecewise linear-quadratic bivariate functions. Comput. Optim. Appl. 58(1), 249–
272 (2014)
9. Gardiner, B., Lucet, Y.: Convex hull algorithms for piecewise linear-quadratic func-
tions in computational convex analysis. Set-Valued Var. Anal. 18(3–4), 467–482
(2010)
10. Gardiner, B., Lucet, Y.: Graph-matrix calculus for computational convex analysis.
In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp.
243–259. Springer (2011)
11. Gardiner, B., Lucet, Y.: Computing the conjugate of convex piecewise linear-
quadratic bivariate functions. Math. Program. 139(1–2), 161–184 (2013)
12. Haque, T., Lucet, Y.: A linear-time algorithm to compute the conjugate of convex
piecewise linear-quadratic bivariate functions. Comput. Optim. Appl. 70(2), 593–
613 (2018)
13. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex analysis and minimization algorithms
II: Advanced Theory and Bundle Methods. Springer Science & Business Media
(1993)
14. Jach, M., Michaels, D., Weismantel, R.: The convex envelope of (n-1)-convex func-
tions. SIAM J. Optim. 19(3), 1451–1466 (2008)
15. Linderoth, J.: A simplicial branch-and-bound algorithm for solving quadratically
constrained quadratic programs. Math. Program. 103(2), 251–282 (2005)
16. Locatelli, M.: A technique to derive the analytical form of convex envelopes for
some bivariate functions. J. Glob. Optim. 59(2–3), 477–501 (2014)
17. Locatelli, M.: Polyhedral subdivisions and functional forms for the convex
envelopes of bilinear, fractional and other bivariate functions over general poly-
topes. J. Glob. Optim. 66(4), 629–668 (2016)
18. Lucet, Y.: A fast computational algorithm for the Legendre-Fenchel transform.
Comput. Optim. Appl. 6(1), 27–57 (1996)
19. Lucet, Y.: Faster than the fast Legendre transform, the linear-time Legendre trans-
form. Numer. Algorithms 16(2), 171–185 (1997)
20. Lucet, Y.: Fast Moreau envelope computation I: numerical algorithms. Numer.
Algorithms 43(3), 235–249 (2006)
21. Lucet, Y.: What shape is your conjugate? A survey of computational convex anal-
ysis and its applications. SIAM Rev. 52(3), 505–542 (2010)
22. Lucet, Y., Bauschke, H.H., Trienis, M.: The piecewise linear-quadratic model for
computational convex analysis. Comput. Optim. Appl. 43(1), 95–118 (2009)
23. McCormick, G.P.: Computability of global solutions to factorable nonconvex pro-
grams: Part I – Convex underestimating problems. Math. Program. 10(1), 147–175
(1976)
24. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer Science &
Business Media (1998)
25. Sherali, H.D., Alameddine, A.: An explicit characterization of the convex envelope
of a bivariate bilinear function over special polytopes. Ann. Oper. Res. 25(1),
197–209 (1990)
26. Tardella, F.: On the existence of polyhedral convex envelopes. In: Frontiers in
Global Optimization, pp. 563–573. Springer (2004)
27. Tardella, F.: Existence and sum decomposition of vertex polyhedral convex
envelopes. Optim. Lett. 2(3), 363–375 (2008)
Tractable Relaxations for the Cubic
One-Spherical Optimization Problem

Christoph Buchheim¹, Marcia Fampa²(B), and Orlando Sarmiento²

¹ Technische Universität Dortmund, Dortmund, Germany
christoph.buchheim@math.tu-dortmund.de
² Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
{fampa,osarmiento}@cos.ufrj.br

Abstract. We consider the cubic one-spherical optimization problem,


consisting in minimizing a homogeneous cubic function over the unit
sphere. We propose different lower bounds that can be computed effi-
ciently, using decompositions of the objective function and well-known
results for the corresponding quadratic problem variant.

Keywords: Cubic one-spherical optimization problem ·


Best rank-1 tensor approximation · Trust region subproblem ·
Convex relaxation

1 Introduction
The cubic one-spherical optimization problem has the following form:

$$\mathrm{CSP}: \quad \min_{x \in \mathbb{R}^n} \; f(x) := Ax^3 = \sum_{i,j,k=1}^{n} a_{ijk}\, x_i x_j x_k \quad \text{s.t.} \; \|x\| = 1,$$

where n ≥ 2 and A is a third-order (n×n×n)-dimensional real symmetric tensor.


Tensor A is symmetric in the sense that its element aijk is invariant under any
permutation of the indices (i, j, k). As shown by Zhang et al. [9], using a result
by Nesterov [6], Problem CSP is NP-hard. This is in contrast to the quadratic
variant of CSP, which is the well-known trust region subproblem. In spite of the
non-convexity, the latter quadratic problem can be solved efficiently and has a
concave dual problem with no duality gap [8]. We will use the latter results in
one of the approaches presented in this paper.
Applications for the cubic one-spherical optimization problem can be found
in signal processing, where a discrete multidimensional signal is treated as a
tensor, and the low-rank approximation of the tensor is used to approximate the
signal. The CSP, for n = 3, is also used to formulate magnetic resonance signals
in biological tissues. Some of these applications are described in [9] and in the
references therein [2–5,7,8].

Our main purpose in this work is to develop new and efficient relaxations for
problem CSP. For that, we propose different approaches, described in Sect. 3. In
Sect. 4, we present preliminary numerical results concerning the quality of the
resulting lower bounds and the computational effort to compute them, for small
instances from the literature as well as larger randomly generated instances on
up to n = 200 variables.

2 Notation and Preliminaries


We denote by S³(Rⁿ) the space of third-order (n × n × n)-dimensional real
symmetric tensors. Given the symmetric tensor A ∈ S³(Rⁿ) appearing in Problem
CSP and some ℓ ∈ {1, . . . , n}, we define A_ℓ as the symmetric matrix composed
of the elements a_{ℓjk} of A, for all j, k = 1, . . . , n. Moreover, by Ã_ℓ we
denote the submatrix of A_ℓ where the ℓ-th row and column are eliminated.
For example, for n = 3, we have
$$A := \begin{pmatrix} a_{111} & a_{112} & a_{113} & a_{211} & a_{212} & a_{213} & a_{311} & a_{312} & a_{313}\\ a_{121} & a_{122} & a_{123} & a_{221} & a_{222} & a_{223} & a_{321} & a_{322} & a_{323}\\ a_{131} & a_{132} & a_{133} & a_{231} & a_{232} & a_{233} & a_{331} & a_{332} & a_{333} \end{pmatrix} = (A_1, A_2, A_3)$$

with

$$A_1 = \begin{pmatrix} a_{111} & a_{112} & a_{113}\\ a_{121} & a_{122} & a_{123}\\ a_{131} & a_{132} & a_{133} \end{pmatrix}, \quad A_2 = \begin{pmatrix} a_{211} & a_{212} & a_{213}\\ a_{221} & a_{222} & a_{223}\\ a_{231} & a_{232} & a_{233} \end{pmatrix}, \quad A_3 = \begin{pmatrix} a_{311} & a_{312} & a_{313}\\ a_{321} & a_{322} & a_{323}\\ a_{331} & a_{332} & a_{333} \end{pmatrix},$$

and

$$\tilde{A}_1 := \begin{pmatrix} a_{122} & a_{123}\\ a_{132} & a_{133} \end{pmatrix}, \quad \tilde{A}_2 := \begin{pmatrix} a_{211} & a_{213}\\ a_{231} & a_{233} \end{pmatrix}, \quad \tilde{A}_3 := \begin{pmatrix} a_{311} & a_{312}\\ a_{321} & a_{322} \end{pmatrix}.$$
In the following, for a symmetric real matrix X, we will denote by λ_min(X) the
smallest eigenvalue of X. Given a vector x ∈ Rⁿ and ℓ ∈ {1, . . . , n}, we define the
vector $x_{\hat{\ell}} \in \mathbb{R}^{n-1}$ as $x_{\hat{\ell}} := (x_1, \ldots, x_{\ell-1}, x_{\ell+1}, \ldots, x_n)$, i.e., the vector x where
the ℓ-th component is omitted.

3 Relaxations for the CSP


Our objective is to compute lower bounds for Problem CSP that can be efficiently
calculated. For this, we relax the problem in different ways. The common idea is
to decompose the sum $\sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k$ appearing in the objective function
of CSP into pieces that can be minimized over the constraint ‖x‖ = 1 efficiently.
Combining all minima then yields a lower bound for CSP.

3.1 Lower Bound by Decomposition – Approach 1

We first decompose the objective function of CSP by the first index, as follows:

$$\sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k = \sum_{i=1}^{n}\Bigg( x_i \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k + 2x_i^2 \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j + a_{iii}x_i^3 \Bigg) \qquad (1)$$

Then we conclude

$$\min_{\|x\|=1} \sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k \;\ge\; \sum_{i=1}^{n} \min_{\|x\|=1}\Bigg( x_i \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k + 2x_i^2 \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j + a_{iii}x_i^3 \Bigg). \qquad (2)$$

By a further decomposition, for each i = 1, 2, . . . , n, we have




$$\min_{\|x\|=1}\Bigg( x_i \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k + 2x_i^2 \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j + a_{iii}x_i^3 \Bigg) \;\ge\; \min_{x_i\in[-1,1]} x_i \min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k \;+\; \min_{x_i\in[-1,1]} 2x_i^2 \min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j \;+\; \min_{\|x\|=1} a_{iii}x_i^3. \qquad (3)$$

We now consider each problem on the right-hand side of (3) independently. First, note that

$$\min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k = (1 - x_i^2)\,\lambda_{\min}(\tilde{A}_i), \qquad (4)$$

using the notation of Sect. 2. Multiplying the right hand side with xi and taking
the minimum over xi ∈ [−1, 1], we obtain


$$\min_{x_i\in[-1,1]}\; x_i \min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}x_jx_k = -\frac{2\sqrt{3}}{9}\,\bigl|\lambda_{\min}(\tilde{A}_i)\bigr|. \qquad (5)$$

Moreover,

$$\min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j = -\sqrt{\sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}^2}\;\sqrt{1 - x_i^2} \qquad (6)$$

and hence

$$\min_{x_i\in[-1,1]}\; 2x_i^2 \min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}x_j = -\frac{4\sqrt{3}}{9}\sqrt{\sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}^2}. \qquad (7)$$

Finally,

$$\min_{\|x\|=1} a_{iii}x_i^3 = -|a_{iii}|. \qquad (8)$$

Adding up, from (2), (3), and (5)–(8) we obtain

$$\min_{\|x\|=1} \sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k \;\ge\; -\sum_{i=1}^{n}\left( \frac{2\sqrt{3}}{9}\bigl|\lambda_{\min}(\tilde{A}_i)\bigr| + \frac{4\sqrt{3}}{9}\sqrt{\sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}^2} + |a_{iii}| \right).$$

The time to calculate this lower bound is dominated by computing the smallest
eigenvalues of the n symmetric (n − 1) × (n − 1)-matrices Ã1 , . . . , Ãn .
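The bound is straightforward to implement. The Python sketch below (ours, not the authors' code) computes it for a symmetric tensor stored as an (n, n, n) array and compares it against brute-force sampling of the unit sphere on a small random instance; the tensor construction, sample counts and names are illustrative.

```python
# A direct sketch (ours) of the Approach 1 lower bound.
import numpy as np

def approach1_lower_bound(A):
    n = A.shape[0]
    bound = 0.0
    for i in range(n):
        Ai_tilde = np.delete(np.delete(A[i], i, axis=0), i, axis=1)   # \tilde{A}_i
        lam_min = np.linalg.eigvalsh(Ai_tilde)[0]
        off_diag = np.delete(A[i, i, :], i)                           # a_{iij}, j != i
        bound -= (2.0 * np.sqrt(3.0) / 9.0 * abs(lam_min)
                  + 4.0 * np.sqrt(3.0) / 9.0 * np.linalg.norm(off_diag)
                  + abs(A[i, i, i]))
    return bound

# tiny check on a random symmetric tensor against brute-force sphere sampling
rng = np.random.default_rng(1)
T = rng.standard_normal((4, 4, 4))
A = (T + T.transpose(0, 2, 1) + T.transpose(1, 0, 2) + T.transpose(1, 2, 0)
     + T.transpose(2, 0, 1) + T.transpose(2, 1, 0)) / 6.0              # symmetrize
xs = rng.standard_normal((20000, 4))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
f_vals = np.einsum('ijk,pi,pj,pk->p', A, xs, xs, xs)
print(approach1_lower_bound(A), f_vals.min())                          # bound <= sampled min
```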

3.2 Lower Bound by Duality – Approach 2


In the second approach, our aim is to decompose the objective function of CSP
into fewer terms, thus hoping to obtain a stronger lower bound. For this, we use

$$\min_{\|x\|=1} \sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k \;\ge\; \sum_{i=1}^{n}\, \min_{\|x\|=1} \sum_{j,k=1}^{n} a_{ijk}x_ix_jx_k.$$

Note that, for each fixed i = 1, 2, . . . , n, we obtain

$$\begin{aligned}
\min_{\|x\|=1} \sum_{j,k=1}^{n} a_{ijk}x_ix_jx_k
&= \min_{x_i\in[-1,1]}\; \min_{\|x_{\hat{i}}\|=\sqrt{1-x_i^2}} \sum_{j,k=1}^{n} a_{ijk}x_ix_jx_k\\
&= \min_{x_i\in[-1,1]}\; \min_{\|y\|=1} \Bigg( \sum_{\substack{j,k=1\\ j,k\ne i}}^{n} a_{ijk}\,x_i(1-x_i^2)\,y_jy_k + 2\sum_{\substack{j=1\\ j\ne i}}^{n} a_{iij}\,x_i^2\sqrt{1-x_i^2}\,y_j + a_{iii}x_i^3 \Bigg)\\
&= \min_{x_i\in[-1,1]}\; \min_{\|y\|=1} \Bigl( x_i(1-x_i^2)\,y^\top \tilde{A}_i y + x_i^2\sqrt{1-x_i^2}\,a_i^\top y + a_{iii}x_i^3 \Bigr),
\end{aligned} \qquad (9)$$

where we set

$$y := \tfrac{1}{\sqrt{1-x_i^2}}\,(x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = \tfrac{1}{\sqrt{1-x_i^2}}\, x_{\hat{i}} \in \mathbb{R}^{n-1},$$
$$a_i := 2\,(a_{ii1}, a_{ii2}, \ldots, a_{ii(i-1)}, a_{ii(i+1)}, \ldots, a_{ii(n-1)}, a_{iin}) \in \mathbb{R}^{n-1}.$$



In the following, we take advantage of the spectral decomposition of Ã_i. For
this, let λ_min(Ã_i) = λ_{i1} ≤ . . . ≤ λ_{in} be the eigenvalues of Ã_i, and v_{i1}, . . . , v_{in}
be a corresponding orthonormal basis of eigenvectors. We have Ã_i = V_iΛ_iV_iᵀ,
where V_i := (v_{i1}, . . . , v_{in}) ∈ R^{(n−1)×(n−1)}, with V_iᵀV_i = V_iV_iᵀ = I_{n−1}, and
Λ_i := Diag(λ_{i1}, . . . , λ_{in}). For each i = 1, . . . , n, we then have

$$\min_{\|y\|=1} \Bigl( x_i(1-x_i^2)\,y^\top \tilde{A}_i y + x_i^2\sqrt{1-x_i^2}\,a_i^\top y + a_{iii}x_i^3 \Bigr) = \min_{\|z\|=1} \Bigl( x_i(1-x_i^2)\,z^\top \Lambda_i z + x_i^2\sqrt{1-x_i^2}\,b_i^\top z + a_{iii}x_i^3 \Bigr), \qquad (10)$$

where we substitute z := V_iᵀy and b_i := V_iᵀa_i. Recall that x_i is a constant in
this context.
this context.
Note that Problem (10) aims at minimizing a quadratic function over the unit sphere, i.e., it is an instance of the so-called trust region subproblem and can thus be solved efficiently. A dual problem of (10), with no duality gap, is presented in [8]. For its solution, three cases have to be distinguished. In all cases, a lower bound on (10) is given by
$$\max_{x_i(1-x_i^2)\Lambda_i-\mu I\,\succeq\,0}\ \Bigl( \mu - \tfrac{1}{4}\,x_i^4(1-x_i^2)\,b_i^\top\bigl(x_i(1-x_i^2)\Lambda_i-\mu I\bigr)^{-1}b_i + a_{iii}x_i^3 \Bigr). \tag{11}$$
Assuming for simplicity that $b_i$ has no zero entries, we have to distinguish between the cases $x_i\in\{-1,0,1\}$ and $x_i\in(-1,0)\cup(0,1)$. In the latter case, we have $x_i^4(1-x_i^2)\neq 0$ and hence the maximizer of (11) will lie in the interior of the feasible set (the so-called "easy case"). In the former case, (10) turns out to be trivial, with optimal value $a_{iii}x_i^3$. However, we aim at deriving a bound without knowing $x_i$ in advance, so we cannot apply this case distinction a priori, which means we have to use the lower bound given by (11) in all cases.
Let us now further divide the case $x_i\in(-1,0)\cup(0,1)$ into two subproblems, namely, $x_i$ being negative or positive. Assuming $x_i\in(-1,0)$, we may replace $\mu$ by $-x_i(1-x_i^2)\alpha$ in (11), obtaining
$$\max_{-\Lambda_i-\alpha I\,\succeq\,0}\ \Bigl( -x_i(1-x_i^2)\alpha - \tfrac{1}{4}\,x_i^4(1-x_i^2)\,b_i^\top\bigl(x_i(1-x_i^2)(\Lambda_i+\alpha I)\bigr)^{-1}b_i + a_{iii}x_i^3 \Bigr).$$
Therefore, for $x_i\in(-1,0)$, we obtain a lower bound
$$\min_{\|y\|=1}\Bigl( x_i(1-x_i^2)\,y^\top\tilde A_i y + x_i^2\sqrt{1-x_i^2}\,a_i^\top y + a_{iii}x_i^3 \Bigr) \;\ge\; \max_{\alpha<-\lambda_{in}}\ \Bigl( -x_i(1-x_i^2)\alpha - \tfrac{1}{4}\,x_i^3\,b_i^\top(\Lambda_i+\alpha I)^{-1}b_i + a_{iii}x_i^3 \Bigr). \tag{12}$$
Analogously, when $x_i\in(0,1)$, we replace $\mu$ by $x_i(1-x_i^2)\alpha$ and obtain
$$\min_{\|y\|=1}\Bigl( x_i(1-x_i^2)\,y^\top\tilde A_i y + x_i^2\sqrt{1-x_i^2}\,a_i^\top y + a_{iii}x_i^3 \Bigr) \;\ge\; \max_{\alpha<\lambda_{i1}}\ \Bigl( x_i(1-x_i^2)\alpha - \tfrac{1}{4}\,x_i^3\,b_i^\top(\Lambda_i-\alpha I)^{-1}b_i + a_{iii}x_i^3 \Bigr). \tag{13}$$

Taking into account that we only aim at finding lower bounds for CSP, our strategy is to fix $\alpha = -\lambda_{in}-\epsilon$ in (12) and $\alpha = \lambda_{i1}-\epsilon$ in (13), with $\epsilon>0$, for each $i = 1,\dots,n$. We thus obtain a lower bound as
$$\min_{\|x\|=1}\sum_{i,j,k=1}^{n} a_{ijk}x_ix_jx_k \;\ge\; \sum_{i=1}^{n}\min\{\omega_i,\ \nu_i,\ -|a_{iii}|\}, \tag{14}$$

where
$$\omega_i := \min_{x_i\in(-1,0)}\ \Bigl( x_i(1-x_i^2)(\lambda_{in}+\epsilon) - \sum_{j=1}^{n}\frac{x_i^3\,(v_{ij}^\top a_i)^2}{4(\lambda_{ij}-\lambda_{in}-\epsilon)} + a_{iii}x_i^3 \Bigr),$$
$$\nu_i := \min_{x_i\in(0,1)}\ \Bigl( x_i(1-x_i^2)(\lambda_{i1}-\epsilon) - \sum_{j=1}^{n}\frac{x_i^3\,(v_{ij}^\top a_i)^2}{4(\lambda_{ij}-\lambda_{i1}+\epsilon)} + a_{iii}x_i^3 \Bigr). \tag{15}$$

The value −|aiii | in (14) covers the case xi ∈ {−1, 0, 1}. Note that the minimiza-
tion problems (15) are univariate polynomial optimization problems of degree 3
and hence easily solved by a closed formula.
The computational effort for computing this bound is dominated by the
diagonalization of the n symmetric (n − 1) × (n − 1)-matrices Ã1 , . . . , Ãn .
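For completeness, a hedged sketch of such a closed-formula minimization: a univariate cubic can be minimized over an interval by comparing its values at the interval endpoints and at the real roots of its derivative. The MATLAB helper below (its name and signature are ours) illustrates this; since the intervals in (15) are open, the endpoints only act as limiting candidates, and the endpoint cases are in any event covered by the term $-|a_{iii}|$ in (14).

```matlab
% Hedged sketch: minimize p(x) = c3*x^3 + c2*x^2 + c1*x + c0 over [lo, hi].
function [pmin, xmin] = min_cubic_on_interval(c3, c2, c1, c0, lo, hi)
p    = @(x) ((c3*x + c2).*x + c1).*x + c0;  % Horner evaluation of the cubic
cand = [lo, hi];                            % interval endpoints as candidates
r    = roots([3*c3, 2*c2, c1]);             % stationary points: roots of p'
r    = r(imag(r) == 0 & r > lo & r < hi);   % keep real roots inside the interval
cand = [cand, r.'];
[pmin, idx] = min(p(cand));
xmin = cand(idx);
end
```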

4 Numerical Results
We implemented the routines to compute lower bounds to the cubic one-spherical
optimization problem CSP, based on the approaches described in the previous
section, in MATLAB R2017b. Our experiments were run on a cluster of 64-bit
Intel(R) Xeon(R) E5-4620 processors running at 2.20 GHz with 252.4 GB of
memory.
We solve both problems in (15) by a closed formula, for 100 different values of $\epsilon$ equally distributed in the interval $[10^{-5}, 5]$, and report the best (i.e., largest) bound obtained.

4.1 Discretization – Approach 3


In order to evaluate the potential of our decomposition approach, we additionally
consider inequality (9) for each i = 1, 2, . . . , n and solve each quadratic problem
  
2  
min xi (1 − xi )y Ãi y + xi 1 − xi ai y + aiii xi
2 2 3
(16)
y=1

using the algorithm introduced by Lucidi and Palagi [6]. This, however, does not yield a closed-form solution, so that, unlike in the previous sections, we cannot minimize the resulting expression exactly over $x_i\in[-1,1]$. Instead, we discretize the interval $[-1,1]$ and take the smallest value obtained over all grid points. Note, however, that this approach does not yield a safe lower bound in general, as we cannot estimate the error incurred by the discretization.

4.2 Small Instances from the Literature


We first consider three small examples taken from [7].
Example 1 (Example 3.2 in [7]). Here A ∈ S 3 (R3 ) is defined by
A111 = −0.1281, A112 = 0.0516, A113 = −0.0954,
A122 = −0.1958, A123 = −0.1790, A133 = −0.2679,
A222 = 0.3251, A223 = 0.2513, A233 = 0.1773, A333 = 0.0338.
Example 2 (Example 3.3 in [7]). Here we have A ∈ S 3 (R3 ) with entries
A111 = 0.0517, A112 = 0.3579, A113 = 0.5298, A122 = 0.7544, A123 = 0.2156,
A133 = 0.3612, A222 = 0.3943, A223 = 0.0146, A233 = 0.6718, A333 = 0.9723.
Example 3 (Example 3.5 in [7]). Here we have $A\in S^3(\mathbb{R}^5)$ with entries
$$A_{i_1,i_2,i_3} = \frac{(-1)^{i_1}}{i_1} + \frac{(-1)^{i_2}}{i_2} + \frac{(-1)^{i_3}}{i_3}.$$
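For reference, the tensor of Example 3 can be generated in a few lines of MATLAB (an illustrative snippet, not taken from [7]); the resulting array can then be fed to a bound routine such as the sketch at the end of Sect. 3.1.

```matlab
% Example 3: A_{i1,i2,i3} = (-1)^{i1}/i1 + (-1)^{i2}/i2 + (-1)^{i3}/i3, n = 5.
n = 5;
A = zeros(n,n,n);
for i1 = 1:n
    for i2 = 1:n
        for i3 = 1:n
            A(i1,i2,i3) = (-1)^i1/i1 + (-1)^i2/i2 + (-1)^i3/i3;
        end
    end
end
```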
In Table 1, we show the lower bounds obtained by the three proposed
approaches for these instances. The best known solutions depicted in the table
were given in [7] and computed by the heuristic approach presented in the paper.

Table 1. Results for instances from the literature

Problem Approach 1 Approach 2 Approach 3 Best solution


Ex 1 −1.3172 −1.2683 −1.0849 −0.8730
Ex 2 −3.3009 −3.1877 −2.9730 −2.1110
Ex 3 −20.9114 −18.5364 −15.6254 −9.9779

As mentioned above, Approach 3 does not give a rigorous lower bound, since instead of globally solving problem (16) for each $i$, we take the best optimal solution among the problems in which $x_i$ is fixed at a point of a discretization of the interval $[-1,1]$. To better assess the quality of these solutions, we ran an experiment in which a solution of (16) is obtained, for each $i$, 10 times. At the first iteration the discretization contains only 5 points; then, at each iteration $k = 2,\dots,10$, we add $50k$ more points to the discretization set. The points added are always equidistant.
At the last iteration, we consider 2255 points. The lower bounds obtained for
different numbers of discretization points (npoints) are depicted in Table 2. We
observe that when increasing the number of points in the discretization set from
155, the solutions obtained are very similar to each other. The percentage relative
difference shown in the last column, given by
rel.dif := (lower.bound(k − 1) − lower.bound(k))/|lower.bound(k − 1)| ∗ 100,
is always smaller than 0.04% when k > 2, for the two first examples, and smaller
than 0.16% for Example 3. These results suggest that the lower bounds obtained
by Approach 3 quickly converge to valid bounds when the number of discretiza-
tion points increases.

4.3 Random Instances

Finally, we generated random third-order (n×n×n)-dimensional real symmetric


tensors A, with entries uniformly distributed in (0, 1). The tensors were gener-
ated using the open source software Tensor Toolbox for MATLAB [1]. Tables 3
and 4 report average bounds for 20 instances for each n = 3, 5, 10, 30, 50, 100, 200.

Table 2. Approach 3 – Results for different numbers of discretization points

Example 1
it  npoints  lower bound      rel dif
1   5        -7.661209e-001   -
2   55       -1.082002e+000   41.2312600
3   155      -1.083636e+000   0.1510226
4   305      -1.083790e+000   0.0142051
5   505      -1.084534e+000   0.0686056
6   755      -1.084610e+000   0.0070751
7   1055     -1.084610e+000   0.0000000
8   1405     -1.084610e+000   0.0000000
9   1805     -1.084610e+000   0.0000000
10  2255     -1.084688e+000   0.0071920

Example 2
it  npoints  lower bound      rel dif
1   5        -2.117380e+000   -
2   55       -2.972419e+000   40.3819833
3   155      -2.973344e+000   0.0311089
4   305      -2.973650e+000   0.0102988
5   505      -2.973650e+000   0.0000000
6   755      -2.973650e+000   0.0000000
7   1055     -2.973688e+000   0.0012582
8   1405     -2.973770e+000   0.0027572
9   1805     -2.973770e+000   0.0000000
10  2255     -2.973770e+000   0.0000000

Example 3
it  npoints  lower bound      rel dif
1   5        -7.405235e+000   -
2   55       -1.051328e+001   41.9709771
3   155      -1.131087e+001   7.5864276
4   305      -1.131287e+001   0.0176827
5   505      -1.131301e+001   0.0012360
6   755      -1.131359e+001   0.0051961
7   1055     -1.131359e+001   0.0000000
8   1405     -1.131405e+001   0.0039842
9   1805     -1.133194e+001   0.1581602
10  2255     -1.133194e+001   0.0000000

For Approach 3, for each i = 1, . . . , n, we solve the quadratic problem (16), for
200 equally spaced points xi in the interval [−1, 1], using the algorithm described
in [6]. The computational time needed for this approach is large and significantly
increases with n. Therefore, we only apply it for the smallest instance in Table 3.
We emphasize once more that the main objective of applying Approach 3 is to
have an evaluation of the quality of the lower bounds computed by the other
approaches. Note that, for the instances with $n = 3$, if the number of discretization points in Approach 3 approaches infinity, its solution should converge to the best possible bound given by Approach 2. For the larger instances in Table 4, we apply our two approaches actually intended to generate lower bounds for the CSP.

Table 3. Results for random instances, n = 3.

Approach Lower bound Time


1 −3.4802001 0.002
2 −3.4515856 0.043
3 −3.4480189 113.790

Table 4. Results for random instances, n = 5, 10, 30, 50, 100, 200.

n    Approach  Lower bound     Time
5    1         -7.1952054      0.0014
5    2         -8.2862354      0.0807
10   1         -19.1051160     0.0016
10   2         -28.4392343     0.1656
30   1         -93.9774938     0.0097
30   2         -220.1731727    0.5579
50   1         -197.1711761    0.0587
50   2         -581.0479091    1.1046
100  1         -540.4581602    0.6257
100  2         -2203.4444732   5.8779
200  1         -1495.4581060   9.9779
200  2         -8483.9467825   36.8817

We note that the computational time of Approach 2 can still be reduced, because the times reported for Approach 2 include the solution of the minimization problems in (15) for 100 different values of $\epsilon$. This strategy can be made more practical and more efficient by a better analysis of the problem, with the aim of reducing the number of dual solutions considered and improving the quality of the bounds. The improvement of these computations is part of our future research.

Acknowledgments. C. Buchheim has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 764759. M. Fampa was supported in part by CNPq-Brazil grants 303898/2016-0 and 434683/2018-3. O. Sarmiento contributed much of his work while visiting the Technische Universität Dortmund, Dortmund, Germany, supported by a Research Fellowship from CAPES-Brazil - Finance Code 001.

References
1. Bader, B.W., Kolda, T.G., et al.: MATLAB Tensor Toolbox Version 3.0-dev, Oct
2017. https://www.tensortoolbox.org
2. Basser, P.J., Mattiello, J., LeBihan, D.: MR diffusion tensor spectroscopy and imag-
ing. Biophys. J. 66, 259–267 (1994)
3. Basser, P.J., Mattiello, J., LeBihan, D.: Estimation of the effective self-diffusion tensor
from the NMR spin echo. J. Magn. Reson. B 103, 247–254 (1994)
4. Basser, P.J., Jones, D.K.: Diffusion-tensor MRI: theory, experimental design and
data analysis-a technical review. NMR Biomed. 15, 456–467 (2002)
5. Liu, C.L., Bammer, R., Acar, B., Moseley, M.E.: Characterizing non-Gaussian diffu-
sion by using generalized diffusion tensors. Magn. Reson. Med. 51, 924–937 (2004)
6. Nesterov, Y.E.: Random walk in a simplex and quadratic optimization over convex
polytopes. CORE Discussion Paper 2003/71 CORE-UCL (2003)
7. Nie, J., Wang, L.: Semidefinite relaxations for best rank-1 tensor approximations.
SIAM J. Matrix Anal. Appl. 35, 1155–1179 (2014)
8. Stern, R.J., Wolkowicz, H.: Indefinite trust region subproblems and nonsymmetric
eigenvalue perturbations. SIAM J. Optim. 5, 286–313 (1995)
9. Zhang, X., Qi, L., Ye, Y.: The cubic spherical optimization problems. Math. Com-
put. 81(279), 1513–1525 (2012)
DC Programming and DCA
A DC Algorithm for Solving
Multiobjective Stochastic Problem via
Exponential Utility Functions

Ramzi Kasri(B) and Fatima Bellahcene

Faculty of Sciences, LAROMAD, Mouloud Mammeri University, BP 17 RP, 15000


Tizi-Ouzou, Algeria
ramzi.kasri@ummto.dz, bellahcene.fat@gmail.com

Abstract. In this paper we suggest an algorithm for solving a multiobjective stochastic linear programming problem with normal multivariate distributions. The problem is first transformed into a deterministic multiobjective problem by introducing the expected value criterion and a utility function. The obtained problem is reduced to a monobjective quadratic problem using a weighting method. This last problem is solved by a DC algorithm.

Keywords: Multiobjective programming · Stochastic programming ·


DCA · DC programming · Utility function · Expected value criterion

1 Introduction
Multiobjective stochastic linear programming (MOSLP) is a tool for modeling many concrete real-life problems, because complete data about the problem parameters are rarely available. Such a class of problems includes investment
and energy resources planning [1,20], manufacturing systems in production plan-
ning [7,8], mineral blending [12], water use planning [2,5] and multi-product
batch plant design [23]. So, to deal with this type of problems it is required to
introduce a randomness framework.
In order to obtain the solutions for these multiobjective stochastic problems,
it is necessary to combine techniques used in stochastic programming and multi-
objective programming. From this, two approaches are considered, both of them
involve a double transformation. The difference between the two approaches is
the order in which the transformations are carried out. Ben Abdelaziz qualified
as multiobjective approach the perspective which transform first, the stochastic
multiobjective problem into its equivalent multiobjective deterministic problem,
and stochastic approach the techniques that transform in first the stochastic
multiobjective problem into a monobjective stochastic problem [4].
Several interactive methods for solving (MOSLP) problems have been devel-
oped. We can mention the Probabilistic Trade-off Development Method or
PROTRADE by Goicoechea et al. [10]. The Strange method proposed by
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 279–288, 2020.
https://doi.org/10.1007/978-3-030-21803-4_29

Teghem et al. [21] and the interactive method with recourse which uses a two
stage mathematical programming model by Klein et al. [11].
In this paper, we propose another approach which is a combination between
the multiobjective approach and a nonconvex technique (Difference of Convex
functions), to solve the multiobjective stochastic linear problem with normal
multivariate distributions. The DC programming and DC Algorithm have been
introduced by Pham Dinh Tao in 1985 and developed by Le Thi and Pham Dinh
since 1994 [13–16]. This method has proved its efficiency in a large number of nonconvex problems [17–19].
The paper is structured as follows: In Sect. 2, the problem formulation is
given. Section 3, shows how to reformulate the problem by introducing utility
functions and applying the weighting method. Section 4 presents a review of DC
programming and DCA. Section 5 illustrates the application of DC programming
and DCA to the resulting quadratic problem. Our experimental results are presented in the last section.

2 Problem Statement
Let us consider the multiobjective stochastic linear programming problem for-
mulated as follows:
$$\min_x\ (\tilde c_1 x, \tilde c_2 x, \dots, \tilde c_q x), \quad \text{s.t. } x\in S, \tag{1}$$
where $x = (x_1, x_2, \dots, x_n)$ denotes the $n$-dimensional vector of decision variables. The feasible set $S$ is a subset of the $n$-dimensional real vector space $\mathbb{R}^n$ characterized by a set of constraint inequalities of the form $Ax\le b$, where $A$ is an $m\times n$ coefficient matrix and $b$ an $m$-dimensional column vector. We assume that $S$ is nonempty and compact in $\mathbb{R}^n$. Each vector $\tilde c_k$ follows a normal distribution with mean $\bar c_k$ and covariance matrix $V_k$. Therefore, every objective $\tilde c_k x$ follows a normal distribution with mean $\mu_k = \bar c_k x$ and variance $\sigma_k^2 = x^t V_k x$.
In the following section, we focus on the main way to transform problem (1) into an equivalent multiobjective deterministic problem, which in turn will be reformulated as a DC programming problem.

3 Transformations and Reformulation

First, we will take into consideration the notion of risk. Assuming that decision makers' preferences can be represented by utility functions, under plausible assumptions about decision makers' risk attitudes, problem (1) is interpreted as:
$$\min_x\ \bigl(E[U(\tilde c_1 x)], E[U(\tilde c_2 x)], \dots, E[U(\tilde c_q x)]\bigr), \quad \text{s.t. } x\in S. \tag{2}$$
The utility function $U$ is generally assumed to be continuous and convex. In this paper, we consider an exponential utility function of the form $U(r) = 1 - e^{-ar}$,

where $r$ is the value of the objective and $a$ the coefficient of incurred risk (a large $a$ corresponds to a conservative attitude). Our choice is motivated by the fact that exponential utility functions lead to an equivalent quadratic problem, which encouraged us to design a DC method to solve it simply and accurately. Therefore, if $r\sim N(\mu,\sigma^2)$, we have:
$$E(U(r)) = \int_{-\infty}^{+\infty} (1-e^{-ar})\,\frac{e^{-(r-\mu)^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}\,dr = 1 - e^{\frac{\sigma^2a^2}{2}-\mu a}.$$
Minimizing $E(U(r))$ means maximizing $\frac{\sigma^2a^2}{2}-\mu a$, or minimizing $\mu - \frac{\sigma^2 a}{2}$.
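The closed form above is easy to check numerically; the following MATLAB snippet (with illustrative values of our own choosing) compares a Monte Carlo estimate of $E[U(r)]$ with the formula.

```matlab
% Hedged numerical check of E[U(r)] = 1 - exp(sigma^2*a^2/2 - mu*a) for r ~ N(mu, sigma^2).
mu = 1.5;  sigma = 0.4;  a = 0.8;            % illustrative values (assumptions)
r  = mu + sigma*randn(1e6,1);                % samples of r
mc = mean(1 - exp(-a*r));                    % Monte Carlo estimate of E[U(r)]
cf = 1 - exp(sigma^2*a^2/2 - mu*a);          % closed form used in the text
fprintf('Monte Carlo: %.4f, closed form: %.4f\n', mc, cf);
```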
Our aim is to search for efficient solutions of the multiobjective deterministic
problem (2) according to the following definition:

Definition 1. [3] A feasible solution $x^*$ to problem (1) is an efficient solution if there is no other feasible $x$ such that $E[U(\tilde c_k x)]\ge E[U(\tilde c_k x^*)]$ for all $k$, with at least one strict inequality. The resulting criterion vector $(E[U(\tilde c_k x^*)])_k$ is said to be non-dominated.

Applying the widely used method for finding efficient solutions in multiobjective programming problems, namely the weighting sum method [3,6], we assign to each objective function in (2) a non-negative weight $w_k$ and aggregate the objective functions in order to obtain a single function. Thus, problem (2) is reduced to:
$$\min_x\ \sum_{k=1}^{q} w_k\,E[U(\tilde c_k x)], \quad \text{s.t. } x\in S,\ w_k\in\Lambda\ \forall k\in\{1,\dots,q\}, \tag{3}$$
or equivalently
$$\min_x\ E\Bigl[U\Bigl(\sum_{k=1}^{q} w_k\tilde c_k x\Bigr)\Bigr], \quad \text{s.t. } x\in S,\ w_k\in\Lambda\ \forall k\in\{1,\dots,q\}, \tag{4}$$
where $\Lambda = \{w_k : \sum_{k=1}^{q} w_k = 1,\ w_k\ge 0\ \forall k\in\{1,\dots,q\}\}$.

Theorem 1. [9] A point x∗ ∈ S is an efficient solution to problem (2) if and


only if x∗ ∈ S is optimal for problem (4).

Given that the random variable $F(x,\tilde c) = \sum_{k=1}^{q} w_k\tilde c_k x$ in (4) is a linear function of the random objectives $\tilde c_k x$, its variance depends on the variances of the $\tilde c_k x$ and on their covariances. Since each $\tilde c_k x$ follows a normal distribution with mean $\mu_k$ and variance $\sigma_k^2$, the function $F(x,\tilde c)$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$, where
$$\mu = \sum_{k=1}^{q} w_k\mu_k = \sum_{k=1}^{q} w_k\bar c_k x, \tag{5}$$


$$\sigma^2 = \sum_{k=1}^{q} w_k^2\sigma_k^2 + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s\sigma_{ks}, \tag{6}$$
where $\sigma_{ks}$ denotes the covariance of the random objectives $\tilde c_k x$ and $\tilde c_s x$. Finally, we obtain the following quadratic problem:
$$\min_x\ \sum_{k=1}^{q} w_k\bar c_k^t x - \frac{a}{2}\Bigl(\sum_{k=1}^{q} w_k^2\sigma_k^2 + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s\sigma_{ks}\Bigr), \quad \text{s.t. } x\in S, \tag{7}$$
or
$$\min_x\ \sum_{k=1}^{q} w_k\bar c_k^t x - \frac{a}{2}\Bigl(\sum_{k=1}^{q} w_k^2\,x^tV_kx + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s\,x^tV_{ks}x\Bigr), \quad \text{s.t. } x\in S, \tag{8}$$
where $\bar c_k = (\bar c_{k1}, \bar c_{k2}, \dots, \bar c_{kn})$ is the $k$-th component of the expected value of the random multinormal vector $\tilde c$, and $V_{ks}$ and $V_k$ are blocks of the positive definite covariance matrix $V$ of $\tilde c$:
$$V = \begin{pmatrix} V_1 & V_{12} & \dots & V_{1s} & \dots & V_{1q}\\ V_{21} & V_2 & \dots & V_{2s} & \dots & V_{2q}\\ \dots & \dots & \dots & \dots & \dots & \dots\\ V_{k1} & V_{k2} & \dots & V_{ks} & \dots & V_{kq}\\ \dots & \dots & \dots & \dots & \dots & \dots\\ V_{q1} & V_{q2} & \dots & V_{qs} & \dots & V_q \end{pmatrix}.$$

4 Review of DC Programming and DCA

A general DC program has the form:

α = inf {f (x) = g(x) − h(x) : x ∈ IRn }, (9)

where g, h are lower semicontinuous proper convex functions on IRn called DC


components of the DC function f while g − h is a DC decomposition of f .
The duality in DC associates to problem (9) the following dual program:

α = inf {h∗ (y) − g ∗ (y) : y ∈ IRn }, (10)

where g ∗ and h∗ are respectively the conjugate functions of g and h.


The conjugate function of g is defined by:

$$g^*(y) = \sup\{\langle x, y\rangle - g(x) : x\in\mathbb{R}^n\}. \tag{11}$$

From [15], the most used necessary local optimality condition for problem (9) is:
$$\emptyset \neq \partial h(x^*) \subset \partial g(x^*), \tag{12}$$



where $\partial h(x^*) = \{y^*\in\mathbb{R}^n : h(x)\ge h(x^*) + \langle x-x^*, y^*\rangle,\ \forall x\in\mathbb{R}^n\}$ is the subdifferential of $h$ at $x^*$. A point $x^*$ is called a critical point of $g-h$ if
$$\emptyset \neq \partial g(x^*)\cap\partial h(x^*). \tag{13}$$

DCA constructs two sequences {xi } and {y i } (candidates for being primal and
dual solutions, respectively), such that their corresponding limit points satisfy
the local optimality conditions (12) and (13). There are two forms of DCA: the
simplified DCA and the complete DCA. In practice, the simplified DCA is more widely used than the complete DCA because it is less expensive [13]. The simplified
DCA has the following scheme [13,18]:
Simplified DCA Algorithm
Step 1: Let x0 ∈ IRn given. Set i = 0.
Step 2: Calculate y i ∈ ∂h(xi ).
Step 3: Calculate xi+1 ∈ ∂g ∗ (y i ).
Step 4: If a convergence criterion is satisfied, then stop, else set i = i + 1 and
goto step 2.
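For illustration, the scheme above can be written as a short generic MATLAB skeleton, assuming that oracles for a subgradient of $h$ and for the minimization defining $\partial g^*$ are available (all names below are ours, not from the paper).

```matlab
% Hedged generic skeleton of the simplified DCA (oracle handles are assumptions).
function x = simplified_dca(x, grad_h, argmin_g, tol, maxit)
for i = 1:maxit
    y     = grad_h(x);              % Step 2: y^i in dh(x^i)
    x_new = argmin_g(y);            % Step 3: x^{i+1} in dg*(y^i), i.e. argmin g(x) - <x,y>
    if norm(x_new - x) <= tol       % Step 4: a simple convergence test
        x = x_new;  return;
    end
    x = x_new;
end
end
```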
We also can note that: [15,18]
– DCA is a descent method without linesearch.
– If g(xi+1 ) − h(xi+1 ) = g(xi ) − h(xi ), then xi is a critical point of f and y i is
a critical point of h∗ − g ∗ .
– DCA has a linear convergence for general DC programs, and has a finite
convergence for polyhedral programs.
– If the optimal value of problem (8) is finite and the sequences $\{x^i\}$ and $\{y^i\}$ are bounded, then every limit point $x$ (resp. $y$) of the sequence $\{x^i\}$ (resp. $\{y^i\}$) is a critical point of $g-h$ (resp. $h^*-g^*$).

5 DCA Applied to Problem (8)


The objective function of problem (8),
$$f(x) = \sum_{k=1}^{q} w_k\bar c_k^t x - \frac{a}{2}\Bigl(\sum_{k=1}^{q} w_k^2\sigma_k^2 + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s\sigma_{ks}\Bigr),$$
will be decomposed in order to obtain a DC program of the form:
$$\min\{f(x) = g(x) - h(x) : x\in S\}, \tag{14}$$
with
$$g(x) = \chi_S(x) + \sum_{k=1}^{q} w_k\bar c_k^t x,$$
where $\chi_S(\cdot)$ is the indicator function of the set $S$, and
$$h(x) = \frac{a}{2}\Bigl(\sum_{k=1}^{q} w_k^2\,x^tV_kx + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s\,x^tV_{ks}x\Bigr).$$

After that, we compute the two sequences $\{x^i\}$ and $\{y^i\}$ defined as follows: $y^i\in\partial h(x^i)$ and $x^{i+1}\in\partial g^*(y^i)$.
Computation of $y^i$: We choose $y^i\in\partial h(x^i) = \{\nabla h(x^i)\}$, which amounts to calculating
$$y^i = a\Bigl(\sum_{k=1}^{q} w_k^2 V_k + 2\sum_{\substack{k,s=1\\ k<s}}^{q} w_kw_s V_{ks}\Bigr)x^i. \tag{15}$$
Computation of $x^{i+1}$: We can choose $x^{i+1}\in\partial g^*(y^i)$ as a solution of the following convex problem
$$\min\Bigl\{\sum_{k=1}^{q} w_k\bar c_k^t x - x^t y^i : x\in S\Bigr\}. \tag{16}$$
The solution $x^i$ is optimal for the problem (14) if one of the following conditions is verified:
$$|(g-h)(x^{i+1}) - (g-h)(x^i)| \le \epsilon, \tag{17}$$
$$\|x^{i+1} - x^i\| \le \epsilon. \tag{18}$$
Finally, the DC Algorithm that we can apply to problem (8) with the decom-
position (14) can be described as follows:

Algorithm DCAMOSLP

Step 1: Initialization: Let $x^0\in\mathbb{R}^n$, $\epsilon, k, w\in\mathbb{R}_+$, $a>0$, $V$, $A$, $b$, $\bar c$ be given. Set $i = 0$.
Step 2: Calculate y i ∈ ∂h(xi ) using (15).
Step 3: Calculate xi+1 ∈ ∂g ∗ (y i ), solution of the convex problem (16).
Step 4: If one of the conditions (17) or (18) is verified, then stop xi+1 is optimal
for (14), else set i = i + 1 and goto step 2.
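To make the scheme concrete, here is a hedged MATLAB sketch of DCAMOSLP for problem (8). It assumes that $S$ is given as $\{x : A_{\mathrm{ineq}}x\le b_{\mathrm{ineq}},\ lb\le x\le ub\}$, that the covariance blocks are stored in a cell array V with V{k,k} = $V_k$ and V{k,s} = $V_{ks}$, and that the columns of cbar are the vectors $\bar c_k$; the linear subproblem (16) is solved with linprog. All names are illustrative, not the authors' code.

```matlab
% Hedged sketch of DCAMOSLP for problem (8); parameter names are assumptions.
function x = dcamoslp(cbar, V, w, a, Aineq, bineq, lb, ub, x, eps_tol, maxit)
q = numel(w);  n = numel(x);
M = zeros(n);
for k = 1:q                                   % M = sum_{k,s} w_k w_s V_{ks}
    for s = 1:q
        M = M + w(k)*w(s)*V{k,s};
    end
end
f    = cbar*w;                                % linear part: sum_k w_k cbar_k
Fobj = @(x) f'*x - (a/2)*(x'*M*x);            % objective g(x) - h(x) of (14) on S
opts = optimoptions('linprog','Display','off');
for i = 1:maxit
    y  = a*M*x;                               % Step 2: y^i = grad h(x^i), cf. (15)
    xn = linprog(f - y, Aineq, bineq, [], [], lb, ub, opts);  % Step 3: LP (16)
    if abs(Fobj(xn) - Fobj(x)) <= eps_tol || norm(xn - x) <= eps_tol  % (17)-(18)
        x = xn;  return;
    end
    x = xn;
end
end
```

For instance, the first test problem (19) of Sect. 6 with $w = (0.8, 0.2)^t$ and $a = 10^{-2}$ could be called as follows (again purely illustrative):

```matlab
cbar = [0.5 1; 1 2.5];                               % columns: cbar_1, cbar_2
V    = {[25 0; 0 25], [0 3; 3 0]; [0 3; 3 0], [1 0; 0 9]};
x    = dcamoslp(cbar, V, [0.8; 0.2], 1e-2, [-1 -2], -4, [0;0], [3;3], [0;0], 1e-6, 100);
```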

6 Experimental Results
To demonstrate the performances of our algorithm, two numerical examples
will be given in this section. The first is taken from [6] to show the efficiency
of the algorithm. The second example is given to present the performances of
DCAMOSLP according to the variation of certain parameters.
Let us consider the following stochastic bi-objective programming problem:



$$\begin{cases} \min_x\ (\tilde c_{11}x_1 + \tilde c_{12}x_2,\ \tilde c_{21}x_1 + \tilde c_{22}x_2),\\ \text{s.t. } x_1 + 2x_2 \ge 4,\\ \phantom{\text{s.t. }} x_1, x_2 \le 3,\\ \phantom{\text{s.t. }} x_1, x_2 \ge 0, \end{cases} \tag{19}$$

with $\tilde c = (\tilde c_{11}, \tilde c_{12}, \tilde c_{21}, \tilde c_{22})^t$ being a multinormal random vector with expected value $c = (0.5, 1, 1, 2.5)^t$ and with positive definite covariance matrix:
$$V = \begin{pmatrix} 25 & 0 & 0 & 3\\ 0 & 25 & 3 & 0\\ 0 & 3 & 1 & 0\\ 3 & 0 & 0 & 9 \end{pmatrix}.$$

For this test, we take $\epsilon = 10^{-6}$ and $x^0 = (0,0)$ as initial point. The application of algorithm DCAMOSLP to this problem for different values of the coefficient of incurred risk $a$ and a fixed weight vector $w = (0.8, 0.2)^t$ gives the results in Table 1, where nbr it is the number of iterations.

Table 1. Results for different values of parameter a.

a        (x1*, x2*)   c1 x*   c2 x*   nbr it
10^-30   (3, 0.5)     2       4.25    2
10^-20   (3, 0.5)     2       4.25    2
10^-10   (3, 0.5)     2       4.25    3
10^-2    (3, 0.5)     2       4.25    3
1        (3, 3)       4.5     10.5    5
10       (3, 3)       4.5     10.5    5
10^2     (3, 3)       4.5     10.5    5

The non-dominated solution $(3, 0.5)$ is obtained for values of the parameter $a\le 10^{-2}$. The non-dominated solution for $w = (0.8, 0.2)^t$ reported in Ref. [6] is also $(3, 0.5)$. We also note that the number of iterations decreases with the decrease of the parameter $a$.
Now we will test the performance of the algorithm with a second problem
which has a larger set of feasible solutions.



$$\begin{cases} \min_x\ (\tilde c_{11}x_1 + \tilde c_{12}x_2,\ \tilde c_{21}x_1 + \tilde c_{22}x_2),\\ \text{s.t. } 2x_1 + 3x_2 \ge 10,\\ \phantom{\text{s.t. }} x_1, x_2 \le 5,\\ \phantom{\text{s.t. }} x_1, x_2 \ge 0, \end{cases} \tag{20}$$

with $c = (6, -5, 3, 8)^t$ and positive definite covariance matrix:
$$V = \begin{pmatrix} 14 & 0 & 0 & 3\\ 0 & 12 & 3 & 0\\ 0 & 3 & 2 & 0\\ 3 & 0 & 0 & 8 \end{pmatrix}.$$
The results of application of algorithm DCAMOSLP to this problem for different
values of parameter a and the weight vector w are given in Table 2.

Table 2. Results for different values of a and vector w.

a        w           (x1*, x2*)          nbr it
10^-20   (0.2, 0.8)  (0.5524, 2.9651)    2
         (0.8, 0.2)  (0, 5)              2
         (0.6, 0.4)  (0, 3.3333)         2
         (0.5, 0.5)  (0, 3.3333)         2
         (0.9, 0.1)  (0, 5)              2
10^-10   (0.2, 0.8)  (0.5506, 2.9663)    5
         (0.8, 0.2)  (0, 5)              2
         (0.6, 0.4)  (0, 3.3333)         3
         (0.5, 0.5)  (0, 3.3333)         3
         (0.9, 0.1)  (0, 5)              2
10^-2    (0.2, 0.8)  (0, 3.3333)         5
         (0.8, 0.2)  (0, 5)              3
         (0.6, 0.4)  (0, 3.3333)         4
         (0.5, 0.5)  (0, 3.3333)         3
         (0.9, 0.1)  (0, 5)              3
10       (0.2, 0.8)  (5, 5)              5
         (0.8, 0.2)  (5, 5)              4
         (0.6, 0.4)  (5, 5)              4
         (0.5, 0.5)  (5, 5)              4
         (0.9, 0.1)  (5, 5)              5
10^2     (0.2, 0.8)  (5, 5)              4
         (0.8, 0.2)  (5, 5)              5
         (0.6, 0.4)  (5, 5)              5
         (0.5, 0.5)  (5, 5)              5
         (0.9, 0.1)  (5, 5)              5

We observe from the results that the algorithm DCAMOSLP gives efficient solutions of the multiobjective stochastic problem for small values of the coefficient of incurred risk ($a\le 10^{-2}$). The number of iterations decreases with the decrease of the parameter $a$.

7 Conclusion
We have presented a DC optimization approach for solving a multiobjective
stochastic problem with multivariate normal distributions in which the objective
functions should be minimized. The experimental results show the efficiency of
the algorithm. However further experimental validation of this observation and
comparison with existing methods is needed. As future works, an algorithm for
a stochastic multiobjective maximization problem is planned.

References
1. Alarcon-Rodriguez, A., Ault, G., Galloway, S.: Multiobjective planning of dis-
tributed energy resources review of the state-of-the-art. Renew. Sustain. Energy
Rev. 14(5), 1353–1366 (2010)
2. Ben Abdelaziz, F., Mejri, S.: Application of goal programming in a multi-objective
reservoir operation model in Tunisia. Eur. J. Oper. Res. 133, 352–361 (2001)
3. Ben Abdelaziz, F., Lang, P., Nadeau, R.: Distributional unanimity in multiobjec-
tive stochastic linear programming. In: Clímaco, J. (ed.) Multicriteria Analysis.
Springer-Verlag, Heidelberg (1997)
4. Ben Abdelaziz, F.: L’efficacité en programmation multi-objectifs stochastique.
Ph.D. Thesis, Université de Laval, Québec (1992)
5. Bravo, M., Gonzalez, I.: Applying stochastic goal programming: a case study on
water use planning. Eur. J. Oper. Res. 2(196), 1123–1129 (2009)
6. Caballero, R., Cerdá, E., del Mar Muñoz, M., Rey, L.: Stochastic approach versus
multiobjective approach for obtaining efficient solutions in stochastic multiobjec-
tive programming problems. Eur. J. Oper. Res. 158(3), 633–648 (2004)
7. Caner, T.Z., Tamer, U.A.: Tactical level planning in float glass manufacturing
with co- production, random yields and substitutable products. Eur. J. Oper. Res.
199(1), 252–261 (2009)
8. Fazlollahtabar, H., Mahdavi, I.: Applying stochastic programming for optimizing
production time and cost in an automated manufacturing system. In: International
Conference on Computers & Industrial Engineering, Troyes, 6–9 July 2009, pp.
1226–1230 (2009)
9. Geoffrion, A.M.: Proper efficiency and the theory of vector maximization. J.
Math. Anal. Appl. 22(3), 618–630 (1968)
10. Goicoechea, A., Duckstein, L., Bulfin, R.T.: Multiobjective Stochastic Program-
ming. The PROTRADE-Method. Operation Research Society of America (1976)
11. Klein, G., Moskowitz, H., Ravindran, A.: Interactive multiobjective optimization
under uncertainty. Manag. Sci. 36(1), 58–75 (1990)
12. Kumral, M.: Application of chance-constrained programming based on multiobjec-
tive simulated annealing to solve mineral blending problem. Eng. Optim. 35(6),
661–673 (2003)
13. Le Thi, H.A., Pham Dinh, T.: Solving a class of linearly constrained indefinite-
quadratic problems by DC algorithms. J. Glob. Optim. 11(3), 253–285 (1997b)
14. Le Thi, H.A., Pham Dinh, T.: A continuous approach for globally solving linearly
constrained quadratic zero-one programming problems. Optimization 50, 93–120
(2001)
15. Le Thi, H.A., Pham Dinh, T.: The DC (difference of convex functions) program-
ming and DCA revisited with DC models of real world nonconvex optimization
problems. Ann. Oper. Res. 133, 23–46 (2005)
16. Le Thi, H.A., Pham Dinh, T., Huynh, V.N.: Exact penalty and error bounds in
DC programming. J. Glob. Opt. 52, 509–535 (2012)
17. Le Thi, H.A., Pham Dinh, T., Nguyen, C.N., Nguyen, V.T.: DC programming
techniques for solving a class of nonlinear bilevel programs. J. Glob. Opt. 44,
313–337 (2009)
18. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming:
theory, algorithms and applications (dedicated to Professor Hoang Tuy on the
occasion of his 70th birthday). Acta Math. Vietnam. 22, 289–355 (1997a)

19. Pham Dinh, T., Nguyen, C.N., Le Thi, H.A.: DC programming and DCA for
globally solving the value-at-risk. Comput. Manag. Sci. 6, 477–501 (2009)
20. Teghem, J., Kunsch, P.: Application of multiobjective stochastic linear program-
ming to power systems planning. Eng. Costs Prod. Econ. 9(13), 83–89 (1985)
21. Teghem, J., Dufrane, D., Thauvoye, M., Kunsch, P.L.: Strange, an interactive
method for multiobjective stochastic linear programming under uncertainty. Eur.
J. Oper. Res. 26(1), 65–82 (1986)
22. Vahidinasab, V., Jadid, S.: Stochastic multiobjective self-scheduling of a power
producer in joint energy & reserves markets. Electr. Power Syst. Res. 80(7), 760–
769 (2010)
23. Wang, Z., Jia, X.P., Shi, L.: Optimization of multi-product batch plant design
under uncertainty with environmental considerations. Clean Technol. Environ. Pol-
icy 12(3), 273–282 (2009)
A DCA-Based Approach for Outage
Constrained Robust Secure
Power-Splitting SWIPT MISO System

Phuong Anh Nguyen(B) and Hoai An Le Thi

LGIPM, University of Lorraine, 57073 Metz, France


{phuong-anh.nguyen,hoai-an.le-thi}@univ-lorraine.fr

Abstract. This paper studies the worst-case secrecy rate maximiza-


tion problem under the total transmit power, the energy harvesting and
the outage probability requirements. The problem is nonconvex, thus,
hard to solve. Exploiting the special structure of the problem, we first
reformulate as a DC (Difference of Convex functions) program. Then,
we develop an efficient approach based on DCA (DC Algorithm) and
alternating method for solving the problem. The computational results
confirm the efficiency of the proposed approach.

Keywords: DC programming · DCA · Physical layer security ·


Beamforming · SWIPT · Outage probability

1 Introduction
Simultaneous wireless information and power transfer (SWIPT) is efficient to mitigate energy scarcity [6]. However, the secrecy rate is degraded in SWIPT systems since Radio-Frequency signals carry not only information but also energy [12]. Thus, security in SWIPT systems is a critical issue. Fortunately, physical layer security (PLS) has proven able to secure communication. The challenge for PLS is that the transmitter needs to know the channel state information (CSI). In practice, it is hard to obtain perfect CSI. The works [2,3,19,21] have considered robust secure transmission based on a deterministic norm-bounded uncertainty model for SWIPT systems. These works correspond to the active eavesdropping scenario. In the case of passive eavesdropping, the channel error cannot be known exactly.
In such a case, the concept of outage probability is an effective approach for
secure SWIPT design. The problems in this field such as transmit power mini-
mization [5,7,20], secrecy rate maximization [1,13,14] are nonconvex. Thus, it
is hard to solve them. To overcome the nonconvexity, the semi-definite relax-
ation (SDR) technique was applied in [5,7,20]. However, the relaxed problems
only give an upper bound of the objective value and, hence, the performance is
decreased. Then, Chen et al. applied successive convex approximation (SCA) for
solving the outage constrained secrecy rate maximization problem in [1].
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 289–298, 2020.
https://doi.org/10.1007/978-3-030-21803-4_30

It is well known that SCA is a special version of DCA, a powerful approach


in nonconvex programming framework (see e.g. [11]). There are as many DCA
as there are DC decompositions. Finding a suitable DC decomposition is the key
issue which plays a crucial role in determining the efficiency of resulting DCA
(the speed of convergence, robustness, globality of optimal solutions). Motivated
by this flexibility property of DCA, we apply DC programming and DCA to
solve the outage constrained problem in [1]. We first reformulate the problem as
a DC program. Since optimizing all variables at the same time is complex, we
develop an efficient approach based on DCA and alternating method to solve
the problem. Numerical results validate the efficiency of our method.
The rest of this paper is organized as follows. The system model is described
in Sect. 2. Section 3 shows how to apply the proposed method to solve the out-
age constrained problem. Section 4 provides numerical experiments and finally,
Sect. 5 concludes the paper.
Let us introduce some notations. Boldface upper-case and lower-case letters
denote matrices and vectors, respectively. (.)H and Tr (.) stand for the Hermitian
and trace of a matrix, respectively; $\mathbf I$ denotes the identity matrix; $\|\cdot\|$ is the Frobenius norm; $\log(.)$ and $\ln(.)$ are taken to the base 2 and $e$, respectively; $\mathbf x\sim\mathcal{CN}(\mu,\Omega)$ stands for a complex Gaussian random vector with mean $\mu$ and covariance $\Omega$; $\Re(.)$ and $\Im(.)$ denote the real and imaginary parts of a complex number.

2 System Model
In this section, we briefly describe the optimization problem reformulated in [1].
Consider a MISO SWIPT network with K transmit-receive pairs in presence
of one eavesdropper on each link. The transmitter is equipped with N antennas.
The channel vectors from the $j$th transmitter to the $k$th legitimate user and eavesdropper are denoted by $\mathbf h_{jk}\in\mathbb{C}^N$ and $\mathbf g_{jk}\in\mathbb{C}^N$, where $\mathbf g_{jk}\sim\mathcal{CN}(\mathbf 0,\mathbf G_{jk})$ and $\mathbf h_{jk} = \bar{\mathbf h}_{jk} + \Delta\mathbf h_{jk}$, $\Delta\mathbf h_{jk}^H\mathbf Q_{jk}\Delta\mathbf h_{jk}\le 1$, $j,k\in\mathcal K\triangleq\{1,\dots,K\}$. Let $\mathbf w_k\in\mathbb{C}^N$ be the transmit beamforming vector and $\rho_k\in(0,1)$ be the power splitting factor for the $k$th link. The additive Gaussian noises of the signal for the $k$th information receiver (IR), the $k$th energy receiver (ER) and the $k$th eavesdropper are $n_{i,k}\sim\mathcal{CN}(0,\sigma_{i,k}^2)$, $i = 1,2,3$.
The considered problem is maximizing the minimal secrecy rate under the following constraints:
$$\text{The EH constraint:}\quad (1-\rho_k)\sum_{j=1}^{K}|\mathbf h_{jk}^H\mathbf w_j|^2\ge E_k,\quad 0<\rho_k<1,\ j,k\in\mathcal K. \tag{1}$$
$$\text{The power constraint:}\quad \sum_{j=1}^{K}\|\mathbf w_j\|^2\le P_{\max}. \tag{2}$$

$$\text{The outage constraint:}\quad \Pr\{R_{\mathrm{sec},k}\ge R_k\}\ge 1-p_k,$$
which is rewritten as [1]
$$-\ln p_k - \frac{(2^{R_{I,k}-R_k}-1)\,\sigma_{3k}^2}{\operatorname{Tr}(\mathbf G_{kk}\mathbf w_k\mathbf w_k^H)} - \sum_{j\neq k}\ln\Bigl(1+\frac{(2^{R_{I,k}-R_k}-1)\operatorname{Tr}(\mathbf G_{jk}\mathbf w_j\mathbf w_j^H)}{\operatorname{Tr}(\mathbf G_{kk}\mathbf w_k\mathbf w_k^H)}\Bigr)\le 0, \tag{3}$$
$$R_{I,k} = \log\Bigl(1+\frac{\rho_k|\mathbf h_{kk}^H\mathbf w_k|^2}{\rho_k\sum_{j\neq k}|\mathbf h_{jk}^H\mathbf w_j|^2+\rho_k\sigma_{1k}^2+\sigma_{2k}^2}\Bigr),\quad k\in\mathcal K. \tag{4}$$

The problem is formulated as [1]
$$\max_{\{\rho_k\},\{\mathbf w_k\}}\ \min_k\ R_k, \tag{5}$$
$$\text{s.t. } (1)\text{–}(4). \tag{6}$$

In the next section, we will investigate an approach based on DC programming


and DCA for solving the problem (5).

3 Solution Method Based on DC Programming and DCA


3.1 DC Programming and DCA
DC programming and DCA [10,11,15–17] constitute the backbone of
smooth/nonsmooth nonconvex programming and global optimization. They
address the problem of minimizing a function f which is a difference of lower
semicontinuous proper convex functions g, h on the whole space Rn , namely

α = inf{f (x) := g(x) − h(x) : x ∈ Rn }, (Pdc ).

Such a function f is called DC function, and g − h, DC decomposition of f


while g and h are DC components of f . The convex constraint x ∈ C can be
incorporated in the objective function of (Pdc ) by adding the indicator function
of C (χC (x) = 0 if x ∈ C, +∞ otherwise) to the first DC component g of the
DC objective function f .
The principle idea of DCA is quite simple: at iteration $k$, DCA replaces $h$ by its affine minorization (taking $y^k\in\partial h(x^k)$) and solves the resulting convex program
$$x^{k+1}\in\arg\min_{x\in\mathbb{R}^n}\{g(x) - h(x^k) - \langle x-x^k, y^k\rangle\}.$$

The extension of DC programming and DCA was investigated for solving


general DC programs with DC constraints [8] as follows:

$$\min_x\ f_0(x), \tag{7}$$
$$\text{s.t. } f_i(x)\le 0,\ \forall i = 1,\dots,m,\quad x\in C, \tag{8}$$

where $C\subseteq\mathbb{R}^n$ is a nonempty closed convex set; $f_i:\mathbb{R}^n\to\mathbb{R}$, $\forall i = 0,1,\dots,m$, are DC functions, i.e., $f_i(x) = g_i(x) - h_i(x)$ with $g_i$ and $h_i$ being convex functions. It is apparent that this class of nonconvex programs is the most general in DC programming and as a consequence it is more challenging to deal with than standard DC programs. Two approaches for general DC programs were proposed in [10,16] to overcome the difficulty caused by the nonconvexity of the constraints. Both approaches are built on the main idea of the philosophy of DC programming and DCA, that is, approximating (7) by a sequence of convex programs. The former was based on penalty techniques in DC programming while the latter relied on the convex inner approximation method. We use the idea of the second approach to solve the considered problem.
The general DCA scheme consists of linearizing the concave part of the DC decompositions of the DC objective function and all DC constraints by $y_i^k\in\partial h_i(x^k)$, $\forall i = 0,1,\dots,m$, and computing $x^{k+1}$ by solving the following convex problem
$$\min_x\ g_0(x) - \langle y_0^k, x\rangle,\quad \text{s.t. } g_i(x) - h_i(x^k) - \langle y_i^k, x-x^k\rangle\le 0,\ \forall i = 1,\dots,m,\ x\in C.$$

3.2 DC Programming and DCA for Solving the Problem (5)

Let us define $\mathbf W_j = \mathbf w_j\mathbf w_j^H$, where $\mathbf W_j\succeq 0$, $\mathbf W_j\in\mathbb{H}^{N\times N}$, $j\in\mathcal K$.
Proposition 1. The relaxation of the problem (5) is
$$\max_{\{\rho_k\},\{\mathbf W_k\},\{\beta_k\},\{\lambda_{jk}\},\{f_{jk}\},\{R_{I,k}\},\{s_k\},\{a_k\},\{\xi_k\},\{R_k\},R}\ R \tag{9}$$
$$\text{s.t.}\quad \sum_{j=1}^{K}\operatorname{Tr}(\bar{\mathbf H}_{jk}\mathbf W_j - \epsilon_{jk}\mathbf W_j)\ge\frac{E_k}{1-\rho_k},\quad \beta_k\ge\frac{1}{\rho_k},\quad R_k\ge R,\quad k\in\mathcal K, \tag{10}$$
$$\sum_{j=1}^{K}\operatorname{Tr}(\mathbf W_j)\le P_{\max},\quad 0<u\le a_k,\quad v\ge b_k>0,\quad x\ge R_{I,k},\quad s_k>0,\quad k\in\mathcal K, \tag{11}$$
$$\sum_{j=1}^{K} f_{jk}+\sigma_{1k}^2+\beta_k\sigma_{2k}^2\ge a_k,\quad \sum_{j\neq k} f_{jk}+\sigma_{1k}^2+\beta_k\sigma_{2k}^2\le b_k,\quad k\in\mathcal K, \tag{12}$$
$$\begin{bmatrix}\lambda_{kk}\mathbf Q+\mathbf W_k & \mathbf W_k\bar{\mathbf h}_{kk}\\ \bar{\mathbf h}_{kk}^H\mathbf W_k & \bar{\mathbf h}_{kk}^H\mathbf W_k\bar{\mathbf h}_{kk}-f_{kk}-\lambda_{kk}\end{bmatrix}\succeq 0,\quad \lambda_{kk}\ge 0,\quad k\in\mathcal K, \tag{13}$$
$$\begin{bmatrix}\lambda_{jk}\mathbf Q-\mathbf W_j & -\mathbf W_j\bar{\mathbf h}_{jk}\\ -\bar{\mathbf h}_{jk}^H\mathbf W_j & -\bar{\mathbf h}_{jk}^H\mathbf W_j\bar{\mathbf h}_{jk}+f_{jk}-\lambda_{jk}\end{bmatrix}\succeq 0,\quad \lambda_{jk}\ge 0,\quad j\neq k,\ k\in\mathcal K, \tag{14}$$
$$\mathbf W_k\succeq 0,\quad \operatorname{Tr}(\mathbf G_{kk}\mathbf W_k)\le s_k,\quad k\in\mathcal K, \tag{15}$$
$$-\ln p_k - \xi_k\sigma_{3k}^2 - \sum_{j\neq k}\ln\bigl(1+\xi_k\operatorname{Tr}(\mathbf G_{jk}\mathbf W_j)\bigr)\le 0,\quad k\in\mathcal K, \tag{16}$$
$$R_k - R_{I,k} + \log(1+\xi_k s_k)\le 0,\quad k\in\mathcal K, \tag{17}$$
$$2^{x}v - u\le 0. \tag{18}$$

Proof. See Appendix A.

Denote X = ({ρk }, {Wk }, {βk }, {λjk }, {fjk }, {RI,k }, {sk }, {ak }, {Rk }, R).
The problem is difficult due to the nonconvexity of constraints (16)–(18).
Since it is hard to optimize all variables, a natural idea is to decouple into two
variable subsets (X and ξ) and then use the alternating optimization procedure.
First, we fix $\xi_k = \xi_k^t$ and compute $X$. The problem is nonconvex due to constraints (17) and (18). We propose a DC decomposition for the constraint (17) as $F_0(z_k) = G_0(z_k) - H_0(z_k)\le 0$, where $z_k = (R_k, R_{I,k}, s_k)$, $G_0(z_k) = R_k - R_{I,k}$ and $H_0(z_k) = -\log(1+\xi_k^t s_k)$. For the constraint (18): $F_1(z) = G_1(z) - H_1(z)\le 0$, where $z = (x,u,v)$, $G_1(z) = \frac{\rho}{2}(x^2+u^2+v^2)$ and $H_1(z) = \frac{\rho}{2}(x^2+u^2+v^2) + u - 2^xv$. It is not easy to estimate $\rho$ such that $H_1$ is convex; thus, we investigate the DCA-Like based algorithm [9]. Then, we fix $X = X^{t+1}$ and compute $\xi_k$.
Our proposed algorithm is described in Algorithm 1.

Algorithm 1. The DCA based algorithm for solving the problem (5)

Initialization: Let $(\{\xi_k^0\},\{s_k^0\}, x^0, u^0, v^0)$ be an initial point, $t\leftarrow 0$.
repeat
1. Fix $\xi_k = \xi_k^t$, compute $X^{t+1}$.
   Initialization: Let $(s_k^t, x^t, u^t, v^t)$ be an initial point, $l\leftarrow t$.
   repeat
   1.1. Compute $\nabla H_1(z^l) = \bigl(\rho x^l - 2^{x^l}v^l\ln 2,\ \rho u^l + 1,\ \rho v^l - 2^{x^l}\bigr)$,
   1.2. Compute $z^{l+1} = (x^{l+1}, u^{l+1}, v^{l+1})$ by solving the problem
        $$\max_{X}\ R$$
        $$\text{s.t. } (10)\text{–}(16),$$
        $$R - R_{I,k} + \log(1+\xi_k^t s_k^l) + \frac{\xi_k^t(s_k - s_k^l)}{(1+\xi_k^t s_k^l)\ln 2}\le 0,\quad k\in\mathcal K,$$
        $$G_1(z) - H_1(z^l) - \langle\nabla H_1(z^l), z - z^l\rangle\le 0.$$
   1.3. While $H_1(z^{l+1}) < H_1(z^l) + \langle\nabla H_1(z^l), z^{l+1}-z^l\rangle$:
        $\rho\leftarrow\eta\rho$ and update $z^{l+1}$ by step 1.2.
   1.4. $l\leftarrow l+1$,
   until stopping condition.
2. Fix $X = X^{t+1}$, compute $\xi_k^{t+1}$.
   Initialization: Let $\xi_k^t$ be an initial point, $l\leftarrow t$.
   repeat
   2.1. Compute $\xi^{l+1}$ by solving the problem
        $$\max_{\{\xi_k\}}\ R^{t+1}$$
        $$\text{s.t. } -\ln p_k - \xi_k\sigma_{3k}^2 - \sum_{j\neq k}\ln\bigl(1+\xi_k\operatorname{Tr}(\mathbf G_{jk}\mathbf W_j^{t+1})\bigr)\le 0,\quad k\in\mathcal K,$$
        $$R^{t+1} - R_{I,k} + \log(1+\xi_k^l s_k^{t+1}) + \frac{s_k^{t+1}(\xi_k - \xi_k^l)}{(1+\xi_k^l s_k^{t+1})\ln 2}\le 0,\quad k\in\mathcal K.$$
   2.2. $l\leftarrow l+1$,
   until stopping condition.
3. $t\leftarrow t+1$,
until stopping condition.

Remark 1. We can apply DCA to derive the beamforming vector $\mathbf w_j^*$ from $\mathbf W_j^*$. Indeed, $\mathbf W_j = \mathbf w_j\mathbf w_j^H$ can be rewritten as the following problem (see [18])
$$\min_{\mathbf A_1,\mathbf A_2,\mathbf w_j}\ 0, \tag{19}$$
$$\text{s.t.}\quad \begin{bmatrix}\mathbf A_1 & \mathbf W_j & \mathbf w_j\\ \mathbf W_j^H & \mathbf A_2 & \mathbf w_j\\ \mathbf w_j^H & \mathbf w_j^H & 1\end{bmatrix}\succeq 0, \tag{20}$$
$$\operatorname{Tr}(\mathbf A_1 - \mathbf w_j\mathbf w_j^H)\le 0. \tag{21}$$
The problem (19) is nonconvex due to the quadratic inequality constraint (21). The constraint (21) can be formulated as a DC constraint whose two DC components are $G_2(\mathbf A_1,\mathbf w_j) = \operatorname{Tr}(\mathbf A_1)$ and $H_2(\mathbf A_1,\mathbf w_j) = \operatorname{Tr}(\mathbf w_j\mathbf w_j^H)$.
The general DCA [8] for solving the problem (19) iteratively computes $\mathbf w_j^{k+1}$ by solving the problem
$$\min_{\mathbf A_1,\mathbf A_2,\mathbf w}\ 0,\quad \text{s.t. } (20),\quad -\|\mathbf w_j^k\|^2 + 2\bigl(\Re(\mathbf w_j^k)\bigr)^T\Re(\mathbf w_j) + 2\bigl(\Im(\mathbf w_j^k)\bigr)^T\Im(\mathbf w_j)\ge\operatorname{Tr}(\mathbf A_1).$$
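A simpler alternative often used in practice, and distinct from the DCA of Remark 1, is to extract a beamforming vector from the dominant eigenpair of $\mathbf W_j^*$ whenever the latter is (numerically) rank one; a hedged MATLAB sketch:

```matlab
% Hedged sketch (not the authors' method): rank-one extraction of w_j from W_j.
[V, D]   = eig((Wj + Wj')/2);           % Hermitian part guards against round-off
[lam, k] = max(real(diag(D)));          % dominant eigenvalue and its index
wj       = sqrt(max(lam, 0)) * V(:,k);  % valid only when W_j is (close to) rank one
```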

4 Numerical Experiments

The numerical experiments aim to evaluate the performance of the proposed algorithm by comparing it with SCA.
Setup of experiments: All algorithms were implemented in Matlab R2016b and executed on a PC with Intel(R) Core(TM) 2 Quad processors (2.83 GHz) and 4 GB of RAM, with the same datasets and initial point. The CVX¹ toolbox is used to solve the subproblems. The elements of all the legitimate channel vectors are generated by a complex Gaussian distribution with zero mean and unit variance. For the channel error models, $\mathbf Q_{jk} = \epsilon^{-2}\mathbf I$, $\mathbf G_{jk} = \gamma_{jk}\mathbf I$, where $\gamma_{jk}$ denotes the average channel correlation gain. We assume that $E = E_k$, $p = p_k$, $\sigma^2 = \sigma_{1k}^2 = \sigma_{2k}^2 = \sigma_{3k}^2$, $\gamma = \gamma_{jk}$ for all $k, j$. Unless otherwise specified, we set by default $K = 5$, $n = 4$, $P = 5$, $E = 0.001$, $p = 0.1$, $\sigma = 0.01$, $\gamma = 1$, $\epsilon = 0.05$.
Comments on numerical results:
We are interested in the effect of the coefficient error $\epsilon$, the channel correlation
gain and secrecy outage probabilities. For each value of these parameters, the
algorithms are tested on 10 independent channel realizations and the average
values of the secrecy rate (SSR-AVER) and CPU time in seconds (CPU(s)) are
reported.
In Fig. 1, we compare the SSR-AVER of DCA and SCA algorithm under
different coefficient errors. It can be seen that DCA is much more efficient than
SCA in terms of both SSR-AVER and CPU time. In terms of SSR-AVER, the
gain varies from 1.1295 to 2.2538. Concerning CPU time, the ratio of gain varies
from 3.9 to 4.4 times.
Figure 2 presents the SSR-AVER under different channel correlation gains.
DCA always provides the better solutions. Regarding SSR-AVER, the gain varies
from 1.9620 to 2.2722. Concerning CPU time, the ratio of gain is around 4.3
times.
Figure 3 shows the results when the secrecy outage probability increases. This figure shows again the performance superiority of DCA as compared to SCA in terms
of both SSR-AVER and CPU time. In terms of SSR-AVER, the gain varies from
2.102 to 2.2538. As for CPU time, the ratio of gain varies from 4.2 to 5.6 times.
In summary, the proposed DCA algorithm is more efficient than the existing
method in terms of the quality and the rapidity.


Fig. 1. Coefficient channel versus SSR-AVER and CPU(s)

¹ Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.0. Online: http://cvxr.com/cvx (2012).


Fig. 2. Channel correlation gain versus SSR-AVER and CPU(s)



Fig. 3. Secrecy outage probabilities versus SSR-AVER and CPU(s)

5 Conclusions
In this paper, we considered the secrecy rate maximization problem under the
total transmit power, the energy harvesting, and the outage probability con-
straints. By studying the special structure of this problem, we reformulated the
problem as a DC program. To reduce the dimension of the problem, we proposed
an efficient approach based on DCA and alternating optimization for solving the
problem. Numerical experiments confirmed the efficiency of our proposed algorithm compared with the existing method.

Appendix A
First, we transform the EH constraint. According to [4], we rewrite $|\mathbf h_{jk}^H\mathbf w_j|^2 = \mathbf w_j^H(\bar{\mathbf H}_{jk}+\mathbf d_{jk})\mathbf w_j$, where $\bar{\mathbf H}_{jk} = \bar{\mathbf h}_{jk}\bar{\mathbf h}_{jk}^H$ and $\mathbf d_{jk} = \bar{\mathbf h}_{jk}\Delta\mathbf h_{jk}^H + \Delta\mathbf h_{jk}\bar{\mathbf h}_{jk}^H + \Delta\mathbf h_{jk}\Delta\mathbf h_{jk}^H$. By applying the triangle inequality and the Cauchy–Schwarz inequality, we have
$$\|\mathbf d_{jk}\|\le\|\bar{\mathbf h}_{jk}\Delta\mathbf h_{jk}^H\| + \|\Delta\mathbf h_{jk}\bar{\mathbf h}_{jk}^H\| + \|\Delta\mathbf h_{jk}\Delta\mathbf h_{jk}^H\|\le\|\mathbf Q_{jk}^{-1}\| + 2\|\bar{\mathbf h}_{jk}\|\sqrt{\|\mathbf Q_{jk}^{-1}\|} = \epsilon_{jk} \;\Rightarrow\; -\epsilon_{jk}\mathbf I_N\preceq\mathbf d_{jk}\preceq\epsilon_{jk}\mathbf I_N,$$
thus, $\operatorname{Tr}\bigl((\bar{\mathbf H}_{jk}-\epsilon_{jk}\mathbf I_N)\mathbf W_j\bigr)\le|\mathbf h_{jk}^H\mathbf w_j|^2 = \operatorname{Tr}\bigl((\bar{\mathbf H}_{jk}+\mathbf d_{jk})\mathbf W_j\bigr)$.

Therefore, the EH constraint is recast as $\sum_{j=1}^{K}\operatorname{Tr}(\bar{\mathbf H}_{jk}\mathbf W_j - \epsilon_{jk}\mathbf W_j)\ge\frac{E_k}{1-\rho_k}$.
Next,
$$R_{I,k}\le\log\Bigl(1+\frac{\rho_k\,\bar{\mathbf h}_{kk}^H\mathbf W_k\bar{\mathbf h}_{kk}}{\rho_k\sum_{j\neq k}\bar{\mathbf h}_{jk}^H\mathbf W_j\bar{\mathbf h}_{jk}+\rho_k\sigma_{1k}^2+\sigma_{2k}^2}\Bigr) \tag{22}$$

is reformulated similarly to [1] as
$$\begin{cases} \beta_k\ge\dfrac{1}{\rho_k},\quad \displaystyle\sum_{j=1}^{K} f_{jk}+\sigma_{1k}^2+\beta_k\sigma_{2k}^2\ge a_k,\quad \displaystyle\sum_{j\neq k} f_{jk}+\sigma_{1k}^2+\beta_k\sigma_{2k}^2\le b_k,\\[2mm] 0<u\le a_k,\quad v\ge b_k>0,\quad \lambda_{jk}\ge 0,\quad x\ge R_{I,k},\quad 2^{x}v-u\le 0,\\[2mm] \bar{\mathbf h}_{kk}^H\mathbf W_k\bar{\mathbf h}_{kk}\ge f_{kk} \;\Leftrightarrow\; \begin{bmatrix}\lambda_{kk}\mathbf Q+\mathbf W_k & \mathbf W_k\bar{\mathbf h}_{kk}\\ \bar{\mathbf h}_{kk}^H\mathbf W_k & \bar{\mathbf h}_{kk}^H\mathbf W_k\bar{\mathbf h}_{kk}-f_{kk}-\lambda_{kk}\end{bmatrix}\succeq 0,\\[2mm] \bar{\mathbf h}_{jk}^H\mathbf W_j\bar{\mathbf h}_{jk}\le f_{jk} \;\Leftrightarrow\; \begin{bmatrix}\lambda_{jk}\mathbf Q-\mathbf W_j & -\mathbf W_j\bar{\mathbf h}_{jk}\\ -\bar{\mathbf h}_{jk}^H\mathbf W_j & -\bar{\mathbf h}_{jk}^H\mathbf W_j\bar{\mathbf h}_{jk}+f_{jk}-\lambda_{jk}\end{bmatrix}\succeq 0,\quad j\neq k. \end{cases}$$

The relaxed constraints hold with equalities at the optimal solution [1].
By using slack variables $\xi_k = \frac{2^{R_{I,k}-R_k}-1}{\operatorname{Tr}(\mathbf G_{kk}\mathbf W_k)}$, the outage constraint is transformed into
$$\begin{cases} -\ln p_k - \xi_k\sigma_{3k}^2 - \displaystyle\sum_{j\neq k}\ln\bigl(1+\xi_k\operatorname{Tr}(\mathbf G_{jk}\mathbf W_j)\bigr)\le 0,\\[2mm] \operatorname{Tr}(\mathbf G_{kk}\mathbf W_k)\le s_k,\quad 0<s_k,\quad 0<\xi_k,\\[2mm] \xi_k\le\dfrac{2^{R_{I,k}-R_k}-1}{s_k} \;\Leftrightarrow\; R_k - R_{I,k} + \log_2(1+\xi_k s_k)\le 0. \end{cases}$$

At the optimum, $\xi_k = \frac{2^{R_{I,k}-R_k}-1}{\operatorname{Tr}(\mathbf G_{kk}\mathbf W_k)}$; if not, we could increase $R_k$. In addition, $\operatorname{Tr}(\mathbf G_{kk}\mathbf W_k) = s_k$ at the optimum; otherwise, we could decrease $s_k$, leading to an increase of $R_k$ because $\xi_k = \frac{2^{R_{I,k}-R_k}-1}{\operatorname{Tr}(\mathbf G_{kk}\mathbf W_k)}$. Hence, if the relaxed constraints do not hold with equalities at the optimum, the objective function can be further increased.

References
1. Chen, D., He, Y., Lin, X., Zhao, R.: Both worst-case and chance-constrained robust
secure SWIPT in MISO interference channels. IEEE Trans. Inf. Forensics Secur.
13(2), 306–317 (2018)
2. Chu, Z., Zhu, Z., Hussein, J.: Robust optimization for AN-aided transmission and
power splitting for secure MISO SWIPT system. IEEE Commun. Lett. 20(8),
1571–1574 (2016)
3. Feng, Y., Yang, Z., Zhu, W., Li, Q., Lv, B.: Robust cooperative secure beamforming
for simultaneous wireless information and power transfer in amplify-and-forward
relay networks. IEEE Trans. Veh. Technol. 66(3), 2354–2366 (2017)
4. Gharavol, E.A., Liang, Y., Mouthaan, K.: Robust downlink beamforming in mul-
tiuser MISO cognitive radio networks with imperfect channel-state information.
IEEE Trans. Veh. Technol. 59(6), 2852–2860 (2010)
5. Khandaker, M.R.A., Wong, K., Zhang, Y., Zheng, Z.: Probabilistically robust
SWIPT for secrecy misome systems. IEEE Trans. Inf. Forensics Secur. 12(1), 211–
226 (2017)

6. Krikidis, I., Timotheou, S., Nikolaou, S., Zheng, G., Ng, D.W.K., Schober, R.:
Simultaneous wireless information and power transfer in modern communication
systems. IEEE Commun. Mag. 52(11), 104–110 (2014)
7. Le, T.A., Vien, Q., Nguyen, H.X., Ng, D.W.K., Schober, R.: Robust chance-
constrained optimization for power-efficient and secure SWIPT systems. IEEE
Trans. Green Commun. Netw. 1(3), 333–346 (2017)
8. Le Thi, H.A., Huynh, V.N., Pham Dinh, T.: Dc programming and DCA for general
dc programs. In: Advanced Computational Methods for Knowledge Engineering,
pp. 15–35 (2014)
9. Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B.: A DCA-like algorithm and its accelerated version with application in data visualization (2018). CoRR abs/1806.09620
10. Le Thi, H.A., Pham Dinh, T.: The DC (difference of convex functions) program-
ming and DCA revisited with dc models of real world nonconvex optimization
problems. Ann. Oper. Res. 133(1), 23–46 (2005)
11. Le Thi, H.A., Pham Dinh, T.: DC programming and DCA: thirty years of devel-
opments. Math. Program. 169(1), 5–68 (2018)
12. Lei, H., Ansari, I.S., Pan, G., Alomair, B., Alouini, M.: Secrecy capacity analysis
over α − μ fading channels. IEEE Commun. Lett. 21(6), 1445–1448 (2017)
13. Li, Q., Ma, W.: Secrecy rate maximization of a MISO channel with multiple multi-
antenna eavesdroppers via semidefinite programming. In: 2010 IEEE International
Conference on Acoustics, Speech and Signal Processing, pp. 3042–3045 (2010)
14. Ma, S., Hong, M., Song, E., Wang, X., Sun, D.: Outage constrained robust secure
transmission for MISO wiretap channels. IEEE Trans. Wirel. Commun. 13(10),
5558–5570 (2014)
15. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to D.C. programming:
theory, algorithm and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)
16. Pham Dinh, T., Le Thi, H.A.: D.C. optimization algorithms for solving the trust
region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
17. Pham Dinh, T., Le Thi, H.A.: Recent advances in DC programming and DCA. In:
Transactions on Computational Intelligence XIII. pp. 1–37. Springer, Heidelberg
(2014)
18. Rashid, U., Tuan, H.D., Kha, H.H., Nguyen, H.H.: Joint optimization of source
precoding and relay beamforming in wireless MIMO relay networks. IEEE Trans.
Commun. 62(2), 488–499 (2014)
19. Tian, M., Huang, X., Zhang, Q., Qin, J.: Robust AN-aided secure transmission
scheme in MISO channels with simultaneous wireless information and power trans-
fer. IEEE Signal Process. Lett. 22(6), 723–727 (2015)
20. Wang, K., So, A.M., Chang, T., Ma, W., Chi, C.: Outage constrained robust trans-
mit optimization for multiuser MISO downlinks: tractable approximations by conic
optimization. IEEE Trans. Signal Process. 62(21), 5690–5705 (2014)
21. Wang, S., Wang, B.: Robust secure transmit design in MIMO channels with simulta-
neous wireless information and power transfer. IEEE Signal Process. Lett. 22(11),
2147–2151 (2015)
DCA-Like, GA and MBO: A Novel
Hybrid Approach for Binary Quadratic
Programs

Sara Samir1(B) , Hoai An Le Thi1 , and Mohammed Yagouni2


1
Computer Science and Applications Department, LGIPM, University of Lorraine,
Metz, France
{sara.samir,hoai-an.le-thi}@univ-lorraine.fr
2
LaROMaD, USTHB, Alger, Algeria
myagouni@usthb.dz

Abstract. To solve problems of quadratic binary programming, we sug-


gest a hybrid approach based on the cooperation of a new version of
DCA (Difference of Convex functions Algorithm), named the DCA-Like,
a Genetic Algorithm and the Migrating Bird Optimization algorithm. The component algorithms start in a parallel way by adopting the Master-Slave model. The best-found solution is distributed to all algorithms by
using the Message Passing Interface (MPI) library. At each cycle, the
obtained solution serves as a starting point for the next cycle’s compo-
nent algorithms. To evaluate the performance of our approach, we test on
a set of benchmarks of the quadratic assignment problem. The numerical
results clearly show the effectiveness of the cooperative approach.

Keywords: Binary quadratic programming problem ·


DC programming and DCA · Metaheuristics ·
Parallel and distributed programming · Genetic algorithm ·
Migrating bird optimization

1 Introduction
The binary quadratic programs (BQPs) are NP-hard combinatorial optimization
problems which take the following mathematical form:


$$(BQP)\quad \begin{cases} \min\ Z(x) = x^TQx + c^Tx,\\ Ax = b,\\ Bx\le b',\\ x\in\{0,1\}^n, \end{cases} \tag{1}$$
where $Q\in\mathbb{R}^{n\times n}$, $c\in\mathbb{R}^n$, $A\in\mathbb{R}^{m\times n}$, $B\in\mathbb{R}^{p\times n}$, $b\in\mathbb{R}^m$, $b'\in\mathbb{R}^p$.
BQP is a common model of several problems in different areas including
scheduling, facility location, assignment and knapsack.
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 299–309, 2020.
https://doi.org/10.1007/978-3-030-21803-4_31

Many exact methods have been developed to solve the BQP such as branch-
and-bound and cutting plane. The main limit of these methods is their expo-
nential execution time. Thus, they quickly become unusable for realistic size
instances. To effectively solve these problems, researchers have directed their
efforts towards the development of methods known as heuristics. In order to
save the time, the researchers have widely studied heuristics: e.g. genetic algo-
rithms, scatter search, ant colony optimization, tabu search, variable neighbor-
hood search, cuckoo Search, migrating bird optimization. The DC (Difference
of Convex function) programming and DCA (DC Algorithm) constitute another
research direction which has been successfully applied to BQP (see, [8,9,17]).
The main contribution of this study to the literature lies in a new cooperative
approach using DCA-like and Metaheuristic methods, named COP-DCAl -Meta,
for solving BQP. COP-DCAl -Meta is inspired by The Collaborative Metaheuris-
tic optimization Scheme proposed by Yagouni and Hoai An in 2014 [20]. COP-
DCAl -Meta consists in combining DCA-like (a new variant of DCA), a genetic
algorithm and the migrating bird optimization metaheuristic, in a cooperative
way. The participating algorithms start running in parallel. Then, the solution of
every algorithm will be distributed to the other ones via MPI (Message Passing
Interface). We opted for DCA-like due to its power for solving nonconvex pro-
grams in different areas. As for GA and MBO, their efficiency has been proved
in combinatorial optimization which motivated us to use them. To evaluate the
performance of COP-DCAl -Meta, we test it on instances of the well-known quadratic assignment problem (QAP).
DC programming and DCA were introduced first in 1985 by Pham Dinh Tao
and then have been extensively developed since 1994 by Le Thi Hoai An and
Pham Dinh Tao. To understand DC programming and DCA, we refer the reader
to a seminal survey of Hoai An and Pham in [9]. DCA is a continuous approach
which showed its efficiency for solving combinatorial optimization problems [7,
18] by using the exact penalty techniques (see [10,11]).
Genetic algorithms are evolutionary algorithms proposed by Holland [3] and
inspired by the process of natural selection. In the literature, we can find a lot
of applications of genetic algorithms [4,13,14].
Migrating bird optimization (MBO) was presented by Duman et al. in 2012
[1]. MBO is inspired by the V-formation flight of migrating birds. It has been
proved to be efficient in combinatorial optimization (see, e.g., [1,2,19])
This paper is organized as follows. After the introduction, the component
algorithms including DCA-like, GA, MBO and their application for solving BQP
are briefly presented in Sect. 2. Section 3 is devoted to the cooperative approach
for BQP. Numerical results are reported and discussed in Sect. 4, and Sect. 5 concludes the paper.

2 DCA-Like, GA and MBO for Solving BQP


In this study, we use three algorithms to design our cooperative approach. First
of all, we briefly outline an overall description of DC programming and DCA,
GA, MBO as well as their application to the BQP.

2.1 DC Programming and DCA


DC programming and DCA have been applied to many large-scale nonconvex
problem with impressive results in diverse areas. They adresse the DC program
whose the objectif function is DC on the whole space Rn or on a convex set
C ∈ Rn . A standard DC program is of the form

min {f (x) = g(x) − h(x) : x ∈ Rn } , (2)

where $g, h \in \Gamma_0(\mathbb{R}^n)$, the set of all lower semi-continuous proper convex functions on $\mathbb{R}^n$. The functions g and h are called DC components, while g − h is a DC decomposition of f. To avoid ambiguity in DC programming, the convention $(+\infty) - (+\infty) = +\infty$ is used [16]. A constrained DC program is defined by

$$\min \{f(x) = g(x) - h(x) : x \in C\}, \qquad (3)$$

where C is a nonempty closed convex set. It can be clearly seen that (3) is a special case of (2). A constrained DC program can be transformed into an unconstrained DC program by adding the indicator function $\chi_C$ of the set C ($\chi_C(x) = 0$ if $x \in C$, $+\infty$ otherwise) to the first DC component g:

$$(3) \Leftrightarrow \min \{f(x) = \chi_C(x) + g(x) - h(x) : x \in \mathbb{R}^n\}. \qquad (4)$$

DCA is an iterative algorithm based on local optimality. At each iteration k, DCA computes $y^k \in \partial h(x^k)$ and then solves the convex sub-problem

$$\min \left\{ g(x) - \left[ h(x^k) + \langle x - x^k, y^k \rangle \right] : x \in \mathbb{R}^n \right\}. \qquad (5)$$

2.2 DC Reformulation of BQP


Many works have tackled DCA for solving BQP [6,7,12,18]. In order to apply DC programming and DCA, we reformulate BQP as a concave quadratic program with continuous variables using an exact penalty function [6,7,11,18].

Let $\Omega = \{x \in [0,1]^n : Ax = b, \ Bx \le b'\}$. BQP is equivalent to the following formulation
$$\min \{Z(x) : x \in \Omega, \ p(x) \le 0\}, \qquad (6)$$
where $p(x) = \sum_{i=1}^{n} x_i(1 - x_i)$ is the penalty function.
We define $\mathrm{BQP}_t$, the penalized program of BQP, as follows:
$$\min \{F(x) = Z(x) + tp(x) : x \in \Omega\}. \qquad (7)$$
According to the exact penalty theorem [10], there exists a number $t_0 > 0$ such that for all $t > t_0$ the problems (1) and (7) are equivalent, in the sense that they have the same optimal value and the same optimal solution set.
We can use the following DC components of F:
• $G_\rho(x) = \frac{\rho}{2}\|x\|^2 + \chi_\Omega(x)$,
• $H_\rho(x) = \frac{\rho}{2}\|x\|^2 - Z(x) - tp(x)$,

with ρ > 0.
We can see that $G_\rho(x)$ is convex since $\chi_\Omega$ is convex. As for $H_\rho(x)$, its convexity depends on the value of $\rho$. In practice, it is hard to determine the best value of $\rho$, so it is estimated by a large value. With $\rho$ larger than the spectral radius of $\nabla^2 Z(x)$, denoted $\rho(\nabla^2 Z(x))$, $H_\rho(x)$ is convex. Since $\rho(\nabla^2 Z(x)) \le \|\nabla^2 Z(x)\|$, we can choose $\rho = \|\nabla^2 Z(x)\|$, where
$$\|\nabla^2 Z(x)\| = \sum_{i,j,p,q} (a_{ip} d_{jq} + a_{pi} d_{qj}). \qquad (8)$$
However, $\rho = \|\nabla^2 Z(x)\|$ is quite large, which can affect the convergence of DCA. Hence, we update $\rho$ as in the following DCA-like algorithm [5].

Algorithm 1 DCA-Like for solving (7)

Initialization: Choose $x^0$, $\zeta \ge 1$, $\rho_0 = \rho/\zeta$, $\eta_1 > 1$, $\eta_2 > 1$, $t$, $t_{\max}$, $\xi \ge 1$ and $k = 0$.
repeat
1. Compute $\rho = \max\{\rho_0, \rho^k/\eta_1\}$.
2. Compute $y^k = \nabla H_\rho(x^k)$ by
   $$y^k = \rho x^k - \nabla Z(x^k) - te + 2tx^k, \qquad (9)$$
   where e is the all-ones vector.
3. Compute $\tilde{x} \in \arg\min\left\{\frac{\rho}{2}\|x\|^2 - \langle y^k, x\rangle : x \in \Omega\right\}$.
4. While $H_\rho(\tilde{x}) < H_{\rho^k}(x^k) + \langle \tilde{x} - x^k, y^k\rangle$ do
   4.1 $\rho = \eta_2 \rho$.
   4.2 If $t < t_{\max}$, set $t = \xi t$.
   4.3 Update $y^k$ and $\tilde{x}$ by Steps 2 and 3.
   End.
5. Set $k = k + 1$.
6. Set $x^k = \tilde{x}$.
7. Set $\rho^k = \rho$.
until Stopping criterion is satisfied.
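To make the scheme above concrete, the following is a minimal, illustrative Python/NumPy sketch of the DCA-like loop for the penalized problem (7). It assumes, purely for illustration, that $Z(x) = x^T Q x$ for a symmetric matrix Q and that $\Omega$ is simply the box $[0,1]^n$ (no linear constraints), so the convex subproblem in Step 3 has the closed-form solution $\mathrm{clip}(y^k/\rho, 0, 1)$; with linear constraints one would call a QP solver instead. All names and parameter values are illustrative, not those of the paper's implementation.

```python
import numpy as np

def dca_like_box(Q, t=10.0, t_max=1e6, zeta=2.0, eta1=2.0, eta2=1.5, xi=1.2,
                 max_iter=200, tol=1e-6, seed=0):
    """Illustrative DCA-like loop for min x'Qx + t*sum(x_i(1-x_i)) over [0,1]^n."""
    rng = np.random.default_rng(seed)
    n = Q.shape[0]
    x = rng.random(n)                          # x^0
    rho_full = np.abs(Q).sum()                 # rough stand-in for ||nabla^2 Z||
    rho_floor = rho_full / zeta                # rho_0
    rho_k = rho_full
    Z = lambda v: v @ Q @ v
    H = lambda v, r, tt: 0.5 * r * (v @ v) - Z(v) - tt * np.sum(v * (1 - v))
    for _ in range(max_iter):
        rho = max(rho_floor, rho_k / eta1)                     # Step 1
        y = rho * x - 2 * Q @ x - t + 2 * t * x                # Step 2: gradient of H_rho
        x_new = np.clip(y / rho, 0.0, 1.0)                     # Step 3: box subproblem
        while H(x_new, rho, t) < H(x, rho_k, t) + (x_new - x) @ y:   # Step 4
            rho *= eta2
            if t < t_max:
                t *= xi
            y = rho * x - 2 * Q @ x - t + 2 * t * x
            x_new = np.clip(y / rho, 0.0, 1.0)
        if np.linalg.norm(x_new - x) < tol:                    # stopping criterion
            x = x_new
            break
        x, rho_k = x_new, rho
    return x
```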

2.3 Genetic Algorithms and Its Application to BQP


GAs, proposed by Holland in 1975 [3], are adaptive stochastic evolutionary algorithms inspired by the biological model of the natural evolution of species. The population of a GA is composed of chromosomes, which are the genetic representation of the solutions to our problem. The quality of each chromosome is evaluated by using a fitness function. The GA is based mainly on a few operators. The operators usually used in a GA are:
Selection: A subset of the population is selected to be mated, using one of the selection procedures such as tournament selection, roulette wheel selection, rank selection, random selection, etc.

Crossover: It is applied to the parents in order to create new individuals called children. A child takes its characteristics from both of its parents.
Mutation: It consists of randomly changing one or more genes of a chromosome. The aim of the mutation is to create new properties in the population.
Elitism: It keeps the best individuals for the next generation.
In Algorithm 2, we give the pseudocode of the GA.

Algorithm 2 Pseudo code of GA


1. Initialization.
repeat
2. Selection.
3. Crossover operator.
4. Mutation operator.
5. Evaluate the fitness of new chromosomes.
6. Elitism and updating of the population.
until Stopping criterion.
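As an illustration of these operators, here is a minimal GA sketch for minimizing an objective over binary vectors, assuming tournament selection, one-point crossover, bit-flip mutation and elitism; the population size, rates and fitness function are illustrative choices, not the settings used in the paper.

```python
import numpy as np

def ga_binary(fitness, n, pop_size=40, generations=200, pc=0.8, pm=0.02, elite=2, seed=0):
    """Minimal GA for minimizing `fitness` over {0,1}^n (illustrative settings)."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)                              # lower is better
        new_pop = [pop[i].copy() for i in order[:elite]]        # elitism
        while len(new_pop) < pop_size:
            # tournament selection (size 2) of two parents
            a, b = rng.integers(0, pop_size, 2), rng.integers(0, pop_size, 2)
            p1 = pop[a[np.argmin(scores[a])]]
            p2 = pop[b[np.argmin(scores[b])]]
            child = p1.copy()
            if rng.random() < pc:                               # one-point crossover
                cut = rng.integers(1, n)
                child[cut:] = p2[cut:]
            flip = rng.random(n) < pm                           # bit-flip mutation
            child[flip] = 1 - child[flip]
            new_pop.append(child)
        pop = np.array(new_pop)
    return pop[np.argmin([fitness(ind) for ind in pop])]

# Example usage: minimize a random quadratic x'Qx over binary vectors
# Q = np.random.randn(20, 20); Q = (Q + Q.T) / 2
# x_best = ga_binary(lambda x: x @ Q @ x, n=20)
```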

2.4 Migrating Bird Optimization

Contemplating and studying the V formation flight of migrating birds gave rise to the design of a new metaheuristic for combinatorial optimization problems. This metaheuristic, called migrating bird optimization (MBO), was introduced by Duman et al. in [1]. It consists in exploiting the power saving of the V-shaped flight to minimize (or maximize) an objective function. MBO is based on a population of birds and uses a neighborhood search technique. A bird in a combinatorial optimization problem represents a solution. To keep the analogy with the V formation, one bird is considered the leader and the others constitute the right and left lines. MBO processes the birds from the leader to the tails along the lines, as shown in Algorithm 3.

3 COP-DCAl -Meta: The Cooperative Approach


COP-DCAl-Meta is inspired by the Metastorming approach [20], which is the brainstorming of metaheuristics. Brainstorming was presented by Osborn in 1948 [15]. Its main goal is to give birth to new ideas through a discussion between a group of participants. In brainstorming, the ideas of all participating members are taken into consideration. The similarity between real brainstorming and the brainstorming of metaheuristics is investigated in [20]. The difference between COP-DCAl-Meta and Metastorming lies in the animator, which is not used in our work.
The principle of COP-DCAl-Meta is to create a storm between DCA-like, GA and MBO. In COP-DCAl-Meta, the collaboration is reflected by using three

Algorithm 3 MBO algorithm


Initialization: n, m, k, x.
1. Generate the initial population and choose the leader.
repeat
2. nbrTour = 0.
3. while (nbrTour < m) do
3.1. Generate k neighbors for the leader and improve it.
3.2. Share the 2x best neighbors of the leader with the left and right lines.
3.3. For all birds on the right and left lines, do
3.3.1. Generate k − x neighbors.
3.3.2. Improve the bird by using the shared and the generated neighbors.
3.3.3. Share the x best neighbors with the next bird.
End.
4. nbrTour = nbrTour + 1.
End.
5. Change the leader.
until Stopping criterion.
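The following is a simplified, illustrative Python sketch of MBO on permutation-encoded solutions (as in the QAP), using pairwise swaps as the neighborhood. The parameter names (n_birds, k neighbors, x shared, m tours) follow Algorithm 3, but the two flight lines are merged into a single ordered sequence for brevity; this is an assumption of the sketch, not the paper's exact implementation.

```python
import numpy as np

def swap_neighbor(perm, rng):
    q = perm.copy()
    i, j = rng.choice(len(perm), 2, replace=False)
    q[i], q[j] = q[j], q[i]
    return q

def mbo(cost, size, n_birds=11, k=5, x=1, m=10, n_iters=50, seed=0):
    """Simplified migrating bird optimization over permutations (illustrative)."""
    rng = np.random.default_rng(seed)
    flock = [rng.permutation(size) for _ in range(n_birds)]     # bird 0 is the leader
    for _ in range(n_iters):
        for _ in range(m):                                       # one block of m tours
            shared = []                                          # neighbors passed down the line
            for b, bird in enumerate(flock):
                n_new = k if b == 0 else k - x
                neigh = [swap_neighbor(bird, rng) for _ in range(n_new)] + shared
                neigh.sort(key=cost)
                if cost(neigh[0]) < cost(bird):                  # improve the bird
                    flock[b] = neigh[0]
                n_share = 2 * x if b == 0 else x
                shared = neigh[1:1 + n_share]                    # share the next-best neighbors
        flock.append(flock.pop(0))                               # change the leader
    return min(flock, key=cost)
```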

algorithms running in parallel to solve the same problem. The cooperation is shown by exchanging and distributing information. Using the master-slave model, we can describe COP-DCAl-Meta as follows.
The Parallel Initialization.
The master chooses an initial point for DCA-like. Then, it computes ρ using (8). Both slaves randomly generate their initial populations.
A Cycle (Parallel and Distributed).
The master runs one iteration of DCA-like. If a binary solution is found, it distributes a message indicating the end of the current cycle. Otherwise, the order to continue execution is broadcast. Slave 1 (slave 2) performs one iteration of GA (MBO) and receives the information to rerun or to go to the next step. This step is repeated until DCA-like obtains a binary solution, i.e., until the end of the cycle.
Parallel Exchanging and Evaluation.
At the end of a cycle, all component algorithms exchange the values of their objective functions (SDCA, SGA, and SMBO) with each other. After that, an evaluation is done to determine the algorithm that obtained the best solution. In this step, every process also broadcasts whether its algorithm satisfies its stopping criterion (Stop-DCA, Stop-GA, and Stop-MBO) or not.
If all criteria are met, then COP-DCAl-Meta ends and the final solution is BFS*. Otherwise, the best process distributes its solution (BFS) to the others, which use it as an initial solution for DCA-like, as a new chromosome for GA or as a new bird for MBO. At this point, COP-DCAl-Meta starts a new cycle (Fig. 1).

Fig. 1. Cooperative-DCA-like-Metaheuristics: cooperative scheme based on metaheuristics and DCA (InfoDCA-like: SDCA and Stop-DCA; InfoGA: SGA and Stop-GA; InfoMBO: SMBO and Stop-MBO).
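To illustrate the exchange-and-evaluation step, below is a minimal mpi4py sketch of the end-of-cycle communication (rank 0 playing the master running DCA-like, ranks 1 and 2 the GA and MBO slaves). The function run_one_cycle and its stopping flag are stand-ins for the component algorithms; only the MPI exchange pattern is meant to mirror the scheme described above.

```python
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # 0: DCA-like (master), 1: GA, 2: MBO

def run_one_cycle(rank, warm_start, cycle):
    """Placeholder for one cycle of this process's component algorithm."""
    objective = random.random() + rank            # stand-in objective value
    solution = {"from_rank": rank, "value": objective, "warm_start": warm_start}
    return objective, solution, cycle >= 3        # stand-in stopping criterion

best_solution = None
for cycle in range(10):
    objective, solution, stop_flag = run_one_cycle(rank, best_solution, cycle)

    # exchange objective values and stopping flags between all processes
    infos = comm.allgather((objective, stop_flag))

    if all(flag for _, flag in infos):            # all stopping criteria met
        break

    # the process holding the best objective broadcasts its solution (BFS)
    best_rank = min(range(len(infos)), key=lambda r: infos[r][0])
    best_solution = comm.bcast(solution if rank == best_rank else None,
                               root=best_rank)
```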

4 Numerical Results

In this section, we report the computational results of the cooperative approach COP-DCAl-Meta. The proposed approach has been coded in C++ and compiled with Microsoft Visual Studio 2017, using CPLEX version 12.6

as a convex quadratic solver. All experiments were carried out on a Dell desktop computer with an Intel Core(TM) i5-6600 CPU at 3.30 GHz and 8 GB of RAM. To study the performance of our approach, we test on seven instances of the quadratic assignment problem taken from QAPLIB (the Quadratic Assignment Problem Library1) in the OR-Library.2 We provide a comparison between COP-DCAl-Meta, the participating algorithms and the best known lower bound (BKLB, which is either the optimal solution or the best-known one) for these instances. The comparison takes into consideration the objective value, the gap (see Eq. (10)) and the running time measured in seconds.

Comparison Between COP-DCAl-Meta, the Component Algorithms and the Best Known Lower Bound

The experiments are performed on four algorithms: COP-DCAl-Meta, DCA-like, GA and MBO. The quality of each algorithm is evaluated by a comparison with the best known lower bound (BKLB) and the gap between the obtained objective value and the BKLB. The gap of an algorithm (A) is computed by the following formula:

$$\text{Gap} = \frac{|\text{Objective value given by } (A) - \text{BKLB}|}{\text{Objective value given by } (A)} \times 100\%. \qquad (10)$$

The results of the four algorithms are reported in Table 1. The first column gives the ID of each dataset, its size (which varies from 12 to 90) and the BKLB taken from the OR-Library. The remaining columns show the objective value, the gap and the CPU time obtained by each algorithm. From the numerical results, it can be seen that:

– In terms of objective value and gap, the cooperative approach COP-DCAl-Meta is the most efficient, followed by MBO, DCA-like and finally GA.
– COP-DCAl-Meta obtained the BKLB on 6/7 instances.
– The cooperation between the component algorithms allows them to change their behavior and browse more promising regions to get better results.
– DCA-like is very efficient on large instances. This advantage can be exploited by COP-DCAl-Meta.
– Regarding the running time, COP-DCAl-Meta consumes more time than each component algorithm. The difference is due to communication between the members.

1 http://anjos.mgi.polymtl.ca/qaplib//inst.html.
2 http://people.brunel.ac.uk/mastjjb/jeb/info.html.

Table 1. Comparison between COP-DCAl -Meta, DCA-like, GA and MBO.

Dataset Algorithm Objective value gap% CPU


Bur26a n = 26 COP-DCAl -Meta 5426670 0.00 1610.41
DCA-like 5438650 0.22 128.11
BKLB = 5426670 GA 5435200 0.16 403.77
MBO 5431680 0.09 241.69
Chr12a n = 12 COP-DCAl -Meta 9552 0 112.43
DCA-like 11418 16.34 17.65
BKLB = 9552 GA 10192 6.28 63.39
MBO 9552 0 1.55
Chr25a n = 25 COP-DCAl -Meta 3796 0 835.62
DCA-like 5200 27 62.66
BKLB = 3796 GA 4662 7.98 2050.78
MBO 4204 2.32 618.38
Nug12 n = 12 COP-DCAl -Meta 578 0 167.31
DCA-like 594 2.69 8.65
BKLB = 578 GA 586 1.37 214.655
MBO 578 0 12.0997
Esc32d n = 32 COP-DCAl -Meta 200 0 368.00
DCA-like 210 4.76 73.90
BKLB = 200 GA 200 0 59.8395
MBO 200 0 135.655
Tai35b n = 35 COP-DCAl -Meta 283315445 0 3208.83
DCA-like 283315445 0 295.11
BKLB = 283315445 GA 287271000 1.38 151.21
MBO 283335000 0.01 666.65
Lipa90 n = 90 COP-DCAl -Meta 362970 0.64 16027.00
DCA-like 360630 0.61 9797.05
BKLB = 360630 GA 363276 0.73 3533.52
MBO 362953 0.64 16618.7

5 Conclusion

We have investigated a new cooperative approach combining the deterministic algorithm DCA-like and two metaheuristics, GA and MBO, for solving BQPs. This approach, COP-DCAl-Meta, is inspired by Metastorming. We first gave a brief description of the component algorithms of COP-DCAl-Meta and then detailed the cooperation. COP-DCAl-Meta is applied to the quadratic assignment problem. In the experiments, COP-DCAl-Meta outperformed the three component algorithms, which shows that the cooperation between the component algorithms has been successfully realized. As an area of further research, we plan to combine DCA with other algorithms to solve other nonconvex problems.

References
1. Duman, E., Uysal, M., Alkaya, A.F.: Migrating birds optimization: a new meta-
heuristic approach and its performance on quadratic assignment problem. Inf. Sci.
217, 65–77 (2012)
2. Duman, E., Elikucuk, I.: Solving credit card fraud detection problem by the new
metaheuristics migrating birds optimization. In: Proceedings of the 12th Inter-
national Conference on Artificial Neural Networks. Advances in Computational
Intelligence, vol. II, pp. 62–71. Springer (2013)
3. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michi-
gan Press (1975)
4. Julstrom, B.A.: Greedy, genetic, and greedy genetic algorithms for the quadratic
knapsack problem. In: Proceedings of the 7th Annual Conference on Genetic and
Evolutionary Computation, pp. 607–614. ACM (2005)
5. Hoai An, L.T., Le, H.M., Phan, D.N., Tran, B.: A DCA-like algorithm and its
accelerated version with application in data visualization. https://arxiv.org/abs/
1806.09620 (2018)
6. Hoai An, L.T., Pham, D.T.: Solving a class of linearly constrained indefinite
quadratic problems by DC algorithms. J. Glob. Optim. 11(3), 253–285 (1997)
7. Hoai An, L.T., Pham, D.T.: A continuous approach for globally solving linearly
constrained quadratic. Optimization 50(1–2), 93–120 (2001)
8. Hoai An, L.T., Pham D.T.: A continuous approach for large-scale constrained
quadratic zero-one programming. Optimization 45(3): 1–28 (2001). (In honor of
Professor ELSTER, Founder of the Journal Optimization)
9. Hoai An, L.T., Pham, D.T.: DC programming and DCA: thirty years of develop-
ments. Math. Program. 169(1), 5–68 (2018)
10. Hoai An, L.T., Pham, D.T., Le, D.M.: Exact penalty in DC programming. Viet-
nam. J. Math. 27(2), 169–178 (1999)
11. Hoai An, L.T., Pham, D.T., Van Ngai, H.: Exact penalty and error bounds in DC
programming. J. Glob. Optim. 52(3), 509–535 (2011)
12. Hoai An, L.T., Pham, D.T., Yen, N.D.: Properties of two DC algorithms in
quadratic programming. J. Glob. Optim. 49(3), 481–495 (2011)
13. Merz, P., Freisleben, B.: Genetic algorithms for binary quadratic programming. In:
Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computa-
tion, vol. 1, pp. 417–424. Morgan Kaufmann Publishers Inc. (1999)
14. Misevicius, A., Staneviciene, E.: A new hybrid genetic algorithm for the grey pat-
tern quadratic assignment problem. Inf. Technol. Control. 47(3), 503–520 (2018)
15. Osborn, A.F.: Your creative power: how to use imagination to brighten life, to get
ahead. How To Organize a Squad To Create Ideas, pp. 265–274. Charles Scribner’s
Sons, New York (1948). ch. XXXIII.
16. Pham, D.T., Hoai An, L.T.: Convex analysis approach to DC programming: theory,
algorithm and applications. Acta Mathematica Vietnamica, 22(1), 289–355 (1997)
17. Pham, D.T., Hoai An, L.T., Akoa, F.: Combining DCA (DC Algorithms) and
interior point techniques for large-scale nonconvex quadratic programming. Optim.
Methods Softw. 23, 609–629 (2008)

18. Pham, D.T., Canh, N.N., Hoai An, L.T.: An efficient combined DCA and B&B
using DC/SDP relaxation for globally solving binary quadratic programs. J. Glob.
Optim. 48(4), 595–632 (2010)
19. Tongur, V., Ülker, E.: Migrating birds optimization for flow shop sequencing prob-
lem. J. Comput. Commun. 02, 142–147 (2014)
20. Yagouni, M., Hoai An, L.T.: A collaborative metaheuristic optimization scheme:
methodological issues. In: van Do, T., Thi, H.A.L., Nguyen, N.T. (eds.) Advanced
Computational Methods for Knowledge Engineering. Advances in Intelligent Sys-
tems and Computing, vol. 282, pp. 3–14. Springer (2014)
Low-Rank Matrix Recovery with Ky Fan
2-k-Norm

Xuan Vinh Doan1,2(B) and Stephen Vavasis3


1
Operations Group, Warwick Business School, University of Warwick,
Coventry CV4 7AL, UK
Xuan.Doan@wbs.ac.uk
2
The Alan Turing Institute, British Library, 96 Euston Road,
London NW1 2DB, UK
3
Combinatorics and Optimization, University of Waterloo,
200 University Avenue West, Waterloo, ON N2L 3G1, Canada
vavasis@uwaterloo.ca

Abstract. We propose Ky Fan 2-k-norm-based models for the non-


convex low-rank matrix recovery problem. A general difference of convex
algorithm (DCA) is developed to solve these models. Numerical results
show that the proposed models achieve high recoverability rates.

Keywords: Rank minimization · Ky Fan 2-k-norm · Matrix recovery

1 Introduction
Matrix recovery problem concerns the construction of a matrix from incomplete
information of its entries. This problem has a wide range of applications such
as recommendation systems with incomplete information of users’ ratings or
sensor localization problem with partially observed distance matrices (see, e.g.,
[3]). In these applications, the matrix is usually known to be (approximately)
low-rank. Finding these low-rank matrices is theoretically difficult due to their
non-convex properties. Computationally, it is important to study the tractability
of these problems given the large scale of datasets considered in practical appli-
cations. Recht et al. [11] studied the low-rank matrix recovery problem using a
convex relaxation approach which is tractable. More precisely, in order to recover a low-rank matrix $X \in \mathbb{R}^{m\times n}$ which satisfies $\mathcal{A}(X) = b$, where the linear map $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ and $b \in \mathbb{R}^p$, $b \ne 0$, are given, the following convex optimization problem is proposed:
$$\min_X\ \|X\|_* \quad \text{s.t.}\quad \mathcal{A}(X) = b, \qquad (1)$$
where $\|X\|_* = \sum_i \sigma_i(X)$ is the nuclear norm, the sum of all singular values of X. Recht et al. [11] showed the recoverability of this convex approach using some
This work is partially supported by the Alan Turing Fellowship of the first author.

restricted isometry conditions of the linear map A. In general, these restricted


isometry conditions are not satisfied and the proposed convex relaxation can fail
to recover the matrix X.
Low-rank matrices appear to be appropriate representations of data in other
applications such as biclustering of gene expression data. Doan and Vavasis [5]
proposed a convex approach to recover low-rank clusters using dual Ky Fan
2-k-norm instead of the nuclear norm. The Ky Fan 2-k-norm is defined as
$$\|A\|_{k,2} = \left(\sum_{i=1}^{k} \sigma_i^2(A)\right)^{1/2}, \qquad (2)$$
where $\sigma_1 \ge \ldots \ge \sigma_k \ge 0$ are the k largest singular values of A, $k \le k_0 = \mathrm{rank}(A)$. The dual norm of the Ky Fan 2-k-norm is denoted by $\|\cdot\|_{k,2}^{*}$:
$$\|A\|_{k,2}^{*} = \max_X \left\{ \langle A, X\rangle : \|X\|_{k,2} \le 1 \right\}. \qquad (3)$$
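As a quick illustration of definition (2), the Ky Fan 2-k-norm can be computed directly from the singular values; the small NumPy sketch below does this and numerically checks the ordering $\|A\|_{k,2} \le \|A\|_F$ (the dual norm (3) has no such closed form and is computed via a semidefinite program in Sect. 3).

```python
import numpy as np

def ky_fan_2k_norm(A, k):
    """Ky Fan 2-k-norm: l2 norm of the k largest singular values of A."""
    sigma = np.linalg.svd(A, compute_uv=False)      # singular values, descending
    return float(np.sqrt(np.sum(sigma[:k] ** 2)))

# quick numerical check of ||A||_{k,2} <= ||A||_F on a random matrix
A = np.random.default_rng(0).standard_normal((50, 40))
for k in (1, 5, 40):
    assert ky_fan_2k_norm(A, k) <= np.linalg.norm(A, "fro") + 1e-12
```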

These unitarily invariant norms (see, e.g., Bhatia [2]) and their gauge functions
have been used in sparse prediction problems [1], low-rank regression analysis
[6] and multi-task learning regularization [7]. When k = 1, the Ky Fan 2-k-norm is the spectral norm, $\|A\| = \sigma_1(A)$, the largest singular value of A, whose dual norm is the nuclear norm. Similar to the nuclear norm, the dual Ky Fan 2-k-
norm with k > 1 can be used to compute the k-approximation of a matrix A
(Proposition 2.9, [5]), which demonstrates its low-rank property. Motivated by
this low-rank property of the (dual) Ky Fan 2-k-norm, which is more general than
that of the nuclear norm, and its usage in other applications, in this paper, we
propose a Ky Fan 2-k-norm-based non-convex approach for the matrix recovery
problem which aims to recover matrices which are not recoverable by the convex
relaxation formulation (1). In Sect. 2, we discuss the proposed models in detail
and in Sect. 3, we develop numerical algorithms to solve those models. Some
numerical results will also be presented.

2 Ky Fan 2-k-Norm-Based Models


The Ky Fan 2-k-norm is the $\ell_2$-norm of the vector of the k largest singular values, with $k \le \min\{m,n\}$. Thus we have
$$\|A\|_{k,2} = \left(\sum_{i=1}^{k}\sigma_i^2(A)\right)^{1/2} \le \|A\|_F = \left(\sum_{i=1}^{\min\{m,n\}}\sigma_i^2(A)\right)^{1/2},$$
where $\|\cdot\|_F$ is the Frobenius norm. Now consider the dual Ky Fan 2-k-norm; using the definition of the dual norm, we obtain the inequality
$$\|A\|_F^2 = \langle A, A\rangle \le \|A\|_{k,2}\cdot\|A\|_{k,2}^{*}.$$
Thus we have
$$\|A\|_{k,2} \le \|A\|_F \le \|A\|_{k,2}^{*}, \quad k \le \min\{m,n\}. \qquad (4)$$

It is clear that these inequalities become equalities if and only if $\mathrm{rank}(A) \le k$. It shows that, to find a low-rank matrix X that satisfies $\mathcal{A}(X) = b$ with $\mathrm{rank}(X) \le k$, we can solve either the following optimization problem
$$\min_X \left\{ \frac{\|X\|_{k,2}^{*}}{\|X\|_F} : \mathcal{A}(X) = b \right\}, \qquad (5)$$
or
$$\min_X \left\{ \|X\|_{k,2}^{*} - \|X\|_F : \mathcal{A}(X) = b \right\}. \qquad (6)$$
It is straightforward to see that these non-convex optimization problems can be used to recover low-rank matrices, as stated in the following theorem, given the norm inequalities in (4).

Theorem 1. If there exists a matrix X ∈ Rm×n such that rank(X) ≤ k and


A(X) = b, then X is an optimal solution of (5) and (6).

Given the result in Theorem 1, the exact recovery of a low-rank matrix using
(5) or (6) relies on the uniqueness of the low-rank solution of A(X) = b. Recht
et al. [11] generalized the restricted isometry property of vectors introduced by
Candès and Tao [4] to matrices and used it to provide sufficient conditions on the
uniqueness of these solutions.

Definition 1 (Recht et al. [11]). For every integer k with $1 \le k \le \min\{m,n\}$, the k-restricted isometry constant is defined as the smallest number $\delta_k(\mathcal{A})$ such that
$$(1 - \delta_k(\mathcal{A}))\,\|X\|_F \le \|\mathcal{A}(X)\|_2 \le (1 + \delta_k(\mathcal{A}))\,\|X\|_F \qquad (7)$$
holds for all matrices X of rank at most k.

Using Theorem 3.2 in Recht et al. [11], we can obtain the following exact recovery
result for (5) and (6).

Theorem 2. Suppose that δ2k < 1 and there exists a matrix X ∈ Rm×n which
satisfies A(X) = b and rank(X) ≤ k, then X is the unique solution to (5) and
(6), which implies exact recoverability.

The condition in Theorem 2 is indeed better than those obtained for the nuclear norm approach (see, e.g., Theorem 3.3 in Recht et al. [11]). The non-convex optimization problems (5) and (6) use a norm ratio and a norm difference. When k = 1, the norm ratio and difference are computed between the nuclear and Frobenius norms. The idea of using these norm ratios and differences with k = 1 has been used to generate non-convex sparse regularizers in the vector case, i.e., m = 1. Yin et al. [13] investigated the ratio $\ell_1/\ell_2$ while Yin et al. [14] analyzed the difference $\ell_1 - \ell_2$ in compressed sensing. Note that even though optimization formulations based on these norm ratios and differences are non-convex, they are still relaxations of the $\ell_0$-norm minimization problem unless the sparsity level of the optimal solution is s = 1. Our proposed approach is similar to the idea of the truncated difference of the nuclear norm and Frobenius norm discussed in Ma et al. [8]. Given a parameter $t \ge 0$, the truncated difference is defined as
$$\|A\|_{*,t-F} = \sum_{i=t+1}^{\min\{m,n\}} \sigma_i(A) - \left(\sum_{i=t+1}^{\min\{m,n\}} \sigma_i^2(A)\right)^{1/2} \ge 0.$$
For $t \ge k - 1$, the truncated difference minimization problem can be used to recover matrices with rank at most k, given that $\|X\|_{*,t-F} = 0$ if $\mathrm{rank}(X) \le t + 1$. Similar exact recovery results as in Theorem 2 are provided in Theorem 3.7(a) in Ma et al. [8]. Despite the similarity with respect to the recovery results, the problems (5) and (6) are motivated from a different perspective. We now discuss how to solve these problems.

3 Numerical Algorithm
3.1 Difference of Convex Algorithms
We start with the problem (5). It can be reformulated as
$$\max_{Z,z}\ \|Z\|_F^2 \quad \text{s.t.}\quad \|Z\|_{k,2}^{*} \le 1,\quad \mathcal{A}(Z) - z\cdot b = 0,\quad z > 0, \qquad (8)$$
with the change of variables $z = 1/\|X\|_{k,2}^{*}$ and $Z = X/\|X\|_{k,2}^{*}$. A compact formulation is
$$\min_{Z,z}\ \delta_{\mathcal{Z}}(Z,z) - \|Z\|_F^2/2, \qquad (9)$$
where $\mathcal{Z}$ is the feasible set of the problem (8) and $\delta_{\mathcal{Z}}(\cdot)$ is the indicator function of $\mathcal{Z}$. The problem (9) is a difference of convex (DC) optimization problem (see, e.g., [9]). The DCA proposed in [9] can be applied to the problem (9) as follows.

Step 1. Start with $(Z^0, z^0) = (X^0/\|X^0\|_{k,2}^{*},\ 1/\|X^0\|_{k,2}^{*})$ for some $X^0$ such that $\mathcal{A}(X^0) = b$, and set s = 0.
Step 2. Update $(Z^{s+1}, z^{s+1})$ as an optimal solution of the following convex optimization problem
$$\max_{Z,z}\ \langle Z^s, Z\rangle \quad \text{s.t.}\quad \|Z\|_{k,2}^{*} \le 1,\quad \mathcal{A}(Z) - z\cdot b = 0,\quad z > 0. \qquad (10)$$
Step 3. Set $s \leftarrow s + 1$ and repeat Step 2.

Let $X^s = Z^s/z^s$. Using the general convergence analysis of DCA (see, e.g., Theorem 3.7 in [10]), we obtain the following convergence results.

Proposition 1. Given the sequence $\{X^s\}$ obtained from the DCA algorithm for the problem (9), the following statements are true.
(i) The sequence $\left\{\dfrac{\|X^s\|_{k,2}^{*}}{\|X^s\|_F}\right\}$ is non-increasing and convergent.
(ii) $\left\|\dfrac{X^{s+1}}{\|X^{s+1}\|_{k,2}^{*}} - \dfrac{X^s}{\|X^s\|_{k,2}^{*}}\right\|_F \to 0$ when $s \to \infty$.

These convergence results show that the DCA algorithm improves the objective of the ratio minimization problem (5). The DCA algorithm can stop if $(Z^s, z^s) \in \mathcal{O}(Z^s)$, where $\mathcal{O}(Z^s)$ is the set of optimal solutions of (10); a point $(Z^s, z^s)$ which satisfies this condition is called a critical point. Note that (local) optimal solutions of (9) can be shown to be critical points. The following proposition gives an equivalent condition for critical points.

Proposition 2. $(Z^s, z^s) \in \mathcal{O}(Z^s)$ if and only if Y = 0 is an optimal solution of the following optimization problem
$$\min_Y\ \|X^s + Y\|_{k,2}^{*} - \frac{\|X^s\|_{k,2}^{*}}{\|X^s\|_F^2}\,\langle X^s, Y\rangle \quad \text{s.t.}\quad \mathcal{A}(Y) = 0. \qquad (11)$$

Proof. Consider $Y \in \mathrm{Null}(\mathcal{A})$, i.e., $\mathcal{A}(Y) = 0$. We then have
$$\left(\frac{X^s + Y}{\|X^s + Y\|_{k,2}^{*}},\ \frac{1}{\|X^s + Y\|_{k,2}^{*}}\right) \in \mathcal{Z}.$$
Clearly, $\left\langle \dfrac{X^s}{\|X^s\|_{k,2}^{*}},\ \dfrac{X^s + Y}{\|X^s + Y\|_{k,2}^{*}}\right\rangle \le \left\langle \dfrac{X^s}{\|X^s\|_{k,2}^{*}},\ \dfrac{X^s}{\|X^s\|_{k,2}^{*}}\right\rangle$ is equivalent to
$$\|X^s + Y\|_{k,2}^{*} - \frac{\|X^s\|_{k,2}^{*}}{\|X^s\|_F^2}\,\langle X^s, Y\rangle \ge \|X^s\|_{k,2}^{*}.$$
When Y = 0, we achieve equality. We have $(Z^s, z^s) \in \mathcal{O}(Z^s)$ if and only if the above inequality holds for all $Y \in \mathrm{Null}(\mathcal{A})$, which means $f(Y; X^s) \ge f(0; X^s)$ for all $Y \in \mathrm{Null}(\mathcal{A})$, where $f(Y; X) = \|X + Y\|_{k,2}^{*} - \frac{\|X\|_{k,2}^{*}}{\|X\|_F^2}\,\langle X, Y\rangle$. Clearly, this is equivalent to Y = 0 being an optimal solution of (11).

The result of Proposition 2 shows the similarity between the norm ratio minimization problem (5) and the norm difference minimization problem (6) with respect to the implementation of the DCA algorithm. Indeed, the problem (6) is a DC optimization problem and the DCA algorithm can be applied as follows.

Step 1. Start with some $X^0$ such that $\mathcal{A}(X^0) = b$ and set s = 0.
Step 2. Update $X^{s+1} = X^s + Y$, where Y is an optimal solution of the following convex optimization problem
$$\min_Y\ \|X^s + Y\|_{k,2}^{*} - \frac{1}{\|X^s\|_F}\,\langle X^s, Y\rangle \quad \text{s.t.}\quad \mathcal{A}(Y) = 0. \qquad (12)$$
Step 3. Set $s \leftarrow s + 1$ and repeat Step 2.

It is clear that $X^s$ is a critical point for the problem (6) if and only if Y = 0 is an optimal solution of (12). Both problems (10) and (12) can be written in the general form
$$\min_Y\ \|X^s + Y\|_{k,2}^{*} - \alpha(X^s)\,\langle X^s, Y\rangle \quad \text{s.t.}\quad \mathcal{A}(Y) = 0, \qquad (13)$$
where $\alpha(X^s) = \dfrac{\|X^s\|_{k,2}^{*}}{\|X^s\|_F^2}$ for (10) and $\alpha(X^s) = \dfrac{1}{\|X^s\|_F}$ for (12), respectively. Given that $\mathcal{A}(X^s) = b$, this problem can be written as
$$\min_X\ \|X\|_{k,2}^{*} - \alpha(X^s)\,\langle X^s, X\rangle \quad \text{s.t.}\quad \mathcal{A}(X) = b. \qquad (14)$$

The following proposition shows that $X^s$ is a critical point of the problem (14) for many functions $\alpha(\cdot)$ if $\mathrm{rank}(X^s) \le k$.

Proposition 3. If $\mathrm{rank}(X^s) \le k$, then $X^s$ is a critical point of the problem (14) for any function $\alpha(\cdot)$ which satisfies
$$\frac{1}{\|X^s\|_F} \le \alpha(X^s) \le \frac{\|X^s\|_{k,2}^{*}}{\|X^s\|_F^2}. \qquad (15)$$

Proof. If $\mathrm{rank}(X^s) \le k$, we have $\alpha(X^s) = 1/\|X^s\|_{k,2}^{*}$, since $\|X^s\|_{k,2} = \|X^s\|_F = \|X^s\|_{k,2}^{*}$. Given that
$$\partial \|A\|_{k,2}^{*} = \arg\max_{X : \|X\|_{k,2} \le 1} \langle X, A\rangle,$$
we have $\alpha(X^s)\cdot X^s \in \partial\|X^s\|_{k,2}^{*}$. Thus for all Y the following inequality holds:
$$\|X^s + Y\|_{k,2}^{*} - \|X^s\|_{k,2}^{*} \ge \alpha(X^s)\,\langle X^s, Y\rangle.$$
It implies that Y = 0 is an optimal solution of the problem (13), since the optimality condition is
$$\|X^s + Y\|_{k,2}^{*} - \|X^s\|_{k,2}^{*} \ge \alpha(X^s)\,\langle X^s, Y\rangle, \quad \forall\, Y : \mathcal{A}(Y) = 0.$$
Thus $X^s$ is a critical point of the problem (14).



Proposition 3 shows that one can choose different functions $\alpha(\cdot)$, such as $\alpha(X) = 1/\|X\|_{k,2}$, for the sub-problem in the general DCA framework to solve the original problem. This generalized sub-problem (14) is a convex optimization problem, which can be formulated as a semidefinite optimization problem given the following computation of the dual Ky Fan 2-k-norm provided in [5]:
$$\|X\|_{k,2}^{*} = \min\ p + \mathrm{trace}(R) \quad \text{s.t.}\quad kp - \mathrm{trace}(P) = 0,\quad pI - P \succeq 0,\quad \begin{bmatrix} P & -\tfrac{1}{2}X \\ -\tfrac{1}{2}X^{T} & R \end{bmatrix} \succeq 0. \qquad (16)$$
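For readers who want to experiment with (16), the following is a small CVXPY sketch of this semidefinite program for a given matrix X. The modelling choices (auxiliary PSD variables for the two semidefinite constraints, the SCS solver) are illustrative; the paper's experiments use the SDPT3 solver in Matlab.

```python
import cvxpy as cp
import numpy as np

def dual_ky_fan_2k_norm(X, k):
    """Dual Ky Fan 2-k-norm of X via the SDP (16) (illustrative CVXPY model)."""
    m, n = X.shape
    p = cp.Variable(nonneg=True)
    P = cp.Variable((m, m), symmetric=True)
    R = cp.Variable((n, n), symmetric=True)
    W = cp.Variable((m + n, m + n), PSD=True)   # the block matrix in (16)
    S = cp.Variable((m, m), PSD=True)           # slack enforcing pI - P >= 0
    constraints = [
        k * p - cp.trace(P) == 0,
        S == p * np.eye(m) - P,
        W[:m, :m] == P,
        W[:m, m:] == -0.5 * X,
        W[m:, m:] == R,
    ]
    prob = cp.Problem(cp.Minimize(p + cp.trace(R)), constraints)
    prob.solve(solver=cp.SCS)
    return prob.value

# sanity check: for k = 1 the dual norm should be close to the nuclear norm
X = np.random.default_rng(0).standard_normal((8, 6))
print(dual_ky_fan_2k_norm(X, 1), np.linalg.norm(X, "nuc"))
```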

In order to implement the DCA algorithm, one also needs to consider how to find the initial solution $X^0$. We can use the nuclear norm minimization problem (1), the convex relaxation of the rank minimization problem, to find $X^0$. A similar approach is to use the following dual Ky Fan 2-k-norm minimization problem to find $X^0$, given its low-rank properties:
$$\min_X\ \|X\|_{k,2}^{*} \quad \text{s.t.}\quad \mathcal{A}(X) = b. \qquad (17)$$
This initial problem can be considered as an instance of (14) with $X^s = 0$ (and $\alpha(0) = 1$), which is equivalent to starting the iterative algorithm with $X^0 = 0$ one step ahead. We are now ready to provide some numerical results.

3.2 Numerical Results


Similar to Candès and Recht [3], we construct the following experiment. We generate M, an m × n matrix of rank r, by sampling two factors $M_L \in \mathbb{R}^{m\times r}$ and $M_R \in \mathbb{R}^{n\times r}$ with i.i.d. Gaussian entries and setting $M = M_L M_R^{T}$. The linear map $\mathcal{A}$ is constructed with s independent Gaussian matrices $A_i$ whose entries follow $\mathcal{N}(0, 1/s)$, i.e.,
$$\mathcal{A}(X) = b \ \Leftrightarrow\ \langle A_i, X\rangle = \langle A_i, M\rangle = b_i,\quad i = 1,\ldots,s.$$
We generate K = 50 matrices M with m = 50, n = 40, and r = 2. The dimension of these matrices is $d_r = r(m+n-r) = 176$. For each M, we generate s matrices for the random linear map, with s ranging from 180 to 500. We set the maximum number of iterations of the algorithm to $N_{\max} = 100$. The instances are solved using the SDPT3 solver [12] for semidefinite optimization problems in Matlab. The computer used for these numerical experiments is a 64-bit Windows 10 machine with a 3.70 GHz quad-core CPU and 32 GB RAM. The performance measure is the relative error $\|X - M\|_F / \|M\|_F$, and the threshold $\epsilon = 10^{-6}$ is chosen. We run three different algorithms: nuclear uses the nuclear norm optimization formulation (1), k2-nuclear uses the proposed iterative algorithm with the initial solution obtained from (1), and k2-zero uses the same algorithm with initial solution $X^0 = 0$.
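A minimal NumPy sketch of this data-generation procedure is given below (dimensions follow the setting quoted above; the recovery algorithms themselves are not included, and the seed is an arbitrary choice).

```python
import numpy as np

def generate_instance(m=50, n=40, r=2, s=250, seed=0):
    """Random rank-r matrix M and Gaussian linear map A(X) = b with <A_i, M> = b_i."""
    rng = np.random.default_rng(seed)
    M_L = rng.standard_normal((m, r))
    M_R = rng.standard_normal((n, r))
    M = M_L @ M_R.T                                  # rank-r ground truth
    A = rng.normal(0.0, np.sqrt(1.0 / s), size=(s, m, n))
    b = np.einsum("smn,mn->s", A, M)                 # b_i = <A_i, M>
    return M, A, b

M, A, b = generate_instance()
print(np.linalg.matrix_rank(M), b.shape)             # 2 (250,)
```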

Fig. 1. Recovery probabilities and average computation times of three algorithms

Figure 1 shows recovery probabilities and average computation times (in seconds)
for different sizes of the linear map.
The results show that the proposed algorithm can recover exactly the matrix
M with 100% rate when s ≥ 250 with both initial solutions while the nuclear
norm approach cannot recover any matrix at all, i.e., 0% rate, if s ≤ 300.
k2-nuclear is slightly better than k2-zero in terms of recoverability when s is
small while their average computational times are almost the same in all cases.
The efficiency of the proposed algorithm when s is small comes with higher aver-
age computational times as compared to that of the nuclear norm approach. For
example, when s = 180, on average, one needs 80 iterations to reach the solution
when the proposed algorithm is used instead of 1 with the nuclear norm opti-
mization approach. Note that the average number of iterations is computed for
all cases, including cases when the matrix M cannot be recovered. For recoverable
cases, the average number of iterations is much less. For example, when s = 180,
the average number of iterations for recoverable case is 40 instead of 80. When
the size of the linear map increases, the average number of iterations is decreased
significantly. We only need 2 extra iterations when s = 250 or 1 extra iteration

on average when s = 300 to obtain 100% recover rate when the nuclear norm
optimization approach still cannot recover any of the matrices (0% rate). These
results show that the proposed algorithm achieves a significantly better recovery
rate with a small number of extra iterations in many cases. We also test the
algorithms with higher ranks including r = 5 and r = 10. Figure 2 shows the
results when the size of linear map is s = 1.05dr .

Fig. 2. Recovery probabilities and average computation times for different ranks

These results show that when the size of linear maps is small, the proposed
algorithms are significantly better than the nuclear norm optimization approach.
With s = 1.05dr , the recovery probability increases when r increases and it is
close to 1 when r = 10. The computational time increases when r increases given
that the size of the sub-problems depends on the size of the linear map. With
respect to the number of iterations, it remains low. When r = 10, the average
numbers of iterations are 22 and 26 for k2-nuclear and k2-zero, respectively. It
shows that k2-nuclear is slightly better than k2-zero both in terms of recovery
probability and computational time.

4 Conclusion
We have proposed non-convex models based on the dual Ky Fan 2-k-norm for
low-rank matrix recovery and developed a general DCA framework to solve the
models. The computational results are promising. Numerical experiments with
larger instances will be conducted with first-order algorithm development for the
proposed models as a future research direction.

References
1. Argyriou, A., Foygel, R., Srebro, N.: Sparse prediction with the k-support norm.
In: NIPS, pp. 1466–1474 (2012)
2. Bhatia, R.: Matrix Analysis, Graduate Texts in Mathematics, vol. 169. Springer,
New York (1997)
3. Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found.
Comput. Math. 9(6), 717–772 (2009)
4. Candès, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inf. Theory
51(12), 4203–4215 (2005)
5. Doan, X.V., Vavasis, S.: Finding the largest low-rank clusters with Ky Fan 2-k-
norm and 1 -norm. SIAM J. Optim. 26(1), 274–312 (2016)
6. Giraud, C.: Low rank multivariate regression. Electron. J. Stat. 5, 775–799 (2011)
7. Jacob, L., Bach, F., Vert, J.P.: Clustered multi-task learning: a convex formulation.
NIPS 21, 745–752 (2009)
8. Ma, T.H., Lou, Y., Huang, T.Z.: Truncated 1−2 models for sparse recovery and
rank minimization. SIAM J. Imaging Sci. 10(3), 1346–1380 (2017)
9. Pham, D.T., Hoai An, L.T.: Convex analysis approach to dc programming: theory,
algorithms and applications. Acta Mathematica Vietnamica 22(1), 289–355 (1997)
10. Pham, D.T., Hoai An, L.T.: A dc optimization algorithm for solving the trust-
region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
11. Recht, B., Fazel, M., Parrilo, P.: Guaranteed minimum-rank solutions of linear
matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
12. Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3-a MATLAB software package for
semidefinite programming, version 1.3. Optim. Methods Softw. 11(1–4), 545–581
(1999)
13. Yin, P., Esser, E., Xin, J.: Ratio and difference of ℓ1 and ℓ2 norms and sparse
representation with coherent dictionaries. Commun. Inf. Syst. 14(2), 87–109 (2014)
14. Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of ℓ1 − ℓ2 for compressed sensing.
SIAM J. Sci. Comput. 37(1), A536–A563 (2015)
Online DCA for Times Series Forecasting
Using Artificial Neural Network

Viet Anh Nguyen(B) and Hoai An Le Thi

LGIPM, University of Lorraine, Metz, France


{viet-anh.nguyen,hoai-an.le-thi}@univ-lorraine.fr

Abstract. In this work, we study the online time series forecasting prob-
lem using artificial neural network. To solve this problem, different online
DCAs (Difference of Convex functions Algorithms) are investigated. We
also give a comparison with online gradient descent, the online version of one of the most popular optimization algorithms for neural network problems. Numerical experiments on some benchmark time
series datasets validate the efficiency of the proposed methods.

Keywords: Online DCA · DC programming · DCA · Time series


forecasting · Artificial neural network

1 Introduction
Time series analysis and forecasting have an important role and a wide range
of applications such as stock market, weather forecasting, energy demand, fuels
usage, electricity and in any domain with specific seasonal or trendy changes in
time [15]. The information one gets from forecasting time series data can con-
tribute to important decisions of companies or organizations with high priority.
The goal of time series analysis is to extract information of a given time series
data over some period of time. Then, the information is used to construct a
model, which could be used for predicting future values of the considering time
series.
Online learning is a technique in machine learning which is performed in a
sequence of consecutive rounds [14,17]. At each round t, we receive a question
xt and have to give a corresponding prediction pt (xt ). After that, we receive
the true answer yt and suffer the loss between pt (x) and yt . In many real world
situations, we do not know the entire time series beforehand. New data might
arrive sequentially in real time. In those cases, analysis and forecasting of time
series should be put in an online learning context.
Linear models like autoregressive and autoregressive moving average mod-
els are standard tools for time series forecasting problems [2]. However, many
processes in real world are nonlinear. Empirical experience shows that linear
models are not always the best to simulate the underlying dynamics of a time
series. This gives rise to a demand for better nonlinear models. Recently, artifi-
cial neural networks have shown promising results in different applications [9],

Algorithm 1. Online learning scheme


1: for t = 1, 2, ... do
2: Receive question xt ∈ X.
3: Predict p(xt | θ) ∈ Y .
4: Receive the true answer yt ∈ Y and suffer loss L (p(xt | θ), yt ).
5: end for

which comes from the flexibility of those models in approximating functions (see
[4,5]). In terms of time series forecasting using neural networks, there are many
works to mention [1,11].
Although these works demonstrate the effectiveness of neural networks for time series applications, they used smooth activation functions such as sigmoid or tanh. In recent works, the ReLU activation function has been shown to be better than the above smooth activation functions, with good properties in practice [12]. In this work, we propose an online autoregressive model using a neural network with ReLU activation. Unlike other regression works which use the square loss, we choose the ε-insensitive loss function to reduce the impact of outliers. We limit the architecture of the network to one hidden layer to reduce overfitting. Despite not being a deep network, fitting a one-hidden-layer neural network is still a nonconvex, nonsmooth optimization problem. To solve such a problem in the online context, we utilize the tools of online DC (Difference of Convex functions) programming and online DCA (DC Algorithm) (see [7,8,13]).
The contribution of this work is the proposal and comparison of several online
optimization algorithms based on online DCA to solve the time series forecasting
problem using neural network. Numerical experiments on different time series
datasets indicate the effectiveness of the proposed methods.
The structure of this work is organized as follows. In Sect. 2, we present the
online learning scheme, DCA and online DCA schemes with two learning rules.
In Sect. 3, we formulate the autoregressive model with neural network and the
corresponding online optimization problem. Section 4 contains three online DC
algorithms to solve the problem in the previous section. We also consider the
online gradient descent algorithm in this section. Numerical experiments are
presented in Sect. 5 with conclusion in Sect. 6.

2 Online Learning and Online DCA


2.1 Online Learning
In online learning, the online learner task is to answer a sequence of questions
given the knowledge of the previous ones and possibly additional available infor-
mation [14]. Online learning has interesting properties in both theoretical and
practical aspects and is one of the most important domains in machine learning.
The general scheme for online learning is summarized in Algorithm 1.
The process of online learning is performed in a sequence of consecutive
rounds. At round t, the learner is given a question xt , which is taken from an

instance domain X and is required to provide an answer p(xt | θ) in a target


space Y . The learner is determined by its parameter θ in parameter space S.
The learner then receives the true answer yt from the environment and suffers a
loss L (p(xt | θ), yt ), which is a measure for the quality of the learner.
In this work, at each round t, we use the assumption that the loss function
is a DC function.

2.2 DCA and Online DCA

DC Programming and DCA constitute the backbone of smooth/nonsmooth non-


convex programming and global optimization. They address the problem of min-
imizing a function f which is a difference of convex functions on the whole space
Rd . Generally speaking, a DC program takes the form
 
$$\alpha = \inf\left\{ f(x) := g(x) - h(x) : x \in \mathbb{R}^d \right\} \qquad (P_{dc}),$$

where g, h ∈ Γ0 (Rd ), the set contains all lower semicontinuous proper convex
functions on Rd . Such a function f is called a DC function, and g − h, a DC
decomposition of f while g and h are DC components of f . A standard DC
program with a convex constraint C (a non empty closed convex set in Rd ),
which is α = inf {f (x) := g(x) − h(x) : x ∈ C}, can be expressed in the form of
(Pdc ) by adding the indicator function of C to the function g.
The main idea of DCA is quite simple: each iteration k of DCA approxi-
mates the concave part −h by its affine majorization corresponding to taking
the subgradient y ∈ ∂h(xk ) and minimizes the resulting convex function.
Convergence properties of the DCA and its theoretical basis are described
in [7,13]. In the past years, DCA has been successfully applied in several works
of various fields among them machine learning, financial optimization, supply
chain management [8].
In the online learning context, at each round t we receive a DC loss function ft = gt − ht, where gt, ht are functions in Γ0 (S) and S is the parameter set of
the online learner. Then, we approximate the concave part −ht by its affine
majorization corresponding to taking zt ∈ ∂ht (xt ) and minimize the resulting
convex subproblem [3].
The subproblem of online DCA can take two forms. The first form is the follow-the-leader learning rule [14], which means that at round t we minimize the cumulative loss $\sum_{i=1}^{t} f_i$. We can also minimize the current loss $f_t$ instead of the cumulative one. In short, we can write both learning rules in a single formula as $\sum_{i=t_0}^{t} f_i$, where $t_0 = 1$ or $t$. The online DCA scheme is given in Algorithm 2.

3 Autoregressive Neural Network for Online Forecasting


Problem

Time series analysis is engaged in analyzing the underlying dynamics of a col-


lection of successive values recorded in time called time series. The underlying

Algorithm 2. Online DCA

1: Initialization: Let θ1 ∈ S be the best guess, t0 ∈ {1, t}.
2: for t = 1, 2, 3, ..., T do
3:   Receive question xt. Give prediction p(xt | θt).
4:   Receive answer yt and suffer loss $\sum_{i=t_0}^{t} f_i(\theta) = \sum_{i=t_0}^{t} g_i(\theta) - \sum_{i=t_0}^{t} h_i(\theta)$.
5:   Calculate wt ∈ ∂ht(θt).
6:   Calculate $\theta_{t+1} \in \arg\min\left\{ \sum_{i=t_0}^{t} g_i(\theta) - \left\langle \sum_{i=t_0}^{t} w_i, \theta\right\rangle : \theta \in S \right\}$.
7: end for

dynamics could be described as a series of random variables $\{T_t\}_{t\in\mathbb{N}}$. A time series is then a series $\{\tau_t\}_{t\in\mathbb{N}}$ of observed values of those random variables [15]. A common way to simulate a time series is to construct the process as a function of $\ell$ past observed values, which is called autoregressive. When the autoregressive function is linear, we call the model linear autoregressive. In the literature, linear autoregressive models are the simplest and oldest models for time series modeling [16]. Assuming that $\ell$ past values are used for the regression, we denote by AR($\ell$) the linear autoregressive model, which can be written as $\tau_t = \alpha_0 + \alpha_1 \tau_{t-1} + \ldots + \alpha_\ell \tau_{t-\ell}$.
In many cases, linear models are not able to capture the nonlinearities in real world applications [6]. To overcome this, nonlinear models are the alternative. Recently, artificial neural networks have grown quickly as they have the universal approximation property, which means they are able to approximate any nonlinear function on a compact set [5]. We call the AR-integrated version of the neural network the autoregressive neural network (ARNN). In ARNN($\ell$), the linear AR($\ell$) is replaced with a nonlinear composite transformation of $\ell$ past values. In this work, we consider a neural network with one hidden layer, which contains N hidden nodes activated by the ReLU activation function [9]. Then, the formula of the ARNN($\ell$) model can be written as follows:
$$\tau_t = b + \max(0, x_t^T U + a^T)V, \qquad (1)$$
where $U \in \mathbb{R}^{\ell\times N}$, $V \in \mathbb{R}^N$, $a \in \mathbb{R}^N$, $b \in \mathbb{R}$ are the parameters of the network and $x_t = (\tau_{t-1}, \tau_{t-2}, \ldots, \tau_{t-\ell})$ is the vector containing the $\ell$ past values of the series.
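As a concrete illustration of (1), the prediction function can be written in a few lines of NumPy; the dimensions and random values below are placeholders, and the max is taken elementwise, following the convention used in this section.

```python
import numpy as np

def arnn_predict(x, U, V, a, b):
    """ARNN(l) prediction: tau_t = b + max(0, x^T U + a^T) V, max taken elementwise."""
    hidden = np.maximum(0.0, x @ U + a)      # shape (N,): ReLU hidden layer
    return b + hidden @ V

# illustrative dimensions: l = 4 past values, N = 8 hidden nodes
rng = np.random.default_rng(0)
l, N = 4, 8
theta = (rng.standard_normal((l, N)), rng.standard_normal(N),
         rng.standard_normal(N), 0.0)        # (U, V, a, b)
x_t = rng.standard_normal(l)
print(arnn_predict(x_t, *theta))
```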
Now we consider the online time series forecasting problem using ARNN($\ell$) to predict the current value of the series $\{\tau_n\}_{n\in\mathbb{N}}$. The features and labels of our dataset are created as $x_t = (\tau_{t-1}, \tau_{t-2}, \ldots, \tau_{t-\ell})$ and $y_t = \tau_t$. We choose the prediction function of ARNN($\ell$) as in (1), which is $p(x_t \mid \theta) = \max(0, x_t^T U + a^T)V + b$, where $\theta = (U, V, a, b)$. We note that the function max in this work is applied elementwise when its input is a vector, which means that if $\beta = (\beta_1, \beta_2, \ldots, \beta_d)$ is a vector in $\mathbb{R}^d$, then
$$\max(0, \beta) = \max(0, (\beta_1, \ldots, \beta_d)) = (\max(0, \beta_1), \ldots, \max(0, \beta_d)).$$

To estimate the error between the predicted and true data, we use the ε-insensitive loss: $L(p(x_t \mid \theta), y_t) = \max(0, |p(x_t \mid \theta) - y_t| - \varepsilon)$, where $\varepsilon$ is a positive number. From now on, we use the notation $f_t$ to denote the objective function corresponding to a single data point $(x_t, y_t)$, which also means $f_t(\theta) = L(p(x_t \mid \theta), y_t)$.
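A one-line Python version of this loss, reusing the hypothetical arnn_predict helper sketched above, could be as follows (the value of eps is purely illustrative).

```python
def eps_insensitive_loss(pred, y, eps=0.1):
    """epsilon-insensitive loss: zero inside the tube |pred - y| <= eps."""
    return max(0.0, abs(pred - y) - eps)

# f_t(theta) for a single data point (x_t, y_t), with theta = (U, V, a, b):
# loss_t = eps_insensitive_loss(arnn_predict(x_t, *theta), y_t)
```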
In the online setting, we have two learning rules: either minimizing the loss $f_t$ or the cumulative loss $\sum_{i=1}^{t} f_i$ at round t. Both choices can be written in the single form
$$\min_{\theta\in S} \sum_{i=t_0}^{t} f_i(\theta) = \sum_{i=t_0}^{t} \max\left(0, \left|\max\left(0, x_i^T U + a^T\right)V + b - y_i\right| - \varepsilon\right), \qquad (2)$$
where $t_0$ is either 1 or t.
In summary, we follow the online learning scheme of Algorithm 1: receive the question $x_t$, predict $p(x_t \mid \theta)$, suffer the loss $f_t(\theta) = L(p(x_t \mid \theta), y_t)$ and minimize the problem (2). This process is repeated for $t = 1, 2, \ldots, T$ in real time.

4 Solving the Problem (2) by Online DCA


In this section, we use online DCA to solve problem (2) in the following steps. First, we find a DC decomposition for the objective function $f_t$ corresponding to the time stamp t. Using that decomposition, we solve (2) with $t_0 = t$ in Sect. 4.2, where two versions of online DCA are studied as well as the online gradient descent algorithm. We solve (2) for the case $t_0 = 1$ in Sect. 4.3, which results in another online DC algorithm.

4.1 DC Decomposition and Subproblems of Online DCA


DC Decomposition. We will find a DC decomposition for the loss function $f_i(\theta) = L(p(x_i \mid \theta), y_i) = \max(0, |p(x_i \mid \theta) - y_i| - \varepsilon)$. The loss function is the composition of the max, absolute value and p functions. Let $\phi$ be a DC function with DC decomposition $\phi = \phi_1 - \phi_2$. We have $\max(0,\phi) = \max(\phi_1, \phi_2) - \phi_2$ and $|\phi| = 2\max(\phi_1, \phi_2) - (\phi_1 + \phi_2)$. Assume further that p has a DC decomposition $p = q - r$ and apply the above formulas to $f_i$; we obtain:
$$f_i(\theta) = \max\left(0, |p(x_i\mid\theta) - y_i| - \varepsilon\right) = \max\Big(2\max\big(q(x_i\mid\theta),\, r(x_i\mid\theta) + y_i\big),\ q(x_i\mid\theta) + r(x_i\mid\theta) + y_i + \varepsilon\Big) - \big(q(x_i\mid\theta) + r(x_i\mid\theta) + y_i + \varepsilon\big) =: g_i - h_i. \qquad (3)$$

Thus, the decomposition $f_i = g_i - h_i$ is determined once we have q and r. In order to find these functions, we first consider the prediction function
$$p(x_t \mid \theta) = \max(0, x_t^T U + a^T)V + b = \max(0, x_t^T U + a^T)V^+ - \max(0, x_t^T U + a^T)V^- + b, \qquad (4)$$
where $V^+ = \max(0, V)$ and $V^- = -\min(0, V)$. We observe that the first two terms of the above equation are products of two nonnegative convex functions. For two arbitrary nonnegative convex functions u and v, we have a DC decomposition of their product as follows: $uv = \frac{(u+v)^2}{2} - \frac{u^2 + v^2}{2}$. Applying this formula to (4), we obtain $p(x_t\mid\theta) = q(x_t\mid\theta) - r(x_t\mid\theta)$, where
$$q(x_t\mid\theta) = b + \frac{1}{2}\sum_{j=1}^{N}\left[\big(\max(0, x_t^T U_j + a_j) + V_j^+\big)^2 + \big(V_j^-\big)^2\right],$$
$$r(x_t\mid\theta) = \frac{1}{2}\sum_{j=1}^{N}\left[\big(\max(0, x_t^T U_j + a_j) + V_j^-\big)^2 + \big(V_j^+\big)^2\right]. \qquad (5)$$
Hence, substituting (5) into (3) gives us the DC decomposition of $f_i$.
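To make decomposition (5) tangible, the sketch below computes q and r for the one-hidden-layer network and verifies numerically that q − r reproduces the prediction p(x | θ). The helper name is hypothetical, and the dimensions mirror the toy example used with the arnn_predict sketch above.

```python
import numpy as np

def dc_components(x, U, V, a, b):
    """DC components q, r of the prediction p(x|theta) as in (5)."""
    m = np.maximum(0.0, x @ U + a)              # ReLU outputs, one per hidden node
    V_plus, V_minus = np.maximum(0.0, V), np.maximum(0.0, -V)
    q = b + 0.5 * np.sum((m + V_plus) ** 2 + V_minus ** 2)
    r = 0.5 * np.sum((m + V_minus) ** 2 + V_plus ** 2)
    return q, r

# numerical check that q - r equals the prediction max(0, x^T U + a) V + b
rng = np.random.default_rng(1)
l, N = 4, 8
U, V, a, b = (rng.standard_normal((l, N)), rng.standard_normal(N),
              rng.standard_normal(N), 0.3)
x = rng.standard_normal(l)
q, r = dc_components(x, U, V, a, b)
p = np.maximum(0.0, x @ U + a) @ V + b
assert abs((q - r) - p) < 1e-10
```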

Subproblems Formulation. According to the online DCA scheme (Algorithm 2), at round t we receive the loss function $\sum_{i=t_0}^{t} f_i(\theta) = \sum_{i=t_0}^{t} g_i(\theta) - \sum_{i=t_0}^{t} h_i(\theta)$, with $g_i$ and $h_i$ being the DC components of $f_i$ found above. Then, we update the parameters $\theta_{t+1}$ of the prediction function by solving the following subproblem:
$$\min_{\theta\in S}\ \sum_{i=t_0}^{t} g_i(\theta) - \left\langle \sum_{i=t_0}^{t} z_i, \theta\right\rangle, \qquad (6)$$
where $z_i \in \partial h_i(\theta_t)$. There are two cases for this subproblem: $t_0 = t$ or $t_0 = 1$. In the following sections, we consider each case in detail.

4.2 Learning Rule: $\min\{f_t(\theta) : \theta \in S\}$ (Case $t_0 = t$)

DCA-1a and DCA-1b. The subproblem (6) becomes $\min_{s\in S}\ g_t(s) - \langle z_t, s\rangle$. We can use the projected subgradient method to solve this problem as follows. First, we choose $\theta_t$ as the initial point, i.e., $s^1 = \theta_t$. For clarity, we use the superscript as the iteration index of the subgradient method. Let $w_t^k \in \partial g_t(s^k)$ be a subgradient of $g_t$ at $s^k$ and $\alpha^k$ be the step size at iteration k; the update formula is
$$s^{k+1} = \mathrm{Proj}_S\left(s^k - \alpha^k\left(w_t^k - z_t\right)\right). \qquad (7)$$
Although the above formula is an explicit update, we have to repeat it until convergence of $\{s^k\}$ for each round t in the online context. This nested loop makes the convergence slower. A natural approach for this issue is to apply (7) for only one iteration per round, which results in Algorithm 3 (DCA-1a).
Solving the subproblem for only one iteration reduces the heavy computation, but possibly leads to poor-quality solutions, since the updated parameter $\theta_{t+1}$, which is also the solution of the subproblem at round t, is not optimal. To balance quality and time, we propose a combined strategy as follows. In each of the first $T_0$ online rounds, we solve the subproblem with K iterations. Then, from round $T_0 + 1$ to the last round T, we solve the subproblem with only one iteration, as in DCA-1a. This strategy improves the quality of solutions

Algorithm 3. DCA-1a
1: Initialization: θ1 ∈ S.
2: for t = 1, 2, 3, ..., T do
3: Receive question xt . Give prediction p(xt | θt ).
4: Receive answer yt and suffer loss ft (θ) = gt (θ) − ht (θ).
5: Choose step size αt > 0.
6: Calculate wt ∈ ∂gt (θt ) and zt ∈ ∂ht (θt ).
7: Calculate θt+1 = ProjS (θt − αt (wt − zt )).
8: end for

Algorithm 4. DCA-1b
1: Initialization: θ1 ∈ S and T0 , K ∈ N.
2: for t = 1, 2, 3, ..., T do
3: Receive question xt . Give prediction p(xt | θt ).
4: Receive answer yt and suffer loss ft (θ) = gt (θ) − ht (θ).
5: Calculate zt ∈ ∂ht (xt ).
6: if t ≤ T0 then
7: Solve minθ∈S {gt (θ) − zt , θ} by subgradient method with K iterations.
8: else
9: Solve minθ∈S {gt (θ) − zt , θ} by subgradient method with 1 iteration.
10: end if
11: end for

in the first $T_0$ rounds and therefore leads to faster convergence. On the other hand, the computational time is kept within an acceptable threshold as we adjust $T_0$ and K. From the viewpoint of the regret bound, one can see that $\sum_{i=1}^{T_0} f_i$ is bounded, so it does not affect the sublinearity of the regret bound. The bound only depends on DCA-1a, which is applied for the latter $T - T_0$ rounds. This combined strategy is described in Algorithm 4 (DCA-1b).

Online Gradient Descent (OGD). The update formula of OGD at round t with step size $\alpha_t$ can be written as $x_{t+1} = \mathrm{Proj}_S(x_t - \alpha_t \nabla f_t(x_t))$. From this formula, we can see that in order to use OGD, one must have the gradient of the objective function $f_t$. ReLU, which is our activation function in the neural network, is not differentiable at 0. This means there exists a subset $S' \subset S$ such that $f_t$ is not differentiable at any point $\theta$ in $S'$. Although networks with ReLU activation do not satisfy the theoretical differentiability property, gradient-based methods are still widely used in practical deep learning problems [9]. In implementation, one chooses the derivative value of ReLU at 0 as 0 or 1. The convergence analysis of such implementations is discussed in [10].
Now recall that we have decomposed $f_t$ into the DC components $g_t - h_t$. Let $w_t \in \partial g_t(\theta_t)$ and $z_t \in \partial h_t(\theta_t)$ be subgradients of $g_t$ and $h_t$ at $\theta_t$. If we choose the subgradient of ReLU at 0 with the same value as in the case of OGD above, then $w_t - z_t$ equals the gradient of $f_t$ at $\theta_t$. In this

Algorithm 5. DCA-2
1: Initialization: θ1 ∈ S.
2: for t = 1, 2, 3, ..., T do
3:   Receive question xt. Give prediction p(xt | θt).
4:   Receive answer yt and suffer loss $\sum_{i=1}^{t} f_i(\theta) = \sum_{i=1}^{t} g_i(\theta) - \sum_{i=1}^{t} h_i(\theta)$.
5:   Choose step size λt > 0.
6:   Calculate wt ∈ ∂gt(θt) and zt ∈ ∂ht(θt).
7:   Calculate $\theta_{t+1} = \mathrm{Proj}_S\left(\theta_t - \lambda_t \sum_{i=1}^{t}(w_i - z_i)\right)$.
8: end for

case, the update formulas of DCA-1a and OGD are exactly the same. Therefore, in the numerical experiments, we consider DCA-1a and OGD as one algorithm. More details about OGD for convex loss functions can be found in [14].

4.3 Learning Rule: $\min\left\{\sum_{i=1}^{t} f_i(\theta) : \theta \in S\right\}$ (Case $t_0 = 1$)

In this case, the subproblem (6) becomes $\min_{s\in S}\ \sum_{i=1}^{t} g_i(s) - \left\langle \sum_{i=1}^{t} z_i, s\right\rangle$, where $z_i \in \partial h_i(\theta_i)$. If we use the subgradient method, we first initialize $s^1 = \theta_t$ and let $\alpha^k$ be the step size at iteration k. The update formula is
$$s^{k+1} = s^k - \alpha^k\left(\sum_{i=1}^{t} w_i^k - \sum_{i=1}^{t} z_i\right), \quad \text{where } w_i^k \in \partial g_i(s^k).$$
At each iteration k, we have to compute $w_i^k$ for all i in $\{1, 2, \ldots, t\}$. This makes the computation heavy, even with only one iteration of the subgradient method. Another approach is to replace each $g_i$ with its piecewise linear approximation. The linear approximation of $g_i$ at $s^0$ has the form $\phi_i^0(s) = g_i(s^0) + \langle w_i^0, s - s^0\rangle$ for all s in S, where $w_i^0 \in \partial g_i(s^0)$. Assume we have a finite set $\{s^1, s^2, \ldots, s^n\} \subset S$. Then $g_i$ can be approximated by the piecewise linear function $g_i(s) \approx \max\{\phi_i^j(s) : j \in \{1, \ldots, n\}\}$. Put $G_i(s) = \max\{\phi_i^j(s) : j \in \{1, \ldots, n\}\}$; the subproblem then becomes
$$\min_{s\in S}\ \sum_{i=1}^{t} G_i(s) - \left\langle \sum_{i=1}^{t} z_i, s\right\rangle = \min_{s\in S}\ \sum_{i=1}^{t} \max\left\{\left\langle w_i^j - z_i,\, s\right\rangle : j \in \{1, \ldots, n\}\right\}.$$
For simplicity, we just linearize each $g_i$ at the single point $s^i = \theta_i$, and the above problem becomes $\min_{s\in S}\ \sum_{i=1}^{t} \langle w_i - z_i, s\rangle$, where $w_i \in \partial g_i(\theta_i)$. We solve this subproblem using the proximal point method with step size $\lambda_t$, whose update rule has the form
$$\theta_{t+1} \in \arg\min_{s\in S}\left\{\sum_{i=1}^{t}\langle w_i - z_i, s\rangle + \frac{\|s - \theta_t\|_2^2}{2\lambda_t}\right\} = \arg\min_{s\in S}\left\{\left\|s - \theta_t + \lambda_t\sum_{i=1}^{t}(w_i - z_i)\right\|_2^2\right\}.$$
So we obtain $\theta_{t+1} = \mathrm{Proj}_S\left(\theta_t - \lambda_t\sum_{i=1}^{t}(w_i - z_i)\right)$. With this update rule, we have Algorithm 5 (DCA-2).

5 Numerical Experiments
We conduct experiments with the three algorithms on five time series datasets taken from the UCI machine learning repository1. The experimental procedure is as follows. We perform the preprocessing by transforming the time series $\{\tau_t\}$ into a new dataset in which the feature vector $x_t$ has the form $x_t = (\tau_{t-4}, \tau_{t-3}, \tau_{t-2}, \tau_{t-1})$, which we call a window of length 4. For each window, we take the current t-th value of the time series as the label, i.e., $y_t = \tau_t$. In short, the online model uses 4 past values to predict the upcoming value. We choose $T_0 = 10$ and $K = 100$ for DCA-1b. The mean square error (MSE) is used as the quality measure.
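A minimal sketch of this windowing preprocessing (window length 4, as above) could look as follows; loading the actual UCI datasets is omitted and the toy series is illustrative.

```python
import numpy as np

def make_windows(series, window=4):
    """Turn a 1-D time series into (x_t, y_t) pairs with x_t = the 4 previous values."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[t - window:t] for t in range(window, len(series))])
    y = series[window:]
    return X, y

# example on a toy series
X, y = make_windows([1, 2, 3, 4, 5, 6, 7])
print(X[0], y[0])    # [1. 2. 3. 4.] 5.0
```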
Results are reported in Table 1.

Table 1. Comparative results on time series datasets. L denotes the length of the time
series. Bold values correspond to best results for each dataset. We have chosen the
subgradients such that DCA-1a and OGD have the same update formulas.

Dataset                    | MSE                        | Time (s)
                           | DCA-1a  DCA-1b  DCA-2      | DCA-1a  DCA-1b  DCA-2
EU stock (L = 536)         | 2.957   1.518   1.679      | 0.015   0.025   0.037
NIKKEI stock (L = 536)     | 2.382   1.691   1.782      | 0.015   0.025   0.039
Appliances (L = 19735)     | 0.075   0.071   0.058      | 2.331   3.962   6.314
Temperature (L = 19735)    | 11.701  11.661  11.521     | 0.851   0.884   3.400
Pressure (L = 20000)       | 0.388   0.307   0.269      | 0.796   1.359   2.922

Comment. In terms of MSE, DCA-2 is better than DCA-1a (or OGD) on all datasets. This can be explained by the fact that DCA-2 minimizes the cumulative loss, which gives more information to the online learner. DCA-1b outperforms DCA-2 on the datasets with more instances (Appliances, Temperature and Pressure).
Regarding computational time, DCA-1a (or OGD) is the best due to its lightweight update formula. DCA-1b is slow since it has to loop over K iterations of the subgradient method for the first T_0 rounds.
In summary, DCA-2 performs well in both quality and time. Although DCA-1b has good MSEs on long time series, its computational time is large compared to the two other algorithms. DCA-1a (or OGD) is the fastest but has the lowest quality.

6 Conclusion
This work presents an approach to online time series forecasting using neural networks. The resulting optimization problem of a neural network with ReLU activation is nonconvex and nonsmooth. To handle it, we have proposed several online DCAs. The effectiveness of these algorithms is shown in the experiments. In future work, we plan to study more DC decompositions and optimization strategies for improving the results. In addition, using deeper neural networks in time series forecasting is an interesting problem that is worth further investigation.

References
1. Anders, U., Korn, O., Schmitt, C.: Improving the pricing of options: a neural
network approach. J. Forecast. 17(5–6), 369–388 (1998)
2. Box, G.E., Jenkins, G.M., Reinsel, G.C., Ljung, G.M.: Time series analysis: fore-
casting and control. Wiley (2015)
3. Ho, V.T., Le Thi, H.A., Bui Dinh, C.: Online DC optimization for online binary
linear classification. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.P. (eds.)
Intelligent Information and Database Systems, pp. 661–670. Springer, Berlin (2016)
4. Hornik, K.: Some new results on neural network approximation. Neural Netw. 6(8),
1069–1072 (1993)
5. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are uni-
versal approximators. Neural Netw. 2(5), 359–366 (1989)
6. Kantz, H., Schreiber, T.: Nonlinear Time Series Analysis, vol. 7. Cambridge Uni-
versity Press (2004)
7. Le Thi, H.A., Pham Dinh, T.: The DC (Difference of Convex functions) program-
ming and DCA revisited with DC models of real world nonconvex optimization
problems. Ann. Oper. Res. 133(1), 23–46 (2005)
8. Le Thi, H.A., Pham Dinh, T.: DC programming and DCA: thirty years of devel-
opments. Math. Program. 169(1), 5–68 (2018)
9. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444
(2015)
10. Li, Y., Yuan, Y.: Convergence analysis of two-layer neural networks with ReLU
activation. In: Advances in Neural Information Processing Systems, pp. 597–607
(2017)
11. Medeiros, M.C., Teräsvirta, T., Rech, G.: Building neural network models for time
series: a statistical approach. J. Forecast. 25(1), 49–75 (2006)
12. Pan, X., Srikumar, V.: Expressiveness of rectifier networks. In: Proceedings of the
33rd International Conference on International Conference on Machine Learning.
ICML’16, vol. 48, pp. 2427–2435. JMLR.org (2016)
13. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to d.c. programming:
theory, algorithm and applications. Acta Math. Vietnam. 22(01) (1997)
14. Shalev-Shwartz, S., Singer, Y.: Online learning: Theory, Algorithms, and Applica-
tions (2007)
15. Shumway, R.H., Stoffer, D.S.: Time Series Analysis and its Applications (Springer
Texts in Statistics). Springer, Berlin (2005)
16. Yule, G.U.: On a method of investigating periodicities in disturbed series, with
special reference to Wolfer’s sunspot numbers. In: Philosophical Transactions of
the Royal Society of London. Series A, Containing Papers of a Mathematical or
Physical Character, vol. 226, pp. 267–298 (1927)
17. Zinkevich, M.: Online convex programming and generalized infinitesimal gradient
ascent. In: Proceedings of the 20th International Conference on Machine Learning
(ICML-03), pp. 928–936 (2003)
Parallel DC Cutting Plane Algorithms
for Mixed Binary Linear Program

Yi-Shuai Niu1,2(B) , Yu You1 , and Wen-Zhuo Liu3


1 School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
{niuyishuai,youyu0828}@sjtu.edu.cn
2 SJTU-Paristech Elite Institute of Technology, Shanghai Jiao Tong University, Shanghai, China
3 Allée des techniques avancées, Ensta Paristech, 91120 Palaiseau, France

Abstract. In this paper, we propose a new approach based on DC (Dif-


ference of Convex) programming, DC cutting plane and DCA (DC Algo-
rithm) for globally solving mixed binary linear program (MBLP). Using
exact penalty technique, we can reformulate MBLP as a standard DC
program which can be solved by DCA. We establish the DC cutting plane
(DC cut) to eliminate local optimal solutions of MBLP provided by DCA.
Combining DC cut with classical cutting planes such as lift-and-project
and Gomory’s cut, we establish a DC cutting plane algorithm (DC-CUT
algorithm) for globally solving MBLP. A parallel DC-CUT algorithm is also developed to exploit the power of multiple CPUs/GPUs for better computational performance. Preliminary numerical results show the efficiency of our methods.

Keywords: DC programming · DCA · Mixed binary linear program · DC cut · Parallel DC-CUT algorithm

1 Introduction

Considering the mixed binary linear program (MBLP):

    min  f(x, y) := c⊤x + d⊤y
    s.t. Ax + By ≥ b                                          (P)
         (x, y) ∈ {0, 1}^n × R^q_+,

where vectors c ∈ Rn , d ∈ Rq , b ∈ Rm , and matrices A ∈ Rm×n , B ∈ Rm×q .

The research is funded by the National Natural Science Foundation of China (Grant
No: 11601327) and by the Key Construction National “985” Program of China (Grant
No: WF220426001).

MBLP is in general NP-hard; it is well known as one of Karp's 21 NP-complete problems [21]. Over the past several decades, a variety of optimiza-
tion techniques have been proposed for MBLP. There are generally two kinds
of approaches: Exact algorithms and Heuristic methods. The exact algorithms
include cutting-plane method, branch-and-bound, and column generation etc.
These methods intend to find the approximate global optimal solution, but due
to NP-hardness, they are often inefficient in practice especially for large-scale
cases. So heuristic methods are proposed instead, such as tabu search, simulated annealing, ant colony, Hopfield neural networks, etc. However, it is usually impossible to certify the global optimality of the solutions returned by heuristic methods.
Finding a global optimal solution for MBLP is very expensive in computation.
In practice, we are often interested in efficient local optimization approaches.
Among these methods, DCA is a promising one to provide good quality local
(often global) optimal solution in various practical applications (see e.g., [5,9,18,
19]). Applying DCA to general mixed-integer linear optimization is firstly studied
in [13] (where the integer variables are not only binaries), and extended for
solving mixed-integer nonlinear programming [14,15] within various applications
including scheduling [6], network optimization [22], and finance [7,20] etc. This
algorithm is based on continuous representation techniques for integer set, exact
penalty theorem, DCA and Branch-and-Bound (BB) algorithms. Recently, the
author developed a parallel BB framework (called PDCABB) in order to use the
power of multiple CPU/GPU for improving DCABB [17].
Besides the combination of DCA and BB, another global approach, called
DCA-CUT, is a cutting plane method based on constructing cutting planes
(called DC cuts) from the solutions of DCA. This kind of method is firstly
studied in [11] and applied to several real-world problems as the bin-parking
problem [10] and the scheduling problem [12]. However, DCA-CUT algorithm
is not well-constructed since there exist some cases where DC cut is uncon-
structible. Due to this drawback, Niu proposed in [16] a hybrid approach to
combine the constructible DC cut in DCABB for improving the lower bounds.
In our paper, we will revisit DC cutting plane technique and discuss about
the constructible DC cuts. For unconstructible cases, we propose to combine the
classical global cuts such as lift-and-project cut and Gomory’s cut to establish
a hybrid cutting plane algorithm called DC-CUT algorithm to globally solve
MBLP. A parallel version of our DC-CUT algorithm is also developed for better
performance. Moreover, variant algorithms with more cutting planes constructed
in each iteration are proposed.
The paper is organized as follows: Section 2 presents DC programming for-
mulation and DCA for MBLP. In Sect. 3, we introduce the DC cutting planes.
DC-CUT algorithms (with and without parallelism) and their variants are pro-
posed in the next section. The numerical experimental results are reported in
Sect. 5. Some conclusions and perspectives are discussed in the last section.

2 DC Programming Formulation and DCA for MBLP


Let S be the feasible set of (P), and let y be upper bounded by ȳ. Let K be the linear relaxation of S defined by K = {(x, y) : Ax + By ≥ b, (x, y) ∈ [0, 1]^n × R^q_+}.
The linear relaxation of (P) denoted by R(P ) is defined as:

min{f (x, y) : (x, y) ∈ K},

whose optimal value denoted by l(P ) is a lower bound of (P).


The continuous representation technique for the integer set {0, 1}^n consists of finding a continuous DC function¹ p : R^n → R such that

    {0, 1}^n ≡ {x : p(x) ≤ 0}.

For the integer set {0, 1}^n, we often use the piecewise linear function

    p(x) = Σ_{i=1}^{n} min{x_i, 1 − x_i},

then S = K ∩ ({x ∈ R^n : p(x) ≤ 0} × [0, ȳ]).
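To see why this representation works, note that each term min{x_i, 1 − x_i} is nonnegative on [0, 1] and vanishes exactly at 0 or 1. A tiny illustrative check (not part of the authors' code):

```python
def p(x):
    # Piecewise linear penalty: nonnegative on [0,1]^n, zero exactly on {0,1}^n.
    return sum(min(xi, 1.0 - xi) for xi in x)

print(p([0.0, 1.0, 1.0]))   # 0.0 -> binary point, p(x) <= 0 holds
print(p([0.5, 1.0, 0.3]))   # 0.8 -> fractional point, p(x) > 0
```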
Based on the exact penalty theorem [4,8], if K is nonempty, then there exists a finite number t_0 ≥ 0 such that for all t > t_0, the problem (P) is equivalent to:

    min  τ_t(x, y) := f(x, y) + t p(x)
    s.t. (x, y) ∈ K.                                          (P_t)

A DC decomposition of τ_t(x, y) is given by:

    τ_t(x, y) = g(x, y) − h(x, y),   with g(x, y) = 0 and h(x, y) = −f(x, y) − t p(x).

A subgradient (v, w) ∈ ∂h(x, y) can be chosen as v = −c + u and w = −d with

    u_i = t    if x_i ≥ 1/2,
    u_i = −t   otherwise,                                      (1)

for all i ∈ {1, · · · , n}.


DCA for solving the problem (P_t) is described in Algorithm 1.
In view of the polyhedral convexity of h, the problem (P_t) is a polyhedral DC program. According to the convergence theorem of DCA for polyhedral DC programs [5,18], it follows that:
(1) Algorithm 1 generates a sequence {(x^k, y^k)} ⊆ V(K)² which converges to a KKT point (x^*, y^*) of (P_t) after finitely many iterations.
(2) The sequence {f(x^k, y^k)}_{k=1}^{∞} is monotonically decreasing to f(x^*, y^*).
(3) If x_i^* ≠ 1/2 for all i ∈ {1, · · · , n}, then (x^*, y^*) is a local minimizer of (P_t).
1
A function f : Rn → R is called DC if there exist two convex functions g and h
(called DC components) such that f = g − h.
2
V (K) denotes the vertex set of the polyhedron K.

Algorithm 1. DCA for (P_t)
Input: initial point (x^0, y^0) ∈ R^n × R^q_+; large enough penalty parameter t > 0; tolerances ε_1, ε_2 > 0.
Output: solution (x^*, y^*) and value f^*.
1  Initialization: set k = 0.
2  Step 1: compute (v^k, w^k) ∈ ∂h(x^k, y^k) via (1);
3  Step 2: solve the linear program by the simplex algorithm to find a vertex solution
       (x^{k+1}, y^{k+1}) ∈ arg min{ −⟨(v^k, w^k), (x, y)⟩ : (x, y) ∈ K };
4  Step 3: stopping check:
5  if ‖(x^{k+1}, y^{k+1}) − (x^k, y^k)‖ ≤ ε_1 or |τ_t(x^{k+1}, y^{k+1}) − τ_t(x^k, y^k)| ≤ ε_2 then
6      (x^*, y^*) ← (x^{k+1}, y^{k+1}); f^* ← τ_t(x^{k+1}, y^{k+1}); return;
7  else
8      k ← k + 1; go to Step 1.
9  end
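The linear subproblem in Step 2 can be solved with any LP solver. Below is a minimal sketch of one DCA step using scipy.optimize.linprog; it is illustrative only (the authors' experiments use MATLAB with Gurobi). It assumes the polytope K has been encoded as A_ub·u ≤ b_ub with A_ub = −[A|B], b_ub = −b, and bounds encoding [0,1]^n × [0, ȳ]^q; the function and argument names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def dca_step(x, y, c, d, t, A_ub, b_ub, bounds):
    # Subgradient (v, w) of h(x, y) = -f(x, y) - t*p(x), as in (1).
    u = np.where(x >= 0.5, t, -t)
    v, w = -c + u, -d
    # Step 2: minimize -<(v, w), (x, y)> over the polytope K (encoded in <= form).
    res = linprog(c=np.concatenate([-v, -w]), A_ub=A_ub, b_ub=b_ub,
                  bounds=bounds, method="highs")
    z = res.x
    return z[:len(x)], z[len(x):]          # (x^{k+1}, y^{k+1})
```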

3 DC Cutting Planes
3.1 Valid Inequalities Based on DC Programming

Let us denote u^* = (x^*, y^*) ∈ K, I = {1, · · · , n}, and define the index sets:

    J_0(u^*) = { j ∈ I : x_j^* ≤ 1/2 };   J_1(u^*) = I \ J_0(u^*).

The affine function l_{u^*} is defined by:

    l_{u^*}(u) = Σ_{j∈J_0(u^*)} x_j + Σ_{j∈J_1(u^*)} (1 − x_j).

To simplify the notation, we identify l_{u^*}(x) with l_{u^*}(u) and p(x) with p(u).

Lemma 1. (see [12]) ∀u∗ ∈ K, we have:


(i) lu∗ (u) ≥ p(u) ≥ 0, ∀u ∈ K.
(ii) lu∗ (u∗ ) = p(u∗ ), and in particular, if u∗ ∈ S, then lu∗ (u∗ ) = p(u∗ ) = 0.

Theorem 1. (see [11, 12]) There exists a finite number t1 ≥ 0 such that for all
t > t1 , if u∗ ∈ V (K) \ S is a local minimizer of (P t ), then

lu∗ (u) ≥ lu∗ (u∗ ), ∀u ∈ K.

3.2 DC Cut from Infeasible Solution

Let u^* be an infeasible solution of (P_t) obtained by DCA (i.e., u^* ∉ S). We have the following two cases:

(1) All components of x^* are different from 1/2 and p(u^*) ∉ Z.
(2) Otherwise.

Case 1: When case 1 occurs, a cutting plane cutting off u^* from K is constructed as in Theorem 2.

Theorem 2. Let u^* ∉ S be a local minimizer of (P_t) and p(u^*) ∉ Z. Then the following inequality provides a cutting plane separating u^* from S:

    l_{u^*}(u) ≥ ⌈ l_{u^*}(u^*) ⌉.                            (2)

Proof. First, it follows from Lemma 1 that l_{u^*}(u^*) = p(u^*); then p(u^*) ∉ Z implies l_{u^*}(u^*) < ⌈ l_{u^*}(u^*) ⌉. Thus, u^* does not satisfy inequality (2). Second, when t is sufficiently large, it follows from Theorem 1 that l_{u^*}(u) ≥ l_{u^*}(u^*) for all u ∈ S; since l_{u^*}(u) ∈ Z for u ∈ S while l_{u^*}(u^*) ∉ Z, we get l_{u^*}(u) ≥ ⌈ l_{u^*}(u^*) ⌉ for all u ∈ S. □

Case 2: In this case, we will use classical cuts to separate u∗ and S. Lift-and-
project (LAP) cut, one of the classical cuts, is introduced in Algorithm 2, the
reader can refer to [1,2] for more details.

Algorithm 2. LAP cut
Input: u^* ∈ V(K) \ S.
Output: LAP cut.
1  Step 1: (Index selection) Select an index j ∈ {1, · · · , n} with x_j^* ∉ Z;
2  Step 2: (Cut generation)
3  Let C_j be the m × (n + q − 1) matrix obtained from the matrix [A|B] by removing the j-th column a_j;
4  Let D̃_j be the m × (n + q) zero matrix whose j-th column is set to a_j − b;
5  Set C̃_j ← [A|B] − D̃_j;
6  Solve the linear program
       max{ v b − (w D̃_j + v C̃_j) u^* : w C_j − v C_j = 0, (w, v) ≥ 0 }
   to get its solution (w^*, v^*).
7  The LAP cut is defined by (w^* D̃_j + v^* C̃_j) u ≥ v^* b, which separates u^* from S.

3.3 DC Cut from Feasible Solution

Let u∗ be a feasible solution of (P t ) obtained by DCA (i.e., u∗ ∈ S), then it


must be a local minimizer of (P t ). The next theorem provides a cutting plane.

Theorem 3. Let u^* ∈ S be a feasible local minimizer of (P_t). Then the following inequality cuts off u^* from S and retains all better feasible solutions in S:

    l_{u^*}(u) ≥ 1.                                           (3)



Proof. First, since l_{u^*}(u^*) = 0, u^* does not satisfy inequality (3). Second, let C_1 = {(x^*, y) : (x^*, y) ∈ K}; the problem

    min{ τ_t(u) : u ∈ C_1 }                                   (P_1)

reduces to the linear program

    min{ f(u) : u ∈ C_1 }                                     (P_2)

since p(u) = 0 for all u ∈ C_1. Moreover, since u^* is a local minimizer of (P_t), it is also a local minimizer of (P_1) and (P_2), and any local minimizer of a linear program is a global minimizer; thus u^* globally solves (P_2), i.e., f(u^*) ≤ f(u) for all u ∈ C_1. Therefore, no better feasible solution of S is contained in C_1, which implies l_{u^*}(u) ≥ 1 for all u ∈ S \ C_1. □
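A minimal sketch of how the two DC cuts could be assembled from a DCA output u^* = (x^*, y^*) follows. It is illustrative only: the cut is returned here as a hypothetical pair (coef, rhs) meaning coef·(x, y) ≥ rhs, with zero coefficients on the y-part.

```python
import math
import numpy as np

def dc_cut(x_star, n_y, feasible):
    """Build the DC cut l_{u*}(u) >= b as (coef, rhs) with coef @ (x, y) >= rhs."""
    coef = np.zeros(len(x_star) + n_y)
    rhs = 0.0
    for j, xj in enumerate(x_star):
        if xj <= 0.5:                 # j in J0(u*): contributes x_j
            coef[j] = 1.0
        else:                         # j in J1(u*): contributes 1 - x_j
            coef[j] = -1.0
            rhs -= 1.0
    # l_{u*}(u) = coef @ u - rhs, so the cut "l >= b" becomes "coef @ u >= rhs + b".
    l_at_ustar = float(coef[:len(x_star)] @ x_star - rhs)     # value of l_{u*}(u*)
    if feasible:                      # Theorem 3: cut away u*, keep better points
        return coef, rhs + 1.0
    return coef, rhs + math.ceil(l_at_ustar)                  # Theorem 2
```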

4 DC-CUT Algorithms
In this section, we will establish DC-CUT algorithms based on DC cuts and the
classical cutting planes presented in previous section.

4.1 DC-CUT Algorithm Without Parallelism


DC-CUT algorithm consists of constructing in each iteration a DC cut or a
classical cut to reduce progressively the set K. During the iterations, once we
find a feasible solution in S, then we update the incumbent solution, and once
the linear relaxation on the reduced set provides a feasible optimal solution in
S, or the reduced set is empty, or the gap between the lower bound and the
upper bound is small enough, then we terminate our algorithm and return the
best feasible solution.
Let us denote V k the set of cutting planes constructed in the k-th iteration.
Let K 0 = K, S 0 = S, then we update K k+1 = K k ∩ V k , and S k+1 = S k ∩ V k .
We refer to the linear relaxation defined on K k as R(P k ) given by

min{f (u) : u ∈ K k },

and the DC program defined on K k is (DCP k ) defined by:

min{τt (u) : u ∈ K k }.

The DC-CUT algorithm is described in Algorithm 3.

4.2 Parallel-DC-CUT Algorithm


Note that each restarting of DCA in line 7 of DC-CUT Algorithm 3 yields either
a DC cut or a LAP cut (the LAP cut could also be other classical cuts). Thus,
if DCA is applied several times in one iteration with different initial points, it is possible to introduce several cutting planes per iteration to reduce the lower bound quickly and to provide more candidates for updating the incumbent upper bound. Considering also the power of multiple CPUs/GPUs, we propose to start DCA simultaneously from random initial points.
The differences between Parallel-DC-CUT Algorithm and DC-CUT
Algorithm 3 are mainly focused on the codes from line 7 to line 19. Suppos-
ing that we want to use s parallel workers, then at the line 7, we can choose
s random initial points in [0, 1]n × [0, ȳ] for starting DCA simultaneously and
construct cutting planes respectively. Once the parallel block is terminated at
line 19, we collect all created cutting planes in V k to update the sets K k and S k .

Algorithm 3. DC-CUT Algorithm
Input: problem (P); penalty parameter t; tolerance ε > 0;
Output: optimal solution (xopt, yopt) and optimal value fval;
1  Initialize: k ← 0; K^0 ← K; S^0 ← S; UB ← +∞; xopt ← null; yopt ← null;
2  Solve R(P^0) to obtain its optimal solution u = (x, y) and update LB;
3  if x ∈ S then
4      xopt ← x; yopt ← y; fval ← LB; return;
5  else
6      while |UB − LB| > ε do
7          Set u as the initial point for starting DCA for (DCP^k), and get its solution u^* = (x^*, y^*);
8          if u^* ∉ S^k then
9              if p(u^*) ∉ Z and x_i^* ≠ 1/2 ∀i ∈ {1, . . . , n} then
10                 use inequality (2) to add a DC cut to V^k;
11             else
12                 add a LAP cut to V^k;
13             end
14         else
15             if u^* is a better feasible solution then
16                 xopt ← x^*; yopt ← y^*; UB ← f(x^*, y^*); fval ← UB;
17             end
18             use inequality (3) to add a DC cut to V^k;
19         end
20         K^{k+1} ← K^k ∩ V^k; S^{k+1} ← S^k ∩ V^k; k ← k + 1;
21         solve R(P^k) to obtain u = (x, y) and LB;
22         if R(P^k) is infeasible or LB ≥ UB then
23             return the current best solution (xopt, yopt) and fval;
24         else
25             if u ∈ S^k and LB < UB then
26                 xopt ← x; yopt ← y; UB ← LB; fval ← UB; return;
27             end
28         end
29     end
30 end
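In an LP-based implementation, the update K^{k+1} ← K^k ∩ V^k simply appends the new cut rows to the constraint matrix of the relaxation before re-solving it. The following sketch is illustrative only: cuts are stored as hypothetical pairs (coef, rhs) meaning coef·u ≥ rhs, and the relaxation is kept in the ≤ form expected by scipy.optimize.linprog.

```python
import numpy as np
from scipy.optimize import linprog

def add_cuts(A_ub, b_ub, cuts):
    # Each cut "coef . u >= rhs" becomes "-coef . u <= -rhs" in <= form.
    for coef, rhs in cuts:
        A_ub = np.vstack([A_ub, -np.asarray(coef)])
        b_ub = np.append(b_ub, -rhs)
    return A_ub, b_ub

def solve_relaxation(c_obj, A_ub, b_ub, bounds):
    res = linprog(c=c_obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res                         # res.x is u = (x, y), res.fun is the lower bound LB
```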

4.3 Variant DC-CUT and Parallel-DC-CUT Algorithms


More efficient algorithms can be derived by introducing more cutting planes in each iteration in order to reduce the set K more quickly. For example, instead of adding one cut per iteration of the DC-CUT Algorithm, we can add several cuts, including DC cuts, LAP cuts and Gomory's cuts, whenever they are constructible.
This strategy helps to update the lower bounds more quickly. However, more cutting planes increase the number of constraints, which will potentially increase the difficulty of solving the subproblems. Moreover, it is also possible that some of these cuts are redundant or inefficient (lazy cuts) for updating the lower bound. Therefore, we should switch off the variant strategy when the lower bound updating rate is too small, and turn it on when the updating rate is large enough.

5 Experimental Results
In this section, we report some numerical results of our proposed algorithms. Our
algorithms are implemented in MATLAB, using parfor for parallel comput-
ing. The linear subproblems are solved by Gurobi 8.1.0 [3]. The experiments are
performed on a laptop equipped with 2 Intel i5-6200U 2.30GHz CPU (4 cores)
and 8 GB RAM, thus we use 4 workers for tests in parallel computing.

Fig. 1. Results of different DC-CUT algorithms: (a) DC-CUT, (b) Parallel-DC-CUT, (c) Variant DC-CUT, (d) Variant Parallel-DC-CUT.



We first illustrate the test results for a pure binary linear program exam-
ple with 10 binary variables and 10 linear constraints. The optimal value is
0. Figure 1 illustrates the updates of upper bounds (solid cycle line) and lower
bounds (dotted square line) with respect to the iterations of four DC-CUT Algo-
rithms (DC-CUT, Parallel-DC-CUT, Variant DC-CUT, and Variant Parallel-
DC-CUT Algorithm). Comparing the cases (b) and (d) with parallelism to the
cases (a) and (c) without parallelism, we observe that by introducing paral-
lelism, DCA can find a global optimal solution more quickly, and the number
of required iterations is reduced. The computing times for these four algorithms
are respectively (a) 3.4s, (b) 2.3s, (c) 2.6s and (d) 1.6s. Clearly, the parallel
cases (b) and (d) are faster than the cases (a) and (c) without parallelism, and
the variant methods (c) and (d) with more cutting planes introduced in each
iteration performed better in general. The fastest algorithm is the last case (d).
Moreover, in this test example, it is possible to produce negative gap as in
case (c), since DC cut is a kind of local cut which can cut off some feasible
solutions such that the lower bound on the reduced set could be greater than
the current upper bound. This feature is particular to the DC cut, which is quite different from classical global cuts such as lift-and-project. Therefore, the DC cut can often provide a deeper cut than the global cuts, so it plays an important role in accelerating the convergence of the cutting plane method.
Another important feature of our algorithms is the possibility of finding a global optimal solution without closing the gap between the upper and lower bounds. We can observe that cases (a), (b) and (d) terminate without a small gap. This
is due to the fact that the introduction of cutting planes yields an empty set,
thus no more computations are required.
More numerical test results for large-scale cases will be reported in the full-
length paper.

6 Conclusion and Perspectives

In this paper, we have investigated the construction of DC cuts and established


four DC-CUT algorithms with and without parallelism for solving MBLP. DC
cut is a local cut which often provides a deeper cutting effect than classical
global cuts, and helps to terminate our cutting plane algorithms more quickly
without reducing to zero gap between upper and lower bounds. By introducing
parallelism and adding more different types of cutting planes in each iteration,
the performance of DC-CUT algorithms are significantly improved.
Concerning future work, we need more tests of our algorithms compared with state-of-the-art MBLP solvers such as Gurobi and Cplex on large-scale cases and real-world applications. Next, we will improve our algorithms by introducing different types of cutting planes such as Gomory's cuts, mixed-integer rounding cuts, knapsack cuts, etc. It is worth investigating the performance of DC-CUT algorithms when more global cuts are introduced in each iteration. Moreover, we will extend the DC cut to general integer cases and nonlinear cases.

References
1. Balas, E., Ceria, S., Cornuéjols, G.: A lift-and-project cutting plane algorithm for
mixed 0–1 programs. Math. Program. 58, 295–324 (1993)
2. Cornuéjols, G.: Valid inequalities for mixed integer linear programs. Math. Pro-
gram. 112(1), 3–44 (2008)
3. Gurobi 8.1.0. http://www.gurobi.com
4. Le Thi, H.A., Pham, D.T., Le Dung, M.: Exact penalty in dc programming. Viet-
nam J. Math. 27(2), 169–178 (1999)
5. Le Thi, H.A., Pham, D.T.: The DC (difference of convex functions) programming
and DCA revisited with DC models of real world nonconvex optimization problems.
Ann. Oper. Res. 133, 23–46 (2005)
6. Le Thi, H.A., Nguyen, Q.T., Nguyen, H.T., Pham, D.T.: Solving the earliness
tardiness scheduling problem by DC programming and DCA. Math. Balk. 23(3–
4), 271–288 (2009)
7. Le Thi, H.A., Moeini, M., Pham, D.T.: Portfolio selection under downside risk mea-
sures and cardinality constraints based on DC programming and DCA. Comput.
Manag. Sci. 6(4), 459–475 (2009)
8. Le Thi, H.A., Pham, D.T., Huynh, V.N.: Exact penalty and error bounds in dc
programming. J. Glob. Optim. 52(3), 509–535 (2012)
9. Le Thi, H.A., Pham, D.T.: DC programming and DCA: thirty years of develop-
ments. Math. Program. 169(1), 5–68 (2018)
10. Ndiaye, B.M., Le Thi, H.A., Pham, D.T., Niu, Y.S.: DC programming and DCA for
large-scale two-dimensional packing problems. In: Pan, J.S., Chen, S.M., Nguyen,
N.T. (eds.) Intelligent Information and Database Systems, LNCS, vol. 7197, pp.
321–330, Springer, Berlin (2012). https://doi.org/10.1007/978-3-642-28490-8 34
11. Nguyen, V.V.: Méthodes exactes pour l’optimisation DC polyédrale en variables
mixtes 0-1 basées sur DCA et des nouvelles coupes. Ph.D. thesis, INSA de Rouen
(2006)
12. Nguyen, Q.T.: Approches locales et globales basées sur la programmation DC et
DCA pour des problèmes combinatoires en variables mixtes 0–1, applications à la
planification opérationnelle. These de doctorat dirigée par Le Thi H.A, Informa-
tique Metz (2010)
13. Niu, Y.S., Pham, D.T.: A DC Programming Approach for Mixed-Integer Linear
Programs. In: Le Thi, H.A., Bouvry, P., Pham, D.T. (eds.) Modelling, Computa-
tion and Optimization in Information Systems and Management Sciences (MCO
2008), Communications in Computer and Information Science, vol. 14, pp. 244–
253. Springer, Berlin (2008). https://doi.org/10.1007/978-3-540-87477-5 27
14. Niu, Y.S.: Programmation DC & DCA en Optimisation Combinatoire et Optimi-
sation Polynomiale via les Techniques de SDP–Codes et Simulations Numériques.
Ph.D. thesis, INSA-Rouen, France (2010)
15. Niu, Y.S., Pham D.T.: Efficient DC programming approaches for mixed-integer
quadratic convex programs. In: International Conference on Industrial Engineering
and Systems Management (IESM 2011), pp. 222–231 (2011)
16. Niu, Y.S.: On combination of DCA branch-and-bound and DC-Cut for solving
mixed 0-1 linear program. In: 21st International Symposium on Mathematical Pro-
gramming (ISMP 2012). Berlin (2012)
17. Niu, Y.S.: A parallel branch and bound with DC algorithm for mixed integer
optimization. In: 23rd International Symposium in Mathematical Programming
(ISMP 2018). Bordeaux, France (2018)

18. Pham, D.T., Le Thi, H.A.: Convex analysis approach to D.C. programming: theory,
algorithm and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)
19. Pham, D.T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-
region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
20. Pham, D.T., Le Thi, H.A., Pham, V.N., Niu, Y.S.: DC programming approaches
for discrete portfolio optimization under concave transaction costs. Optim. Lett.
10(2), 261–282 (2016)
21. Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E.,
Thatcher, J.W. (eds.) Complexity of Computer Computations, The IBM Research
Symposia Series, pp. 85–103. Springer, Boston (1972). https://doi.org/10.1007/
978-1-4684-2001-2 9
22. Schleich, J., Le Thi, H.A., Bouvry, P.: Solving the minimum M-dominating set
problem by a continuous optimization approach based on DC programming and
DCA. J. Comb. Optim. 24(4), 397–412 (2012)
Sentence Compression via DC
Programming Approach

Yi-Shuai Niu1,2(B) , Xi-Wei Hu2 , Yu You1 , Faouzi Mohamed Benammour1 ,


and Hu Zhang1
1 School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
niuyishuai@sjtu.edu.cn
2 SJTU-Paristech Elite Institute of Technology, Shanghai Jiao Tong University, Shanghai, China

Abstract. Sentence compression is an important problem in natural


language processing. In this paper, we firstly establish a new sentence
compression model based on the probability model and the parse tree
model. Our sentence compression model is equivalent to an integer lin-
ear program (ILP) which can both guarantee the syntax correctness of
the compression and save the main meaning. We propose using a DC
(Difference of convex) programming approach (DCA) for finding local
optimal solutions of our model. Combining DCA with a parallel branch-and-bound framework, we can find a global optimal solution. Numerical
results demonstrate the good quality of our sentence compression model
and the excellent performance of our proposed solution algorithm.

Keywords: Sentence compression · Probability model ·


Parse Tree Model · DCA · Parallel-branch-and-bound

1 Introduction
Recent years have seen a quick evolution of artificial intelligence (AI) technologies, and sentence compression problems have attracted the attention of researchers due to the necessity of dealing with a huge amount of natural language information in a very short response time. The general idea of
sentence compression is to make a summary with shorter sentences containing
the most important information while maintaining grammatical rules. Nowadays,
there are various technologies involving sentence compression as: text summa-
rization, search engine and question answering etc. Sentence compression will be
a key technology in future human-AI interaction systems.
There are various models proposed for sentence compression. The paper of
Jing [3] could be one of the first works addressed on this topic with many rewrit-
ing operations as deletion, reordering, substitution, and insertion. This approach
The research is funded by Natural Science Foundation of China (Grant No: 11601327)
and by the Key Construction National “985” Program of China (Grant No:
WF220426001).

is realized based on multiple knowledge resources (such as WordNet and parallel


corpora) to find the parts that cannot be removed if they are detected to be
grammatically necessary by using some simple rules. Later, Knight and Marcu
investigated discriminative models [4]. They proposed a decision-tree model to
find the intended words through a tree rewriting process, and a noisy-channel
model to construct a compressed sentence from some scrambled words based on
the probability of mistakes. MacDonald [12] presented a sentence compression
model using a discriminative large margin algorithm. He ranks each candidate
compression using a scoring function based on the Ziff-Davis corpus using a
Viterbi-like algorithm. The model has a rich feature set defined over compres-
sion bigrams including parts of speech, parse trees, and dependency informa-
tion, without using a synchronous grammar. Clarke and Lapata [1] reformu-
lated McDonald’s model in the context of integer linear programming (ILP) and
extended with constraints ensuring that the compressed output is grammati-
cally and semantically well formed. The corresponding ILP model is solved by
branch-and-bound algorithm.
In this paper, we will propose a new sentence compression model to both
guarantee the grammatical rules and preserve main meaning. The main contri-
butions of this work are: (1) Taking advantage of the parse tree model and the probability model, we hybridize them to build a new model that can be formulated as an ILP. Using the parse tree model, we can extract the sentence trunk, then fix the corresponding integer variables in the probability model to derive a simplified ILP with improved quality of the compressed result. (2) We propose to use a DC programming approach called PDCABB (a hybrid algorithm combining DCA with a parallel branch-and-bound framework), developed by Niu in [17], for solving our sentence compression model. This approach can often provide a high
quality optimal solution in a very short time.
The paper is organized as follows: The Sect. 2 is dedicated to establish hybrid
sentence compression model. In Sect. 3, we will present DC programming app-
roach for solving ILP. The numerical simulations and the experimental setup
will be reported in Sect. 4. Some conclusions and future works will be discussed
in the last section.

2 Hybrid Sentence Compression Model


Our sentence compression model is based on an Integer Linear Programming
(ILP) probability model [1], and a parsing tree model. In this section, we will
give a brief introduction of the two models, and propose our new hybrid model.

2.1 ILP Probability Model

Let x = {x1 , x2 , . . . , xn } be a sentence with n ≥ 2 words.1 We add x0 =‘start’ as


the start token and xn+1 =‘end’ as the end token.

1
Punctuation is also deemed as word.

The sentence compression is to choose a subset of words in x for maximizing


its probability to be a sentence under some restrictions to the allowable trigram
combinations. This probability model can be described as an ILP as follows:
Decision variables: We introduce binary decision variables δ_i, i ∈ [[1, n]],² for each word x_i: δ_i = 1 if x_i is in a compression and 0 otherwise. In order to take context information into consideration, we introduce the context variables (α, β, γ) such that: ∀i ∈ [[1, n]], α_i = 1 if x_i starts a compression and 0 otherwise; ∀i ∈ [[0, n − 1]], j ∈ [[i + 1, n]], β_ij = 1 if the sequence x_i, x_j ends a compression and 0 otherwise; and ∀i ∈ [[0, n − 2]], j ∈ [[i + 1, n − 1]], k ∈ [[j + 1, n]], γ_ijk = 1 if the sequence x_i, x_j, x_k is in a compression and 0 otherwise. There are in total (n³ + 3n² + 14n)/6 binary variables for (δ, α, β, γ).
Objective function: The objective function is to maximize the probability of
the compression computed by:
    f(α, β, γ) = Σ_{i=1}^{n} α_i P(x_i | start) + Σ_{i=1}^{n−2} Σ_{j=i+1}^{n−1} Σ_{k=j+1}^{n} γ_ijk P(x_k | x_i, x_j) + Σ_{i=0}^{n−1} Σ_{j=i+1}^{n} β_ij P(end | x_i, x_j),

where P (xi |start) stands for the probability of a sentence starting with xi ,
P (xk |xi , xj ) denotes the probability that xi , xj , xk successively occurs in a sen-
tence, and P (end|xi , xj ) means the probability that xi , xj ends a sentence. The
probability P (xi |start) is computed by bigram model, and the others are com-
puted by trigram model based on some corpora.
Constraints: The following sequential constraints will be introduced to restrict
the possible trigram combinations:
Constraint 1 Exactly one word can begin a sentence:

    Σ_{i=1}^{n} α_i = 1.                                      (1)

Constraint 2 If a word is included in a compression, it must either start the sentence, or be preceded by two other words, or be preceded by the ‘start’ token and one other word:

    δ_k − α_k − Σ_{i=0}^{k−2} Σ_{j=i+1}^{k−1} γ_ijk = 0,   ∀k ∈ [[1, n]].         (2)

2
[[m, n]] with m ≤ n stands for the set of integers between m and n.

Constraint 3 If a word is included in a compression, it must either be preceded by one word and followed by another, or be preceded by one word and end the sentence:

    δ_j − Σ_{i=0}^{j−1} Σ_{k=j+1}^{n} γ_ijk − Σ_{i=0}^{j−1} β_ij = 0,   ∀j ∈ [[1, n]].         (3)

Constraint 4 If a word is in a compression, it must either be followed by two words, or be followed by one word and end the sentence:

    δ_i − Σ_{j=i+1}^{n−1} Σ_{k=j+1}^{n} γ_ijk − Σ_{j=i+1}^{n} β_ij − Σ_{h=0}^{i−1} β_hi = 0,   ∀i ∈ [[1, n]].         (4)

Constraint 5 Exactly one word pair can end the sentence:

    Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} β_ij = 1.                     (5)

Constraint 6 The length of a compression should be bounded:

    l ≤ Σ_{i=1}^{n} δ_i ≤ l̄,                                  (6)

with given lower and upper bounds l and l̄ on the compression length.
Constraint 7 The introducing term for a preposition phrase (PP) or a subordinate clause (SBAR) must be included in the compression if any word of the phrase is included; otherwise, the phrase should be entirely removed. Let I_i = {j : x_j ∈ PP/SBAR, j ≠ i} denote the index set of the words of the PP/SBAR led by the introducing term x_i; then

    Σ_{j∈I_i} δ_j ≥ δ_i,   δ_i ≥ δ_j, ∀j ∈ I_i.               (7)

ILP probability model: The optimization model for sentence compression is


summarized as the binary linear program:

    max{ f(α, β, γ) : (1)–(7), (α, β, γ, δ) ∈ {0, 1}^{(n³+3n²+14n)/6} },         (8)

with O(n³) binary variables and O(n) linear constraints.
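For illustration, a fragment of this ILP can be written with a generic MILP modeler such as PuLP. This is a partial sketch, not the authors' implementation: only constraints (1), (5) and (6) are shown, and the probability tables P_start, P_tri and P_end are assumed to be supplied by the language model.

```python
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, lpSum

def build_ilp(n, P_start, P_tri, P_end, l_min, l_max):
    prob = LpProblem("sentence_compression", LpMaximize)
    d = [LpVariable(f"delta_{i}", cat=LpBinary) for i in range(1, n + 1)]
    a = [LpVariable(f"alpha_{i}", cat=LpBinary) for i in range(1, n + 1)]
    b = {(i, j): LpVariable(f"beta_{i}_{j}", cat=LpBinary)
         for i in range(0, n) for j in range(i + 1, n + 1)}
    g = {(i, j, k): LpVariable(f"gamma_{i}_{j}_{k}", cat=LpBinary)
         for i in range(0, n - 1) for j in range(i + 1, n) for k in range(j + 1, n + 1)}
    # Objective: probability of the compression (start bigram + trigram terms + end terms).
    prob += (lpSum(a[i - 1] * P_start[i] for i in range(1, n + 1))
             + lpSum(v * P_tri[key] for key, v in g.items())
             + lpSum(v * P_end[key] for key, v in b.items()))
    prob += lpSum(a) == 1                                            # Constraint 1
    prob += lpSum(v for (i, j), v in b.items() if i >= 1) == 1       # Constraint 5
    prob += lpSum(d) >= l_min                                        # Constraint 6
    prob += lpSum(d) <= l_max
    return prob
```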


The advantage of this model is that its solution will provide a compression
with maximal probability based on the trigram model. However, there is no
information about syntactic structures of the target sentence, so it is possible to
generate ungrammatical sentences. In order to overcome this disadvantage, we
propose to combine it with the parse tree model presented below.

2.2 Parse Tree Model


A parse tree is an ordered, rooted tree which reflects the syntax of the input lan-
guage based on some grammar rules (e.g., using a context-free grammar, CFG). For constructing a parse tree in practice, we can use the natural language processing toolkit NLTK [19] in Python. Based on NLTK, we have developed a CFG grammar generator which automatically generates a CFG grammar for a target sentence. A recursive descent parser then helps to build the parse tree.
For example, the sentence “The man saw the dog with the telescope.” can
be parsed as in Fig. 1. It is observed that a higher level node in the parse tree
indicates more important sentence components (e.g., the sentence S consists of
a noun phrase NP, a verb phrase VP, and a symbol SYM), whereas a lower
node tends to carry more semantic content (e.g., the prepositional phrase PP consists of the preposition ‘with’ and the noun phrase ‘the telescope’). Therefore, a
parse tree presents the clear structure of a sentence in a logical way.

Fig. 1. Parse tree example

Sentence compression can be also considered as finding a subtree which


remains grammatically correct and contains the main meaning of the original sentence. Therefore, we can propose a procedure to delete some nodes in the parse
tree. For instance, the sentence above can be compressed as “The man saw the
dog.” by deleting the node PP.
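For instance, a parse of the example sentence can be reproduced with NLTK along the following lines. This is a small sketch; the toy grammar below is only illustrative and is not the grammar produced by the authors' generator.

```python
import nltk

# Toy CFG covering the example sentence only (illustrative).
grammar = nltk.CFG.fromstring("""
  S   -> NP VP SYM
  NP  -> Det N
  VP  -> V NP | V NP PP
  PP  -> P NP
  Det -> 'The' | 'the'
  N   -> 'man' | 'dog' | 'telescope'
  V   -> 'saw'
  P   -> 'with'
  SYM -> '.'
""")

parser = nltk.RecursiveDescentParser(grammar)
tokens = ['The', 'man', 'saw', 'the', 'dog', 'with', 'the', 'telescope', '.']
for tree in parser.parse(tokens):
    tree.pretty_print()   # deleting the PP node yields "The man saw the dog."
```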

2.3 New Hybrid Model: ILP-Parse Tree Model


Our proposed model for sentence compression, called ILP-Parse Tree Model
(ILP-PT), is based on the combination of the two models described above. The
ILP model will provide some candidates for compression with maximal proba-
bility, while the parse tree model helps to guarantee the grammar rules and keep
the main meaning of the sentence. This combination is described as follows:
Step 1 (Build ILP probability model): Build the ILP model as in formulation (8) for the target sentence.
Step 2 (Parse sentence): Build a parse tree as described in Subsect. 2.2.
Step 3 (Fix variables for sentence trunk): Identify the sentence trunk in the parse tree and fix the corresponding integer variables to 1 in the ILP model.

This step helps to extract the sentence trunk by keeping the main meaning of
the original sentence while reducing the number of binary decision variables.
More precisely, we will introduce for each node Ni of the parse tree a label
sNi taking the values in {0, 1, 2}. A value 0 represents the deletion of the node;
1 represents the reservation of the node; whereas 2 indicates that the node can
either be deleted or be reserved. We set these labels as compression rules for
each CFG grammar to support any sentence type of any language.
For a word x_i, we go through all its ancestor nodes up to the root S. If the traversal path contains a 0, then δ_i = 0; else if the traversal path contains only 1's, then δ_i = 1; otherwise δ_i will be determined by solving the ILP model. The sentence trunk is composed of the words x_i whose δ_i are fixed to 1. Using this method, we can extract the sentence trunk and reduce the number of binary variables in the ILP model.
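A small sketch of this label propagation follows; it is illustrative only. The parse tree is represented here as nested (label, children) tuples with words at the leaves, and the rules table mapping node labels to {0, 1, 2} is a hypothetical stand-in for the authors' per-grammar compression rules.

```python
def fix_deltas(tree, rules, path=()):
    """Yield (word, delta) pairs; delta is 0, 1 or None (None = left to the ILP solver)."""
    if isinstance(tree, str):                      # a leaf word x_i
        if 0 in path:
            yield tree, 0                          # some ancestor is marked "delete"
        elif all(s == 1 for s in path):
            yield tree, 1                          # every ancestor is marked "keep"
        else:
            yield tree, None                       # decided by the ILP
        return
    label, children = tree
    for child in children:
        yield from fix_deltas(child, rules, path + (rules.get(label, 2),))

# Example with hypothetical rules: keep S/NP/VP, let PP be optional (label 2).
rules = {'S': 1, 'NP': 1, 'VP': 1, 'PP': 2}
tree = ('S', [('NP', ['The', 'man']),
              ('VP', ['saw', ('NP', ['the', 'dog']),
                      ('PP', ['with', ('NP', ['the', 'telescope'])])]),
              '.'])
print(list(fix_deltas(tree, rules)))
```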
Step 4 (Solve ILP): Applying an ILP solution algorithm to solve the simplified
ILP model derived in Step 3 and generate a compression. In the next section,
we will introduce a DC programming approach for solving ILP.

3 DC Programming Approach for Solving ILP

Solving an ILP is in general NP-hard. A classical and most frequently used


method is branch-and-bound algorithm as in [1]. Gurobi [2] is currently one of
the best ILP solvers using branch-and-bound combing various techniques such
as presolve, cutting planes, heuristics and parallelism etc.
In this section, we will introduce a Difference of Convex (DC) program-
ming approach, called Parallel-DCA-Branch-and-Bound (PDCABB), for solving
this model. Combining DCA and branch-and-bound without parallelism (called DCABB) was first proposed for solving zero-one programming problems in [6], and was later applied to the strategic supply chain design problem [18]. DCABB proposed for solving general mixed-integer linear programming (MILP) is developed in [13], and extended for solving mixed-integer nonlinear programming
[14,15] with various applications including scheduling [8], network optimization
[21], cryptography [10] and finance [9,20] etc. This algorithm is based on con-
tinuous representation techniques for integer set, exact penalty theorem, DCA
and Branch-and-Bound algorithms. Recently, the author developed a parallel
branch-and-bound framework [17] in order to use the power of multiple CPU
and GPU for improving the performance of DCABB.
The ILP model can be stated in standard matrix form as:

    min{ f(x) := c⊤x : x ∈ S }                                (P)

where S = {x ∈ {0, 1}n : Ax = b}, c ∈ Rn , b ∈ Rm and A ∈ Rm×n . Let us


denote K the linear relaxation of S defined by K = {x ∈ [0, 1]n : Ax = b}. Thus,
we have the relationship between S and K as S = K ∩ {0, 1}n .

Let us denote R(P ) the linear relaxation of (P ) defined as

min{f (x) : x ∈ K}

whose optimal value l(P ) is a lower bound of (P ).


The continuous representation technique for integer set {0, 1}n consists of
finding a continuous DC function3 p : Rn → R such that

{0, 1}n ≡ {x : p(x) ≤ 0}.

We often use the following functions for p with their DC components:

Function type Expression of p DC components of p


n
Piecewise linear min{xi , 1 − xi } g(x) = 0, h(x) = −p(x)
i=1
n
Quadratic xi (1 − xi )
i=1
n 2
Trigonometric i=1 sin (πxi ) g(x) = π 2 x2 , h(x) = g(x) − p(x)

Based on the exact penalty theorem [5,11], there exists a large enough param-
eter t ≥ 0 such that the problem (P ) is equivalent to the problem (P t ):

min{Ft (x) := f (x) + tp(x) : x ∈ K}. (P t )

The objective function Ft : Rn → R in (P t ) is also DC with DC components


gt and ht defined as gt (x) = tg(x), ht (x) = th(x) − f (x) where g and h are DC
components of p. Thus the problem (P t ) is a DC program which can be solved
by DCA, which is simply described by the following scheme:

    x^{i+1} ∈ arg min{ g(x) − ⟨x, y^i⟩ : x ∈ K }
with y i ∈ ∂h(xi ). The symbol ∂h(xi ) denotes the subdifferential of h at xi which


is fundamental in convex analysis. The subdifferential generalizes the derivative
in the sense that h is differentiable at xi if and only if ∂h(xi ) reduces to the
singleton {∇h(xi )}.
Concerning the choice of the penalty parameter t, we suggest two methods: the first is to take an arbitrarily large value for t; the second is to increase t in some way along the iterations of DCA (e.g., [14,20]). Note that a smaller parameter t yields a better DC decomposition [16].
DCA often provides an integer solution for (P), thus it often serves as an upper bound algorithm. More details about DCA and its convergence theorem can be found in [7]. Due to the length limitation of this proceedings paper, the
combination of DCA with a parallel-branch-and-bound algorithm (PDCABB)
proposed in [17] as well as its convergence theorem, branching strategies, parallel
node selection strategies ... will be discussed in our full-length paper.
3
A function f : Rn → R is called DC if there exist two convex functions g and h
(called DC components) such that f = g − h.

4 Experimental Results
In this section, we present our experimental results for assessing the performance
of the sentence compression model described above.
Our sentence compression model is implemented in Python as a Natural
Language Processing package, called ‘NLPTOOL’ (actually supporting multi-
language tokenization, tagging, parsing, automatic CFG grammar generation,
and sentence compression), which uses NLTK 3.2.5 [19] for creating parse
trees and Gurobi 8.1.0 [2] for solving the linear relaxation problems R(Pi ) and
the convex optimization subproblems in Step 2 of DCA. The PDCABB algorithm
is implemented in C++ and invoked in python. The parallel computing part in
PDCABB is realized by OpenMP.

4.1 F-score Evaluation


We use a statistical approach called F-score to evaluate the similarity between
the compression computed by our algorithm and a standard compression pro-
vided by human. Let us denote A as the number of words both in the compressed
result and the standard result, B as the number of words in the standard result
but not in the compressed result, and C as the number of words in the com-
pressed result but not in the standard result. Then F-score is defined by:
    F_μ = (μ² + 1) × (P × R) / (μ² × P + R),

where P and R denote the precision rate and the recall rate, P = A/(A + C) and R = A/(A + B). The parameter μ, called the preference parameter, expresses the preference between precision and recall when evaluating the quality of the results. F_μ is a strictly monotonic function on [0, +∞[ with lim_{μ→0} F_μ = P and lim_{μ→+∞} F_μ = R. In our tests, we use F_1 as the F-score. Clearly, a bigger F-score indicates a better compression.
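A direct translation of this measure is shown below; it is a small sketch in which the inputs are the word lists of the system compression and of the human reference, and word multiplicities are ignored for brevity.

```python
def f_score(compressed_words, standard_words, mu=1.0):
    compressed, standard = set(compressed_words), set(standard_words)
    A = len(compressed & standard)          # words in both results
    B = len(standard - compressed)          # words only in the standard result
    C = len(compressed - standard)          # words only in the compressed result
    P = A / (A + C) if A + C else 0.0       # precision rate
    R = A / (A + B) if A + B else 0.0       # recall rate
    if P == 0.0 and R == 0.0:
        return 0.0
    return (mu**2 + 1) * P * R / (mu**2 * P + R)

print(f_score(['the', 'man', 'saw', 'the', 'dog'],
              ['the', 'man', 'saw', 'the', 'dog', 'with', 'the', 'telescope']))
```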

4.2 Numerical Results


Table 1 illustrates the compression result of 100 sentences obtained by two ILP
compression models: our new hybrid model (H) vs. the probability model (P).
Penn Treebank corpus (Treebank) provided in NLTK and CLwritten corpus
(Clarke) provided in [1] are used for sentence compression. We applied Kneser-
Ney Smoothing for computing trigram probabilities. The compression rates4 are
given by 50%, 70% and 90%. We compare the average solution time and the aver-
age F-score for these models solved by Gurobi and PDCABB. The experiments
are performed on a laptop equipped with 2 Intel i5-6200U 2.30GHz CPU (4
cores) and 8 GB RAM. It can be observed that our hybrid model often provides
4
The compression rate is computed by the length of compression over the length of
original sentence.

better F-scores in average for all compression rates, while the computing time
for both Gurobi and PDCABB are all very short within less than 0.2 s. We can
also see that Gurobi and PDCABB provided different solutions since F-scores
are different. This is due to the fact that branch-and-bound algorithm find only
approximate global solutions when the gap between upper and lower bounds is
small enough. Even both of the solvers provide global optimal solutions, these
solutions could be also different since the global optimal solution for ILP could
be not unique. However, the reliability of our judgment can be still guaranteed
since these two algorithms provided very similar F-score results.

Table 1. Compression results

Corpus+Model | Solver  | 50% compr. rate         | 70% compr. rate         | 90% compr. rate
             |         | F-score (%)  Time (s)   | F-score (%)  Time (s)   | F-score (%)  Time (s)
Treebank+P   | Gurobi  | 56.5         0.099      | 72.1         0.099      | 79.4         0.081
             | PDCABB  | 59.1         0.194      | 76.2         0.152      | 80.0         0.122
Treebank+H   | Gurobi  | 79.0         0.064      | 82.6         0.070      | 81.3         0.065
             | PDCABB  | 79.9         0.096      | 82.7         0.171      | 82.1         0.121
Clarke+P     | Gurobi  | 70.6         0.087      | 80.2         0.087      | 80.0         0.071
             | PDCABB  | 81.4         0.132      | 80.0         0.128      | 81.2         0.087
Clarke+H     | Gurobi  | 77.8         0.046      | 85.5         0.052      | 82.4         0.041
             | PDCABB  | 79.9         0.081      | 85.2         0.116      | 82.3         0.082

The box-plots given in Fig. 2 demonstrate the variation of F-scores for the different models with different corpora. We observe that our hybrid model (Treebank+H and Clarke+H) provides better F-scores on average and is more stable, while the quality of the compressions given by the probability model is worse and varies a lot.

Fig. 2. Box-plots for different models vs. F-scores

Moreover, the choice of corpora affects the compression quality, since the trigram probabilities depend on the corpora. Therefore,


in order to provide more reliable compressions, we have to choose the most
related corpora to compute the trigram probabilities.

5 Conclusion and Perspectives


We have proposed a hybrid sentence compression model ILP-PT based on the
probability model and the parse tree model to guarantee the syntax correctness
of the compressed sentence and save the main meaning. We use a DC program-
ming approach PDCABB to solve our sentence compression model. Experimental
results show that our new model and the solution algorithm can produce high
quality compressed results within a short compression time.
Concerning future work, we are interested in designing a suitable recurrent neural network for sentence compression. With deep learning methods, it is possible to automatically classify sentence types and fundamental structures; it is also possible to detect fixed collocations in a sentence so that the corresponding variables are kept or deleted together. Research in these directions will be reported subsequently.

References
1. Clarke, J., Lapata, M.: Global inference for sentence compression: an integer linear
programming approach. J. Artif. Intell. Res. 31, 399–429 (2008)
2. Gurobi 8.1.0. http://www.gurobi.com
3. Jing, H.: Sentence reduction for automatic text summarization. In: Proceedings of
the 6th Applied Natural Language Processing Conference, pp. 310–315 (2000)
4. Knight, K., Marcu, D.: Summarization beyond sentence extraction: a probabilistic
approach to sentence compression. Artif. Intell. 139, 91–107 (2002)
5. Le Thi, H.A., Pham, D.T., Le Dung, M.: Exact penalty in dc programming. Viet-
nam J. Math. 27(2), 169–178 (1999)
6. Le Thi, H.A., Pham, D.T.: A continuous approach for large-scale constrained
quadratic zero-one programming. Optimization 45(3), 1–28 (2001)
7. Le Thi, H.A., Pham, D.T.: The dc (difference of convex functions) programming
and dca revisited with dc models of real world nonconvex optimization problems.
Ann. Oper. Res. 133, 23–46 (2005)
8. Le Thi, H.A., Nguyen, Q.T., Nguyen, H.T., et al.: Solving the earliness tardiness
scheduling problem by DC programming and DCA. Math. Balk. 23, 271–288 (2009)
9. Le Thi, H.A., Moeini, M., Pham, D.T.: Portfolio selection under downside risk mea-
sures and cardinality constraints based on DC programming and DCA. Comput.
Manag. Sci. 6(4), 459–475 (2009)
10. Le Thi, H.A., Minh, L.H., Pham, D.T., Bouvry, P.: Solving the perceptron problem
by deterministic optimization approach based on DC programming and DCA. In:
Proceeding in INDIN 2009, Cardiff. IEEE (2009)
11. Le Thi, H.A., Pham, D.T., Huynh, V.N.: Exact penalty and error bounds in dc
programming. J. Glob. Optim. 52(3), 509–535 (2012)
12. MacDonald, D.: Discriminative sentence compression with soft syntactic con-
straints. In: Proceedings of EACL, pp. 297–304 (2006)

13. Niu, Y.S., Pham, D.T.: A DC programming approach for mixed-integer linear
programs. In: Modelling, Computation and Optimization in Information Systems
and Management Sciences, CCIS, vol. 14, pp. 244–253 (2008)
14. Niu, Y.S.: Programmation DC & DCA en Optimisation Combinatoire et Optimi-
sation Polynomiale via les Techniques de SDP. Ph.D. thesis, INSA, France (2010)
15. Niu, Y.S., Pham, D.T.: Efficient DC programming approaches for mixed-integer
quadratic convex programs. In: Proceedings of the International Conference on
Industrial Engineering and Systems Management (IESM2011), pp. 222–231 (2011)
16. Niu, Y.S.: On difference-of-SOS and difference-of-convex-SOS decompositions for
polynomials (2018). arXiv:1803.09900
17. Niu, Y.S.: A parallel branch and bound with DC algorithm for mixed integer opti-
mization. In: The 23rd International Symposium in Mathematical Programming
(ISMP2018), Bordeaux, France (2018)
18. Nguyen, H.T., Pham, D.T.: A continuous DC programming approach to the strate-
gic supply chain design problem from qualified partner set. Eur. J. Oper. Res.
183(3), 1001–1012 (2007)
19. NLTK 3.2.5: The Natural Language Toolkit. http://www.nltk.org
20. Pham, D.T., Le Thi, H.A., Pham, V.N., Niu, Y.S.: DC programming approaches
for discrete portfolio optimization under concave transaction costs. Optim. Lett.
10(2), 261–282 (2016)
21. Schleich, J., Le Thi, H.A., Bouvry, P.: Solving the minimum m-dominating set
problem by a continuous optimization approach based on DC programming and
DCA. J. Comb. Optim. 24(4), 397–412 (2012)
Discrete Optimization and Network
Optimization
A Horizontal Method of Localizing Values
of a Linear Function in
Permutation-Based Optimization

Liudmyla Koliechkina1 and Oksana Pichugina2(B)


1
University of Lodz, Uniwersytecka Str. 3, 90-137 Lodz, Poland
liudmyla.koliechkina@wmii.uni.lodz.pl
2
National Aerospace University Kharkiv Aviation Institute, 17 Chkalova Street,
61070 Kharkiv, Ukraine
oksanapichugina1@gmail.com

Abstract. This paper is dedicated to linear constrained optimization


on permutation configurations’ set, namely, to permutation-based sub-
set sum problem (PB-SSP). To this problem, a directed structural graph
is associated connected with a skeleton graph of the permutohedron
and allowing to perform a directed search to solve this linear program.
To solve PB-SSP, a horizontal method for localizing values of a linear
objective function is offered combining Graph Theory tools, geometric
and structural properties of a permutation set mapped into Euclidean
space, the behavior of linear functions on the set, and Branch and Bound
techniques.

Keywords: Discrete optimization · Linear constrained optimization ·


Combinatorial configuration · Permutation · Skeleton graph ·
Grid graph · Search tree

1 Introduction
Combinatorial optimization problems (COPs) with permutation as candidate
solutions commonly known as permutation-based problems [13] can be found in
a variety of application areas such as balancing problems associated with chip
design, ship loading, aircraft outfitting, turbine balancing as well as in geomet-
ric design, facility layout, VLSI design, campus design, assignments, scheduling, routing, process communications, ergonomics, network analysis,
cryptography, etc. [4,9,10,13–15,19,23–25,27,34,35].
Different COPs are representable easily by graph-theoretic approach (GTA)
[1–3,6–8,11]. First of all, it concerns COP on a set E coinciding with a vertex set
of their convex hull P (vertex-located sets, VLSs [28,30]). Such COPs are equiv-
alent to optimization problems on a node set of a skeleton graph G = (E, E)
of the polytope P , where E is an edge set of P . Note that in case if E is not
a VLS, approaches to an equivalent reformulation of the COP as an optimiza-
tion problem on a VLS in higher dimensional space can be applied first [31,32].

The benefits of using the graph-theoretic approach are not limited to simple
illustrations, but also provide an opportunity to develop approaches to solving
COPs based on using configuration graphs [11] and structural graphs [1–3,6–8]
of the problems. Localization of COP-solutions or values of the objective func-
tion is an interesting technique allowing to reduce considerably a search domain
based on deriving specifics of the domain, type of constraints and the objec-
tive function [1,3,6,7,29]. In particular, the method of ordering the values of
objective function on an image En (A) in Rn of a set of n-permutations induced
by a set A is considered in [1]. It consists in constructing a Hamiltonian path in the permutation graph, which is the skeleton graph of the permutohedron Pn(A) = conv(En(A)). In [2], a similar problem is considered on an image
Enk (A) in Euclidean space of a set multipermutations induced by a multiset A
including k different elements. In this case, a skeleton graph of the generalized
permutohedron (the multipermutation graph) is considered instead of the per-
mutation graph. In [8], linear constrained single and multiobjective COPs on
Enk (A) are solved using the multipermutation graph, etc.
This paper is dedicated to developing GTA-techniques [1–3,6–8] for solving
permutation-based COPs (PBOPs) related to localization of objective function
values. Namely, we consider a generalization of the Subset Sum Problem (SSP) [5,12], which is a known NP-complete COP, from the Boolean set B^n as an admissible domain to E_n(A) (further referred to as the permutation-based SSP, PB-SSP). Also, we will consider versions of PB-SSP where a feasible solution x^* is sought (PB-SSP1) or the complete set X^* of solutions is sought (PB-SSP2).

2 The Combinatorial Optimization Problem: Statement and Properties
In general form, a COP can be formulated as follows: there is a set A of k elements

A = {a1, a2, . . . , ak} ⊂ R^1, such that a1 < · · · < ak,    (1)

on which a finite point configuration E = {e1, e2, . . . , eN} ⊂ R^n is given, together with a function f(x) : E → R^1. By an e-configuration e ∈ E [26,30], one can understand a permutation, a partial permutation, a combination, a partition, a composition, a partially ordered set induced by A, etc., considered as a point in R^n. It is required to find an extremum z∗ (maximum or minimum) of f(x) and an extremal x∗ or the set X∗ of extremals where the extremum is attained and additional constraints are satisfied (further referred to as COP1/COP2, respectively). Thus, their formulations are: find

COP1 : z∗ = extr_{x∈E'} f(x),  x∗ = argextr_{x∈E'} f(x);
COP2 : z∗ = extr_{x∈E'} f(x),  X∗ = Argextr_{x∈E'} f(x),
where E' = {x ∈ E : f_i(x) ≤ 0, i ∈ J_m},  J_m = {1, . . . , m}.

A permutation-based COP (PB-COP) is a particular case of a COP, where E ∈ {Πn(A), Πnk(A), En(A), Enk(A)}. Here, Πn(A) is the set of n-permutations induced by a set A = {ai}_{i∈Jn} with ai < ai+1, i ∈ Jn−1; En(A) ⊂ R^n is the image of Πn(A) in Euclidean space; Enk(A) is the image in Euclidean space of the n-multipermutation set Πnk(A) induced by the set (1), k < n. Denoting Enn(A) = En(A), these two are united in the class Enk(A) – the generalized set of e-configurations of permutations [16,17,19,26,27].
E = En(A) has many interesting constructive and geometric peculiarities [1–3,16–19,21,26–28,30–32,34–36], e.g.:
– x_max = argmax_{x∈E} f(x) = (a_i)_{i∈Jn};  x_min = argmin_{x∈E} f(x) = (a_{n−i+1})_{i∈Jn},
  if f(x) = c^T x, c ≠ 0, c_1 ≤ · · · ≤ c_n;    (2)
– X_min = Argmin_{x∈E} f(x) / X_max = Argmax_{x∈E} f(x) is obtained from x_min/x_max by permuting coordinates within groups of coordinates having the same coefficient of f(x), wherefrom
  if c_1 < · · · < c_n ⇒ X_min = {x_min}, X_max = {x_max};    (3)
– E is a VLS;
– E is inscribed in a hypersphere S_r(b) centered at b = (b, . . . , b) ∈ R^n (b ∈ R^1);
– ∀i ∈ J_n, E lies on n parallel hyperplanes H_i = {H_ij}_{j∈Jn}: H_ij = {x ∈ R^n : x_i = a_j}, j ∈ J_n. As a result,
  ∀i ∈ J_n:  E = ∪_{j∈Jn} E_ij,    (4)
  where E_ij = E ∩ H_ij is a set in the class E_{n−1}(J_{n−1}), i, j ∈ J_n;


– P = conv E = P_n(A) is a permutohedron, which is a simple polytope; its H-presentation is

  Σ_{i=1}^{n} x_i = Σ_{i=1}^{n} a_i;
  Σ_{i∈ω} x_i ≥ Σ_{i=1}^{|ω|} a_i,  ∀ω ⊂ J_n    (5)

  (a small verification sketch of (5) is given right after this list);

– a skeleton graph G_n(A) of the permutohedron P_n(A) has all permutations induced by A as its node set; its adjacent vertices differ by an adjacent transposition (i.e., an (a_i, a_{i+1})-transposition);
– any function f : E → R^1 can be extended in a convex way onto an arbitrary convex set K ⊃ E;
– E can be represented analytically in the following ways: (a) by the equation of S_r(b) together with (5);
  (b) Σ_{ω⊆Jn, |ω|=j} Π_{i∈ω} x_i = Σ_{ω⊆Jn, |ω|=j} Π_{i∈ω} a_i,  j ∈ J_n;
  (c) Σ_{i=1}^{n} x_i^j = Σ_{i=1}^{n} a_i^j,  j ∈ J_n;

– if x ∈ E is formed from y ∈ E by a single transposition a_i ↔ a_j, i < j, then, for any c ∈ R^n, c^T y ≤ c^T x iff c_p ≤ c_q, where p and q are the positions occupied by a_i and a_j in x, respectively (further referred to as Rule1).
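As a small illustration of the H-presentation (5), the following Python sketch (ours; the function name in_permutohedron and its arguments are illustrative, not from the paper) checks (5) for a given point by enumerating all proper subsets of J_n, so it is only meant for small n.

from itertools import combinations

def in_permutohedron(x, A, tol=1e-9):
    """Check the H-presentation (5) of the permutohedron P_n(A) for a point x,
    assuming the generators A are sorted increasingly (a_1 < ... < a_n)."""
    n = len(A)
    if abs(sum(x) - sum(A)) > tol:                 # the single equality in (5)
        return False
    for k in range(1, n):
        bound = sum(A[:k])                         # sum of the k smallest generators
        for omega in combinations(range(n), k):    # all proper subsets of size k
            if sum(x[i] for i in omega) < bound - tol:
                return False
    return True

# Every permutation of A is a vertex of P_n(A):
print(in_permutohedron((3, 1, 2), (1, 2, 3)))      # True
print(in_permutohedron((0, 2, 4), (1, 2, 3)))      # False: the first coordinate violates (5)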
Let us consider the following versions of PB-COP: find a solution of the permutation-based versions of COP1/COP2 with the linear objective function from (2),
E = E_n(A);
f_1(x) = f(x) − z_0 ≤ 0;  f_2(x) = −f(x) + z_0 ≤ 0,    (6)
where z_0 ∈ R^1 (further referred to as PB-COP1/PB-COP2).


Note that (6) can be rewritten as follows:

f (x) = z0 ,

wherefrom PB-COP1 and PB-COP2 are permutation-based feasibility problems of finding a point x_0 in E' or the whole set E', respectively.
PB-COP1 and PB-COP2 are both at least as hard as NP-complete problems since the subset sum problem – given a real set A of n elements, is there an m-element (m < n) subset of A whose sum is z_0 – is a particular case of PB-COP1 with c_1 = · · · = c_{n−m} = 0, c_{n−m+1} = · · · = c_n = 1.

3 The Horizontal Method for PB-COP2


Let us introduce a horizontal method for solving PB-COP2 (PB-COP2.HM),
which is based on applying the Branch and Bound paradigm to this feasibility
problem.
To solve PB-COP2, a search tree with root E = E_n(A) and leaves {E_3(A')}_{A'⊂A, |A'|=3} is built. It uses GTA and the listed properties of E_n(A) and G_n(A). In particular, it applies the decomposition (4) recursively, fixing the last coordinates in the E_n(A)-subsets forming the tree nodes; the explicit solution (2) of a linear COP; the vertex locality of E; the adjacency criterion; Rule1; and so on.
Let us introduce some notation. cd = (cd_i)_{i∈J_{l_cd}} is a code of the object [.] ∈ {E, G, 𝒢}, which is a partial l_cd-permutation from A; l_cd ∈ J^0_{n−1} is the length of the code (J^0_m = J_m ∪ {0}). cd defines the values of the l_cd consecutive last positions of [.](cd).
E(∅) = E_n(A), G(∅) = G_n(A), and 𝒢(∅) = 𝒢_n(A) is a grid-graph, which will be defined later.
Branching of [.](cd) is based on (4) and performed according to the rule
branch([.](cd)) = {[.](cd^i)}_{i∈J_{n−l_cd}},
where cd^i = (a_i(cd), cd), i ∈ J_{n−l_cd}, A(cd) = {a_i(cd)}_{i∈J_{n−l_cd}} = A \ {cd_i}_{i∈J_{l_cd}}, and a_i(cd) ≤ a_{i+1}(cd), i ∈ J_{n−l_cd−1}.
Estimates are found with respect to (2), taking into account the already fixed coordinates, namely,
lb([.](cd)) = z_min(cd) / ub([.](cd)) = z_max(cd)
is a lower/upper bound on the branch [.](cd), where z_min(cd) = f(y_min(cd)), z_max(cd) = f(y_max(cd)).
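As an illustration, the following minimal Python sketch (ours; the function and argument names are illustrative) computes z_min(cd) and z_max(cd) for a branch code cd directly from (2), assuming c_1 ≤ · · · ≤ c_n and pairwise distinct elements of A.

def branch_bounds(c, A, cd):
    """Return (z_min(cd), z_max(cd)) of f(x) = c.x over E(cd).

    c  -- cost coefficients in non-decreasing order (c[0] <= ... <= c[n-1])
    A  -- the generating set a_1 < ... < a_n, pairwise distinct
    cd -- tuple of values already fixed on the last len(cd) positions
    """
    n, l = len(c), len(cd)
    free = sorted(set(A) - set(cd))                        # A(cd): the unfixed elements
    fixed = sum(ci * v for ci, v in zip(c[n - l:], cd))
    # By (2): free elements in ascending order maximize c.x, in descending order minimize it.
    z_max = fixed + sum(ci * v for ci, v in zip(c[:n - l], free))
    z_min = fixed + sum(ci * v for ci, v in zip(c[:n - l], reversed(free)))
    return z_min, z_max

For the example of Sect. 4 (c = (2, 3, 4, 6, 7, 8), A = J_6), branch_bounds returns (83, 127) for cd = ∅ and (83, 109) for cd = (1), consistent with the bounds reported there.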
G(cd) is a skeleton graph of conv(E(cd)); 𝒢(cd) is a directed grid-graph shown in Fig. 1. 𝒢(cd) has 2(n − l_cd) nodes, two of which have already been examined, namely the top-left and bottom-right ones: z_max(cd^{n−l_cd}) = z_max(cd), z_min(cd^1) = z_min(cd). In the terminology of [8], 𝒢(cd) is a two-dimensional structural graph of PB-COP2.

Fig. 1. The grid-graph 𝒢(cd)
Fig. 2. The graph 𝒢(6, 4, 2)
Fig. 3. The grid-graph 𝒢(∅)

Pruning of branches:

– if z_0 > z_max(cd) or z_0 < z_min(cd), then prune E(cd) (rule PB1);
– if z_0 = z_max(cd), then find X_max(cd), update X∗: X∗ = X∗ ∪ Y_max(cd), where Y_max(cd) = {(x, cd)}_{x∈X_max(cd)}, and prune E(cd) (rule PB2);
– if z_0 = z_min(cd), then find X_min(cd), update X∗: X∗ = X∗ ∪ Y_min(cd), where Y_min(cd) = {(x, cd)}_{x∈X_min(cd)}, and prune E(cd) (rule PB3).

Fig. 4. The grid-graph 𝒢(2)

Fig. 5. The grid-graph 𝒢(4, 2)

By construction, in a column of the grid 𝒢(cd), consecutive nodes differ by an adjacent transposition, which enforces the following (with respect to Rule1):
z_max(cd^{n−l_cd}) ≥ z_max(cd^{n−l_cd−1}) ≥ · · · ≥ z_max(cd^1);
z_min(cd^{n−l_cd}) ≥ z_min(cd^{n−l_cd−1}) ≥ · · · ≥ z_min(cd^1);
z_min(cd^i) ≤ z_max(cd^i), i ∈ J_{n−l_cd};

– if i ∈ J_{n−l_cd−1} and z_0 > z_max(cd^i), then prune E(cd^i), . . . , E(cd^1);
– if i ∈ J_{n−l_cd}\{1} and z_0 < z_min(cd^i), then prune E(cd^i), . . . , E(cd^{n−l_cd}).
Remark 1. If PB-COP1 needs to be solved, the PB-COP2.HM-scheme is used
until a first admissible solution is found.
This version of PB-COP2.HM is directly generalized from E_n(A) to E_{nk}(A). The only difference is that (4) becomes ∀i ∈ J_n: E = ∪_{j∈J_k} E_ij, where E_ij = E ∩ H_ij, i ∈ J_n, is a set in the class E_{n−1,k(B_j)}(B_j), j ∈ J_k.
Another generalization of PB-COP2 concerns considering
f_1(x) = f(x) − z_0 ≤ 0;  f_2(x) = −f(x) + z_0 − Δ ≤ 0,

Fig. 6. The grid-graph 𝒢(5, 2)

Fig. 7. The grid-graph 𝒢(6, 2)

where Δ ≥ 0, instead of (6). Here, only minor modifications of the estimate rules are required.
Simultaneously with forming X∗, a permutation-based COP of optimizing ϕ : E → R^1 can be solved, both single-objective and multiobjective [6–8,21,22].

4 PB-COP2 Example
Solve PB-COP2 with n = 6, c = (2, 3, 4, 6, 7, 8), A = J_6, z_0 = 109.
The coefficients of c are pairwise distinct; therefore, by (2)–(3), rules PB2 and PB3 simplify to:
– if z_0 = z_max(cd), then X∗ = X∗ ∪ {(x_max(cd), cd)} (rule PB2’);
– if z_0 = z_min(cd), then X∗ = X∗ ∪ {(x_min(cd), cd)} (rule PB3’).
Step 1. cd = ∅, X∗ = ∅, l_cd = 0.
x_min(cd) = x_min = (6, 5, 4, 3, 2, 1), x_max(cd) = x_max = (1, 2, 3, 4, 5, 6),
z_min(cd) = 83 < z_0 = 109 < z_max(cd) = 127. The branch E(cd) is not discarded. branch(E(cd)) = {E(i)}_{i∈J_6}. The graph 𝒢(∅) is depicted in Fig. 3 with E(6) on top and E(1) on bottom, together with the bounds lb(E(i)), ub(E(i)), i ∈ J_n.

ub(E(1)) = 109, hence, by PB2’, X∗ = X∗ ∪ {(x_max(1), 1)} = {(2, 3, 4, 5, 6, 1)}, and the branch E(1) is discarded.
Step 2. Explore E(2). cd = (2), l_cd = 1, branch(E(cd)) = {E(i, 2)}_{i∈A(cd)}, where A(cd) = A\{2}. 𝒢(cd) = 𝒢(2) is shown in Fig. 4. It is seen that the branches E(1, 2), E(3, 2) are pruned by PB1.
Step 3. Explore consecutively E(4, 2), E(5, 2), E(6, 2). Here, l_cd = 2, cd ∈ {(4, 2), (5, 2), (6, 2)}. Branching is performed into n − l_cd = 4 branches (see the graphs 𝒢(cd) in Figs. 5–7).
Step 3.a. In E(4, 2), prune the branches E(1, 4, 2), E(3, 4, 2) by PB1. By PB2’, X∗ = X∗ ∪ {(x_max(5, 4, 2), (5, 4, 2))} = X∗ ∪ {(1, 3, 6, 5, 4, 2)}, and the branch E(5, 4, 2) is discarded (see Fig. 5). The branch E(6, 4, 2) is explored by analyzing the four vertices of 𝒢(6, 4, 2) shown in Fig. 2. As a result, two new feasible points x_1(6, 4, 2) = (3, 1, 5) and x_2(6, 4, 2) = (1, 5, 3) are found, therefore X∗ = X∗ ∪ {(x_i(cd), cd)}_{i=1,2} = X∗ ∪ {(3, 1, 5, 6, 4, 2), (1, 5, 3, 6, 4, 2)}.
Step 3.b. In E(5, 2), prune branches E(3, 5, 2), E(1, 5, 2) by PB1 (see Fig. 6).
E(4, 5, 2), E(6, 5, 2) are analyzed similarly to E(6, 4, 2). As a result, one more
feasible solution is found and X ∗ = X ∗ ∪ {(3, 4, 1, 6, 5, 2)}.
Step 3.c. In E(6, 2), prune the branch E(1, 6, 2) by PB1. By PB3’, X∗ = X∗ ∪ {(x_min(5, 6, 2), (5, 6, 2))} = X∗ ∪ {(4, 3, 1, 5, 6, 2)}, and the branch E(5, 6, 2) is discarded (see Fig. 7). Exploring E(4, 6, 2), E(3, 6, 2) like E(6, 4, 2), we get that X∗ is complemented by a new admissible solution, namely, X∗ = X∗ ∪ {(1, 5, 4, 3, 6, 2)}.
By now, a third of E has been examined. For that, about 50 points out of |E| = 720 were analyzed, and 7 elements of E' were found. The set E' contains 26 points, implying that about a third of E' has been found. The same proportion holds for the remaining branches E(3)–E(6). As a result, around p = 20% of the points of E are analyzed to obtain E'.
PB-COP2.HM was implemented for PB-COP2 instances of dimensions up to 200. Numerical results demonstrated that the percentage p decreases as n increases. Also, p decreases as z_0 moves away from the middle value (z_min + z_max)/2 toward z_min or z_max.
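To make the example reproducible, here is a compact Python sketch of the scheme (ours, not the authors' implementation). It uses rule PB1, the simplified rules PB2' and PB3' (so it assumes pairwise distinct, increasingly sorted coefficients and pairwise distinct elements of A), the branching (4), and direct enumeration at leaves with three free elements.

from itertools import permutations

def horizontal_search(c, A, z0):
    """Sketch of PB-COP2.HM: enumerate X* = {x in E_n(A) : c.x = z0}."""
    n = len(c)
    solutions = []

    def explore(cd):                                   # cd: values fixed on the last |cd| positions
        free = sorted(set(A) - set(cd))
        fixed = sum(ci * v for ci, v in zip(c[n - len(cd):], cd))
        z_max = fixed + sum(ci * v for ci, v in zip(c, free))
        z_min = fixed + sum(ci * v for ci, v in zip(c, reversed(free)))
        if z0 > z_max or z0 < z_min:                   # rule PB1
            return
        if z0 == z_max:                                # rule PB2': x_max(cd) is the only solution here
            solutions.append(tuple(free) + tuple(cd))
            return
        if z0 == z_min:                                # rule PB3'
            solutions.append(tuple(reversed(free)) + tuple(cd))
            return
        if len(free) <= 3:                             # leaf E_3(A'): enumerate the few remaining points
            for p in permutations(free):
                if sum(ci * v for ci, v in zip(c, p + tuple(cd))) == z0:
                    solutions.append(p + tuple(cd))
            return
        for a in free:                                 # branching by (4): fix one more last coordinate
            explore((a,) + tuple(cd))

    explore(())
    return solutions

X_star = horizontal_search(c=(2, 3, 4, 6, 7, 8), A=range(1, 7), z0=109)

On the data above this enumerates the set E' (reported by the authors to contain 26 points); the first element found, (2, 3, 4, 5, 6, 1), is the solution obtained in Step 1.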

5 Conclusion

Complex extremal and feasibility combinatorial problems on the set of permutations En(A) have been investigated by means of its embedding into Euclidean space and by associating with it, and utilizing, the permutation polytope, the permutation graph, and the grid graph. For the problem PB-SSP, the horizontal method for localizing the values of a linear objective function (PB-COP2.HM) is developed, and directions of its generalization to a wide class of permutation-based problems, formalized as linear combinatorial programs, are outlined. PB-COP2.HM is supported by an example and illustrations.

References
1. Donec, G.A., Kolechkina, L.M.: Construction of Hamiltonian paths in graphs of
permutation polyhedra. Cybern. Syst. Anal. 46(1), 7–13 (2010). https://doi.org/
10.1007/s10559-010-9178-1
2. Donec, G.A., Kolechkina, L.M.: Extremal Problems on Combinatorial Configura-
tions. RVV PUET, Poltava (2011)
3. Donets, G.A., Kolechkina, L.N.: Method of ordering the values of a linear function
on a set of permutations. Cybern. Syst. Anal. 45(2), 204–213 (2009). https://doi.
org/10.1007/s10559-009-9092-6
4. Gimadi, E., Khachay, M.: Extremal Problems on Sets of Permutations. Ural Federal
University, Yekaterinburg (2016). [in Russian]
5. Kellerer, H., Pferschy, U., Pisinger, D.: Knapsack Problems. Springer, Berlin, New
York (2010)
6. Koliechkina, L.M., Dvirna, O.A.: Solving extremum problems with linear fractional
objective functions on the combinatorial configuration of permutations under mul-
ticriteriality. Cybern. Syst. Anal. 53(4), 590–599 (2017). https://doi.org/10.1007/
s10559-017-9961-3
7. Koliechkina, L.N., Dvernaya, O.A., Nagornaya, A.N.: Modified coordinate method
to solve multicriteria optimization problems on combinatorial configurations.
Cybern. Syst. Anal. 50(4), 620–626 (2014). https://doi.org/10.1007/s10559-014-
9650-4
8. Koliechkina, L., Pichugina, O.: Multiobjective Optimization on Permutations with
Applications. DEStech Trans. Comput. Sci. Eng. Supplementary Volume OPTIMA
2018, 61–75 (2018). https://doi.org/10.12783/dtcse/optim2018/27922
9. Kozin, I.V., Maksyshko, N.K., Perepelitsa, V.A.: Fragmentary structures in dis-
crete optimization problems. Cybern. Syst. Anal. 53(6), 931–936 (2017). https://
doi.org/10.1007/s10559-017-9995-6
10. Korte, B., Vygen, J.: Combinatorial Optimization: Theory and Algorithms.
Springer, New York (2018)
11. Lengauer, T.: Combinatorial Algorithms for Integrated Circuit Layout.
Vieweg+Teubner Verlag (1990)
12. Martello, S., Toth, P.: Knapsack Problems: Algorithms and Computer Implemen-
tations. Wiley, Chichester, New York (1990)
13. Mehdi, M.: Parallel Hybrid Optimization Methods for permutation based problems
(2011). https://tel.archives-ouvertes.fr/tel-00841962/document
14. Pichugina, O.: Placement problems in chip design: Modeling and optimization. In:
2017 4th International Scientific-Practical Conference Problems of Infocommuni-
cations. Science and Technology (PIC S&T). pp. 465–473 (2017). https://doi.org/
10.1109/INFOCOMMST.2017.8246440
15. Pichugina, O., Farzad, B.: A human communication network model. In: CEUR
Workshop Proceedings, pp. 33–40. KNU, Kyiv (2016)
16. Pichugina, O., Yakovlev, S.: Convex extensions and continuous functional repre-
sentations in optimization, with their applications. J. Coupled Syst. Multiscale
Dyn. 4(2), 129–152 (2016). https://doi.org/10.1166/jcsmd.2016.1103
17. Pichugina, O.S., Yakovlev, S.V.: Functional and analytic representations of the
general permutation. East. Eur. J. Enterp. Technol. 79(4), 27–38 (2016). https://
doi.org/10.15587/1729-4061.2016.58550
18. Pichugina, O.S., Yakovlev, S.V.: Continuous representations and functional exten-
sions in combinatorial optimization. Cybern. Syst. Anal. 52(6), 921–930 (2016).
https://doi.org/10.1007/s10559-016-9894-2

19. Pichugina, O., Yakovlev, S.: Optimization on polyhedral-spherical sets: Theory


and applications. In: 2017 IEEE 1st Ukraine Conference on Electrical and Com-
puter Engineering, UKRCON 2017-Proceedings, pp. 1167–1174. KPI, Kiev (2017).
https://doi.org/10.1109/UKRCON.2017.8100436
20. Schrijver, A.: Combinatorial Optimization: Polyhedra and Efficiency. Springer,
Berlin, New York (2003)
21. Semenova, N.V., Kolechkina, L.M., Nagirna, A.M.: Multicriteria lexicographic opti-
mization problems on a fuzzy set of alternatives. Dopov. Nats. Akad. Nauk Ukr.
Mat. Prirodozn. Tekh. Nauki. (6), 42–51 (2010)
22. Semenova, N.V., Kolechkina, L.N., Nagornaya, A.N.: On an approach to the solu-
tion of vector problems with linear-fractional criterion functions on a combinatorial
set of arrangements. Problemy Upravlen. Inform. 1, 131–144 (2010)
23. Sergienko, I.V., Kaspshitskaya, M.F.: Models and Methods for Computer Solu-
tion of Combinatorial Optimization Problems. Naukova Dumka, Kyiv (1981). [in
Russian]
24. Sergienko, I.V., Shilo, V.P.: Discrete Optimization Problems: Challenges. Methods
of Solution and Analysis. Naukova Dumka, Kyiv (2003). [in Russian]
25. Stoyan, Y.G., Yakovlev, S.V.: Mathematical Models and Optimization Methods of
Geometrical Design. Naukova Dumka, Kyiv (1986). [in Russian]
26. Stoyan, Y.G., Yakovlev, S.V., Pichugina O.S.: The Euclidean Combinatorial Con-
figurations: A Monograph. Constanta (2017). [in Russian]
27. Stoyan, Y.G., Yemets, O.O.: Theory and Methods of Euclidean Combinatorial
Optimization. ISSE, Kyiv (1993). [in Ukrainian]
28. Yakovlev, S.: Convex Extensions in Combinatorial Optimization and Their Appli-
cations. Optim. Methods Appl. 567–584. Springer, Cham (2017). https://doi.org/
10.1007/978-3-319-68640-0 27
29. Yakovlev, S.V., Grebennik, I.V.: Localization of solutions of some problems of
nonlinear integer optimization. Cybern. Syst. Anal. 29(5), 727–734 (1993). https://
doi.org/10.1007/BF01125802
30. Yakovlev, S.V., Pichugina, O.S.: Properties of combinatorial optimization problems
over polyhedral-spherical sets. Cybern. Syst. Anal. 54(1), 99–109 (2018). https://
doi.org/10.1007/s10559-018-0011-6
31. Yakovlev, S., Pichugina, O., Yarovaya, O.: On optimization problems on the
polyhedral-spherical configurations with their properties. In: 2018 IEEE First Inter-
national Conference on System Analysis Intelligent Computing (SAIC), pp. 94–100
(2018). https://doi.org/10.1109/SAIC.2018.8516801
32. Yakovlev, S.V., Pichugina, O.S., Yarovaya, O.V.: Polyhedral spherical configura-
tion in discrete optimization. J. of Autom. Inf. Sci. 51, 38–50 (2019)
33. Yakovlev, S., Pichugina, O., Yarovaya, O.: Polyhedral spherical configuration in
discrete optimization. J. of Autom. Inf. Sci. 51(1), 38–50 (2019)
34. Yakovlev, S.V., Valuiskaya, O.A.: Optimization of linear functions at the vertices of
a permutation polyhedron with additional linear constraints. Ukr. Math. J. 53(9),
1535–1545 (2001). https://doi.org/10.1023/A:1014374926840
35. Yemelichev, V.A., Kovalev, M.M., Kravtsov, M.K.: Polytopes. Graphs and Opti-
misation. Cambridge University Press, Cambridge (1984)
36. Ziegler, G.M.: Lectures on Polytopes. Springer, New York (1995)
An Experimental Comparison of Heuristic
Coloring Algorithms in Terms of Found Color
Classes on Random Graphs

Deniss Kumlander(&) and Aleksei Kulitškov

Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn, Estonia


{deniss.kumlander,aleksei.kulitskov}@ttu.ee

Abstract. Graph coloring and finding the maximum clique in an undirected graph (shortly, MCP) are well-known and closely related graph theory problems. Vertex coloring is usually an initial step before searching for the maximum clique of a graph. The maximum clique problem is NP-hard, which means that no algorithm is known that solves this kind of problem in polynomial time. Maximum clique algorithms heavily employ heuristic vertex coloring algorithms to find bounds and estimates. One class of such algorithms executes the coloring only in the first stage, so those algorithms are less concerned with the performance of the heuristic and more with the discovered colors. Researchers always face the question of which heuristic vertex coloring algorithm should be selected to improve the performance of the core algorithm. Here we give insights into existing heuristic vertex coloring algorithms and compare their ability to find color classes: 17 coloring algorithms are described and tested on random graphs.

Keywords: Graph theory · Vertex coloring · Heuristic

1 Introduction

Let G = (V, E) be an undirected graph. Then, V is a finite set of elements called vertices and E is a finite set of unordered pairs of vertices, called edges. The cardinality of the set of vertices, i.e. the number of its elements, is called the order of the graph and is denoted by n = |V|. The cardinality of the set of edges is called the size of the graph and is denoted by m = |E| [1]. If vi and vj are vertices of the same graph and there is a relationship between them that forms an edge, then these vertices are adjacent. The degree of a vertex v in graph G is the number of edges incident to it [1], or, in other words, the number of its neighbors. The maximum degree of a graph is the degree of a vertex with the most neighbors; the minimum degree is the degree of a vertex with the fewest neighbors. Usually, the degree of a vertex is denoted by deg(v). Density is the ratio of the number of edges in graph G to the maximum possible number of edges; it is denoted by g(G). A graph is the complement of graph G if it has the same vertices as G and any


two vertices in it are adjacent only if the same vertices are nonadjacent in the original graph. A simple graph is an undirected graph with finite sets of vertices and edges, which has no loops or multiple edges. A subgraph G' = (V', E') is a subset of the vertices of graph G with the corresponding edges, but not all possible edges need to be included: if vertices vi and vj are adjacent in graph G, it may happen that in a subgraph of G there is no edge between them; V' ⊆ V, E' ⊆ E. An induced subgraph G' = (V', E') is a subset of the vertices of graph G with all their corresponding edges: G[V'] = (V' ⊆ V, E' = {(vi, vj) | i ≠ j, (vi, vj) ∈ E, vi, vj ∈ V'}). A complete subgraph G' = (V', E') is a subset of the vertices of graph G with all their corresponding edges, in which each pair of vertices is connected by an edge. A clique is a complete subgraph of graph G. A clique V' in graph G is called maximal if there does not exist any other clique V'' such that V' ⊂ V''. The size of the largest (maximum) clique in graph G is called the clique number [1]. An independent set (IS) of a graph G is any subset of vertices V' ⊆ V whose vertices are pairwise nonadjacent. So, it is not hard to conclude that for any clique in graph G there is an independent set in the complement graph G' and vice versa. A coloring is an assignment of colors to the vertices of a graph according to the algorithm's construction. If we have an undirected graph G = (V, E), then the process of color assignment must follow the rules below:

• (vi, vj) ∈ E, i ≠ j;
• c(vi) ≠ c(vj), i ≠ j.

A color class is a subset of vertices that share a certain color. The chromatic number of a graph G is the smallest number of colors needed to make a proper coloring of graph G [2]; in other words, it is the smallest number k for which there exists a k-coloring of graph G [1]. It is usually denoted by χ(G). An algorithm is considered heuristic if it finds an approximate solution to the problem in acceptable time. A situation when vertices have the same saturation degree is called a tie. The maximal independent set (MIS) problem is to find a subset of vertices that are pairwise nonadjacent or, in other words, such that no vertex in this set is connected to another vertex of the set. An IS is called maximal only if there are no vertices that could be added to it without ruining its structure. The MIS problem is considered to be of NP-hard complexity. The graph coloring problem, or GCP, is to find the least possible number of colors for coloring a particular graph, meaning that any two vertices that share an edge must be colored differently. GCP is considered to be an NP-complete problem. It has a lot in common with the MIS problem, because all the vertices that share the same color, i.e. are in one color class, form an independent set.
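To fix these definitions, here is a minimal Python sketch (ours; the function name and data layout are illustrative) computing them for a graph stored as an adjacency dict. Density is taken as the fraction of the n(n−1)/2 possible edges, which matches the 10%–90% densities used in Sect. 3.

def graph_stats(adj):
    """Basic quantities for a simple undirected graph given as an adjacency dict
    (vertex -> set of neighbors)."""
    n = len(adj)                                       # order n = |V|
    m = sum(len(nb) for nb in adj.values()) // 2       # size m = |E|
    deg = {v: len(adj[v]) for v in adj}
    return {
        "order": n,
        "size": m,
        "max_degree": max(deg.values()),
        "min_degree": min(deg.values()),
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "complement": {v: set(adj) - adj[v] - {v} for v in adj},
    }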

2 Coloring Algorithms

Many algorithms have been developed to solve the graph coloring problem heuristically, but Greedy remains the basic algorithm for assigning colors in a graph. It provides a relatively good solution in a small amount of time. The order in which the algorithm colors the vertices plays a major role in the process and heavily affects the quality of the coloring. Therefore, there are many algorithms that employ different ordering heuristics to determine the order before coloring the vertices. These algorithms are mostly based on Greedy but use additional vertex ordering to achieve better performance. As a rule, they surpass Greedy in the number of colors used, producing better results but taking more time to complete (a minimal sketch of this scheme is given after the list below). The most popular ordering heuristics are:
• First-Fit ordering – the most primitive ordering there is. It assigns each vertex the lowest possible color. This technique is the fastest among the ordering heuristics.
• Degree-based ordering – uses a certain criterion to order the vertices and then chooses the right one to color. It uses a lot more time compared to First-Fit ordering but produces much better results in terms of the number of used colors. There are many different degree ordering heuristics; the most popular among them are:
a. Random: colors the vertices of a graph in random order or according to random
degree function, i.e. random unique numbers given to every vertex;
b. Largest-First: colors the vertices of a graph in order of decreasing degree, i.e. it
takes into account the number of neighbors of each vertex;
c. Smallest-Last: repeatedly assigns weights to the vertices of a graph with the
smallest degree, and removes them from the graph, then colors the vertices
according to their weights in decreasing order [4];
d. Incidence: sequentially colors the vertices of a graph according to the highest
number of colored neighbors;
e. Saturation: iteratively colors the vertices of a graph by the largest number of
distinctly colored neighbors;
f. Mixed/Combined: uses a combination of known ordering heuristics. For
example, saturation degree ordering combined with largest first ordering, which
is used only to solve situations, when there is a tie, i.e. saturation degree of some
vertices is the same.
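The following minimal Python sketch (ours, not the authors' implementation) shows the common scheme just described: greedy color assignment driven by a pluggable vertex ordering, here Largest-First.

def greedy_coloring(adj, order):
    """Color an undirected graph given as an adjacency dict (vertex -> set of
    neighbors), processing vertices in the given order.
    Returns a dict vertex -> color, where colors are 0, 1, 2, ..."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}   # colors of already colored neighbors
        c = 0
        while c in used:                                  # smallest color not used by a neighbor
            c += 1
        color[v] = c
    return color

def largest_first_order(adj):
    """Largest-First ordering: vertices sorted by decreasing degree."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)

# Example: a 4-cycle with one chord; this graph needs 3 colors.
adj = {1: {2, 4}, 2: {1, 3, 4}, 3: {2, 4}, 4: {1, 2, 3}}
print(greedy_coloring(adj, largest_first_order(adj)))     # uses 3 colors on this graph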

Sequential algorithms tend to do a lot of tasks that could have been executed simul-
taneously. That is why many popular algorithms have their parallel versions.

2.1 Sequential Algorithms

1. Greedy – classical algorithm introduced by Welsh and Powell in 1967 [3]. It iterates
over the vertices in a graph and assigns each vertex a smallest possible color, which
is not assigned to any adjacent vertex, i.e. no neighbor must share the same color.
2. Largest-First - Welsh and Powell also suggested an ordering for the greedy algo-
rithm called largest first. It is based on vertices’ degrees. The algorithm orders the

vertices according to the number of neighbors that each of them has and then starts
with the greedy coloring.
3. Largest-First V2 - It is a slightly modified version of Largest-First algorithm. In this
algorithm more than one vertex could be colored in each iteration, i.e. after coloring
the vertex with the largest number of neighbors, the algorithm also assigns the same
color to all the vertices, which follow the rules of coloring - no adjacent vertices
must share the same color, and, finally, it removes these vertices from the graph.
4. Largest-First V3 - Based on the second version we made a third edition of the
Largest-First algorithm. The main idea of the algorithm is the same as in V2,
however, this time there will be a reordering of vertices in each iteration, meaning
that if the vertex is removed from the graph, then its neighbor’s degree is decreased.
5. DSatur - This heuristic algorithm was developed by Daniel Brelaz in 1979 [5]. Its core idea is to order the vertices by their saturation degrees. If a tie occurs, the vertex with the largest number of uncolored neighbors is chosen. By assigning colors to a vertex with the largest number of distinctly colored neighbors, DSatur minimizes the possibility of setting an incorrect color [2] (a minimal sketch of this selection rule is given after this list).
6. DSatur V2 - another interesting version of DSatur [6]. At first, it finds a largest clique of the graph and assigns each of its vertices a distinct color. Then, it removes the newly colored vertices from the graph. After this procedure, the algorithm executes as the previous DSatur. Equivalently, the greedy algorithm takes the complement graph, finds the largest independent set there, colors its vertices with distinct colors, removes these vertices from the graph, and starts working as the first version of DSatur.
7. Incidence degree ordering (IDO) - This ordering was first introduced by Daniel Brelaz [5] and modified by Coleman and More in their work [7]. In a word, it is a modification of the DSatur algorithm. The main principle of this heuristic is to order vertices by the decreasing number of their colored neighbors. If a tie occurs, the vertex to be chosen can be decided by the use of random numbers. The coloring itself is done by the Greedy algorithm.
8. MinMax - The MinMax algorithm was introduced by Hilal Almara’Beh and Amjad
Suleiman in their work in 2012 [8]. The main function of this algorithm is to find
the maximum independent set, but it could be used for coloring purposes as well
because independent sets are color classes.
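A minimal Python sketch (ours) of the DSatur selection rule from item 5 above; ties are broken by the number of uncolored neighbors, as described there.

def dsatur(adj):
    """DSatur-style coloring: repeatedly color the uncolored vertex with the
    largest saturation degree (number of distinct colors among its neighbors),
    breaking ties by the number of uncolored neighbors."""
    color = {}
    neighbor_colors = {v: set() for v in adj}          # colors seen around each vertex
    while len(color) < len(adj):
        v = max((u for u in adj if u not in color),
                key=lambda u: (len(neighbor_colors[u]),
                               sum(1 for w in adj[u] if w not in color)))
        c = 0
        while c in neighbor_colors[v]:                 # smallest feasible color
            c += 1
        color[v] = c
        for w in adj[v]:                               # update the saturation of the neighbors
            neighbor_colors[w].add(c)
    return color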

2.2 Mixed/Combined Algorithms

1. IDO-LDO - This algorithm is a combination of incidence degree ordering and


largest-first ordering heuristics. As a primary heuristic we use IDO. If a tie occurs,
then it will be decided, which vertex is going to be taken, by the largest number of
neighbors.
2. IDO-LDO-Random - another modified IDO algorithm. This time the random
numbers’ function was added to decide in a situation of a tie. At first, the algorithm
orders the vertices by the largest number of colored neighbors, then by the largest
number of neighbors and then, if there are two or more vertices with the exact same
details, the one with the largest random number is chosen.

3. LDO-IDO - This modification was introduced by Dr. Hussein Al-Omari and Khair Eddin Sabri in their work in 2006 [9]. The basic heuristic for this algorithm is Largest-First. If a tie occurs, the IDO heuristic decides which vertex to take. On the whole, this is almost the same algorithm as Largest-First V3 with an IDO function inside: the first ordering is done by the largest number of neighbors and then by the largest number of colored neighbors.
4. DSatur-LDO - This modification of the DSatur algorithm was also introduced by
Dr. Hussein Al-Omari and Khair Eddin Sabri in their work in 2006 [9]. The
algorithm works as DSatur but if a tie occurs, then Largest-First algorithm steps into
the action to solve the conflict. According to the results, this heuristic works a little
better than the original DSatur within the same amount of time.
5. DSatur-IDO-LDO - In this algorithm ties are resolved by Incidence Degree
Ordering at first, then the remaining ties are resolved by the Largest Degree
Ordering [10].

2.3 Parallel Algorithms

1. Jones and Plassmann algorithm - The algorithm was first proposed by Jones and Plassmann in their work in 1993 [11] and is based on Luby's parallel algorithm [12]. The core idea is to construct a unique set of weights at the beginning, for example random numbers, that is used throughout the algorithm; any conflict between equal random numbers is resolved by the vertex number. In each iteration, the JP algorithm finds an independent set of the graph, i.e. all the vertices whose weight is higher than the weights of their neighboring vertices, and then assigns colors to these vertices using the Greedy algorithm. Every action is done in parallel (a minimal sketch of this scheme is given at the end of this subsection).
2. Jones and Plassmann V2 - Another version of JP algorithm was introduced by
William Hasenplaugh, Tim Kaler, Tao B. Schardl and Charles E. Leiserson in their
work in 2014 [4]. The idea behind the modification was to use recursion. The
algorithm orders the vertices in the order of function p, which generates random
numbers. It starts by partitioning the neighbors of each vertex into predecessors (the
vertices with larger priorities) and successors (the vertices with lower priorities) [4].
If there are no vertices in predecessors, then the algorithm begins coloring. It has a
helper function named JpColor, which uses recursion to color the vertices. The
color is chosen by collecting all the colors from the predecessors and choosing the
smallest possible (this is done in the GetColor helper function). When a vertex with an empty predecessors list is colored, the algorithm scans this vertex's successors list for vertices whose predecessor counter has dropped to zero and starts coloring them. All of this is done in parallel subtasks.
3. Parallel Largest-First - The JP algorithm is used as the base for Parallel Largest-First, but with Largest-First as the heuristic. The main difference is that the weight system used in JP is replaced by the largest degree of each vertex; however, random numbers are not removed.
4. Parallel Smallest-Last - The Smallest-Last heuristic was first introduced by Matula in his work in 1972 [13]. The SL heuristic's system of weights is more sophisticated and complex. The algorithm uses two phases [14]: a weighting phase and a coloring phase. The weighting phase begins by finding the vertices that correspond to the current smallest degree in the graph. These vertices are assigned the current weight and removed from the graph, and the degrees of all neighbors of the deleted vertices are decreased. All these steps are repeated until every vertex receives its weight.
5. Non-Parallel Implementations - These algorithms include:
• Greedy From Parallel – non-parallel copy of Jones and Plassmann algorithm;
• Greedy V2 From Parallel – non-parallel copy of Jones and Plassmann V2
algorithm;
• Largest-First From Parallel – non-parallel copy of Parallel Largest-First
algorithm;
• Smallest-Last From Parallel – non-parallel copy of Parallel Smallest-Last.
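A round-based Python simulation of the Jones and Plassmann scheme from item 1 (ours; in a real implementation the body of the inner loop runs in parallel). Vertex identifiers are assumed to be comparable so that they can break ties between equal random weights.

import random

def jones_plassmann(adj, seed=0):
    """Round-based simulation of the Jones-Plassmann scheme: each vertex gets a
    random weight; in every round the vertices whose weight beats all uncolored
    neighbors form an independent set and are colored greedily."""
    rng = random.Random(seed)
    weight = {v: (rng.random(), v) for v in adj}       # random weight, vertex id breaks ties
    color, uncolored = {}, set(adj)
    while uncolored:
        independent = [v for v in uncolored
                       if all(weight[v] > weight[u] for u in adj[v] if u in uncolored)]
        for v in independent:                          # independent vertices: no coloring conflicts
            used = {color[u] for u in adj[v] if u in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        uncolored -= set(independent)
    return color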

3 Tests and Results

In this part we are going to conduct tests to determine the most acceptable coloring
algorithms that could be used later for the maximum clique algorithms. In this study we
focus on the number of used colors by density of the graph. The random graphs are
generated as described by Kumlander in 2005 [2].

3.1 Sequential Algorithms


As can be seen from the charts, every algorithm performs better than the Greedy algorithm in terms of the number of used colors at almost every density. However, at 40% density, for example, the performance of the IDO and MinMax algorithms is similar to that of the Greedy algorithm. It is also worth mentioning that in some particular cases the Largest-First and Largest-First V2 algorithms used the same number of colors as the Greedy algorithm while taking more time to achieve this goal (Fig. 1).

[Chart: number of used colors vs. number of vertices for Greedy, LargestFirst, LargestFirstV2, LargestFirstV3, DSatur and DSaturV2.]
Fig. 1. Randomly generated graphs tests' results compared in used colors. Sequential algorithms, density 10%.

[Chart: number of used colors vs. number of vertices for Greedy, LargestFirst, LargestFirstV2, LargestFirstV3, DSatur and DSaturV2.]
Fig. 2. Randomly generated graphs tests' results compared in used colors. Sequential algorithms, density 50%.

[Chart: number of used colors vs. number of vertices for Greedy, LargestFirst, LargestFirstV2, LargestFirstV3, DSatur and DSaturV2.]
Fig. 3. Randomly generated graphs tests' results compared in used colors. Sequential algorithms, density 90%.

The best results in terms of used colors among all the sequential algorithms were produced by DSatur, DSatur V2 and Largest-First V3. Their performance is much better compared to the Greedy algorithm; however, it comes at the cost of taking more time to complete (Figs. 2 and 3).

3.2 Combined Algorithms


At first sight, it seems that the results of the combined algorithms are very similar to the sequential ones. There are again three leading algorithms, which this time are DSatur-LDO, DSatur-IDO-LDO and LDO-IDO. Their results are much better than the Greedy ones. The higher the density, the more similar the performance of the algorithms becomes. At 10% to 70% density we can see a clear division between these three algorithms and the rest. However, at 80% density and higher the difference begins to vanish, although the lead in the number of used colors remains (Fig. 4).

[Chart: number of used colors vs. number of vertices for Greedy, IdoLdo, IdoLdoRandom, LdoIdo, DSaturLdo and DSaturIdoLdo.]
Fig. 4. Randomly generated graphs tests' results compared in used colors. Combined algorithms, density 10%.

Furthermore, it is possible to clearly see a very strange behavior of the IDO-LDO-Random algorithm: it used more colors than the Greedy algorithm in some cases. This behavior might be caused by the fact that random numbers are used during the execution of IDO-LDO-Random and should be investigated in separate research (Figs. 5 and 6).

[Chart: number of used colors vs. number of vertices for Greedy, IdoLdo, IdoLdoRandom, LdoIdo, DSaturLdo and DSaturIdoLdo.]
Fig. 5. Randomly generated graphs tests' results compared in used colors. Combined algorithms, density 50%.

When it comes to consumed time, LDO-IDO clearly wins among these three, although its time is still larger than that of the Greedy algorithm.

3.3 Parallel Algorithms


It can be seen from the charts that Parallel Largest-First prevails in almost every situation. Along with Parallel Largest-First, it is necessary to mention the Parallel Smallest-Last algorithm; it shows promising results only at higher densities, using almost the same number of colors and, at 90% density, even outperforming the Parallel Largest-First algorithm. Parallel Jones and Plassmann and its second version perform very similarly to the Greedy algorithm, using fewer or more colors compared to Greedy (Figs. 7, 8 and 9).

[Chart: number of used colors vs. number of vertices for Greedy, IdoLdo, IdoLdoRandom, LdoIdo, DSaturLdo and DSaturIdoLdo.]
Fig. 6. Randomly generated graphs tests' results compared in used colors. Combined algorithms, density 90%.

[Chart: number of used colors vs. number of vertices for Greedy, ParallelJp, ParallelJpV2, ParallelLargestFirst, ParallelSmallestLast, GreedyFromParallel and GreedyV2FromParallel.]
Fig. 7. Randomly generated graphs tests' results compared in used colors. Parallel algorithms, density 10%.

[Chart: number of used colors vs. number of vertices for Greedy, ParallelJp, ParallelJpV2, ParallelLargestFirst, ParallelSmallestLast, GreedyFromParallel and GreedyV2FromParallel.]
Fig. 8. Randomly generated graphs tests' results compared in used colors. Parallel algorithms, density 50%.

[Chart: number of used colors vs. number of vertices for Greedy, ParallelJp, ParallelJpV2, ParallelLargestFirst, ParallelSmallestLast, GreedyFromParallel and GreedyV2FromParallel.]
Fig. 9. Randomly generated graphs tests' results compared in used colors. Parallel algorithms, density 90%.

In terms of time used to complete the task, Parallel Smallest-Last demonstrates the
worst results. The performance of Parallel Largest-First is not far away from Parallel
Smallest-Last algorithm. The only thing that should be noted is the fact that on 30%,
50% and 80% density Parallel Largest-First algorithm’s execution time is very similar
to Parallel JP despite the fact that it uses largest first ordering.

4 Conclusion

The following algorithms showed the best results within their group in terms of the number of used colors:
• Among sequential: DSatur, DSatur V2 and Largest-First V3;
• Among combined: DSatur-LDO, DSatur-IDO-LDO and LDO-IDO;
• Among parallel: Parallel Largest-First and Parallel Smallest-Last.

References
1. Kubale, M.: Graph Colorings. American Mathematical Society, US (2004)
2. Kumlander, D.: Some practical algorithms to solve the maximum clique problem. Tallinn
University of Technology, Tallinn (2005)
3. Welsh, D.J.A., Powell, M.B.: An upper bound for the chromatic number of a graph and its
application to timetabling problems. Comput. J. 10(1), 85–86 (1967)
4. Hasenplaugh, W., Kaler, T., Schardl, T.B., Leiserson, C.E.: Ordering heuristics for parallel
graph coloring. In: Proceedings of the 26th ACM Symposium on Parallelism in Algorithms
and Architectures–SPAA’14, pp. 166–177 (2014)
5. Brelaz, D.: New methods to color the vertices of a graph. Commun. ACM 22(4), 251–256
(1979)
6. Andrews, P.S., Timmis, J., Owens, N.D.L., Aickelin, U., Hart, E., Hone, A., Tyrrell, A.M.:
Artificial Immune Systems. York, UK (2009)

7. Coleman, T.F., More, J.J.: Estimation of sparse Jacobian matrices and graph coloring
problems. SIAM J. Numer. Anal. 20, 187–209 (1983)
8. Almarabeh, H., Suleiman, A.: Heuristic algorithm for graph coloring based on maximum
independent set. J. Appl. Comput. Sci. Math. 6(13), 9–18 (2012)
9. Al-Omari, H., Sabri, K.E.: New graph coloring algorithms. J. Math. Stat. 2(4), 439–441
(2006)
10. Saha, S., Baboo, G., Kumar, R.: An efficient EA with multipoint guided crossover for bi-
objective graph coloring problem. In: Contemporary Computing: 4th International
Conference-IC3 2011, pp. 135–145 (2011)
11. Jones, M.T., Plassmann, P.E.: A parallel graph coloring heuristic. SIAM J. Sci. Comput. 14
(3), 654–669 (1993)
12. Luby, M.: A simple parallel algorithm for the maximal independent set problem.
SIAM J. Comput. 15(4), 1036–1053 (1986)
13. Matula, D.W., Marble, G., Isaacson, J.D.: Graph coloring algorithms. Academic Press, New
York (1972)
14. Allwright, J.R., Bordawekar, R., Coddington, P.D., Dincer, K., Martin, C.L.: A comparison
of parallel graph coloring algorithms. Technical Report SCCS-666 (1995)
Cliques for Multi-Term Linearization of
0–1 Multilinear Program for Boolean
Logical Pattern Generation

Kedong Yan1 and Hong Seo Ryoo2(B)


1 Department of Computer Science and Technology, School of Computer Science
and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei,
Xuanwu District, Nanjing 210094, Jiangsu, People’s Republic of China
yan@njust.edu.cn
2 School of Industrial Management Engineering, Korea University, 145 Anam-Ro,
Seongbuk-Gu, Seoul 02841, Republic of Korea
hsryoo@korea.ac.kr

Abstract. 0–1 multilinear program (MP) holds a unifying theory to


Boolean logical pattern generation. For a tighter polyhedral relaxation
of MP, this note exploits cliques in the graph representation of data
under analysis to generate valid inequalities for MP that subsume all
previous results and, collectively, provide a much stronger relaxation of
MP. A preliminary numerical study demonstrates strength and practical
benefits of the new results.

Keywords: Logical analysis of data · Pattern · 0–1 multilinear programming · 0–1 polyhedral relaxation · Graph · Clique

1 Introduction and Background


Logical Analysis of Data (LAD) is a combinatorial optimization-based super-
vised learning methodology, and the key and bottleneck step in LAD is pattern
generation where a set of features and their negations are optimally combined
together to form knowledge/rule that distinguishes one type of data/observations
from the other(s). Without loss of generality, we consider the analysis of two
types of + and − data and denote by S • the index set of • type of data for
• ∈ {+, −}. Let S = S + ∪ S − . We assume S is duplicate and contradiction free
(such that S + ∩ S − = ∅) and that the data under analysis are described by n
Boolean attributes aj , j ∈ N := {1, . . . , n}. We let an+j = ¬aj for j ∈ N and
1 This work was supported by National Natural Science Foundation of China (Grant Number: 61806095).
2 Corresponding author. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant Number: 2017R1D1A1A02018729).

let N' := {n + 1, . . . , 2n} and 𝒩 := N ∪ N'. Finally, for each data Ai, i ∈ S, we denote by Aij the j-th attribute value of the data, so that Aij = 1 − Ai,n+j for j ∈ N and Aij = 1 − Ai,j−n for j ∈ N'. Last, since + and − patterns are symmetric in definition, we present (most of) the material below in the context of + pattern generation for convenience, without loss of generality.
To build a mathematical model for pattern generation, we introduce 0–1 indicator variables xj for j ∈ 𝒩 and let

xj = 1 if attribute aj is involved in a pattern, and xj = 0 otherwise.

For i ∈ S, we let
Ji := {j ∈ 𝒩 | Aij = 0}.
Since the dataset is duplicate and contradiction free, all Ji's are unique and |Ji| = n, ∀i ∈ S.
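For illustration, a minimal Python sketch (ours; the function name and data layout are illustrative) that extends 0–1 observations with their negated attributes and computes the sets J_i:

def literal_matrix_and_J(A_plus, A_minus):
    """Given 0-1 data rows for the + and - classes over n Boolean attributes,
    extend each row with the negated attributes (columns n+1..2n) and return,
    for every observation i, the index set J_i = {j : A_ij = 0} and its label."""
    S = [(row, '+') for row in A_plus] + [(row, '-') for row in A_minus]
    J, labels = [], []
    for row, label in S:
        full = list(row) + [1 - a for a in row]            # a_{n+j} = not a_j
        J.append({j for j, a in enumerate(full, start=1) if a == 0})
        labels.append(label)
    return J, labels

# Tiny example with n = 3 attributes.
J, labels = literal_matrix_and_J(A_plus=[(1, 0, 1)], A_minus=[(0, 0, 1)])
print(J[0])   # the + observation has zeros in columns 2, 4 and 6, so |J_0| = n = 3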
In [15], we showed that the 0–1 MP below holds a unifying theory to LAD
pattern generation:
  
(PG) : max { ϕ+(x) + l(x) : ϕ−(x) = 0, x ∈ {0, 1}^{2n} },

where l(x) is a linear function and

ϕ•(x) = Σ_{i∈S•} Π_{j∈Ji} (1 − xj)

for • ∈ {+, −}.


It is well-known that the constraint of (PG) is equivalent to a set of minimal
cover inequalities [8]:
  
(1 − xj ) = 0 ⇐⇒ xj ≥ 1, i ∈ S −
i∈S − j∈Ji j∈Ji

The minimal cover inequalities provide a poor linear programming (LP) relaxation bound, however. For a 0–1 linear overestimation of ϕ+, McCormick concave envelopes for a 0–1 monomial can serve the purpose (e.g., [2,10,12,14]). This 'standard' method achieves the goal by means of introducing m+ (where m+ = |S+|) variables

yi = Π_{j∈Ji} (1 − xj),  i ∈ S+    (1)

and n × m+ inequalities

yi ≤ 1 − xj , j ∈ Ji , i ∈ S + (2)

to the formulation of a 0–1 linear relaxation of (PG). Alternatively, one may aggregate the constraints in (2) with respect to j to concavify ϕ+ by the m+ valid inequalities (e.g., [13])

n yi + Σ_{j∈Ji} xj ≤ n,  i ∈ S+,

or aggregate them with respect to i via standard probing techniques and logical implications in integer programming (e.g., [5–7]) to

Σ_{i∈Ij+} yi ≤ |Ij+| (1 − xj),  j ∈ 𝒩,

where Ij+ := {i ∈ S+ | j ∈ Ji} for j ∈ 𝒩. Being aggregation-based, these each provide a weaker relaxation of ϕ+ than the McCormick relaxation, but they can prove useful in data mining applications (e.g., see [16]).
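As an illustration of how (1)–(2) and the minimal cover inequalities combine into a 0–1 linear model, here is a minimal sketch using the PuLP modeling library (any MILP modeling tool would do; the paper's experiments use CPLEX [1]). The function and variable names are ours, and J_plus/J_minus are the index sets produced as in the earlier snippet.

import pulp

def build_pg_mccormick(J_plus, J_minus, n):
    """Sketch of the 0-1 linear relaxation of (PG) for + patterns: maximize the
    number of covered + observations subject to the McCormick inequalities (2)
    and the minimal cover inequalities for the - observations."""
    prob = pulp.LpProblem("pattern_generation", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", range(1, 2 * n + 1), cat="Binary")
    y = pulp.LpVariable.dicts("y", range(len(J_plus)), lowBound=0, upBound=1)
    prob += pulp.lpSum(y.values())                       # objective: sum of y_i over S+
    for i, Ji in enumerate(J_plus):
        for j in Ji:                                     # McCormick: y_i <= 1 - x_j, j in J_i
            prob += y[i] <= 1 - x[j]
    for Ji in J_minus:                                   # minimal cover: sum_{j in J_i} x_j >= 1
        prob += pulp.lpSum(x[j] for j in Ji) >= 1
    return prob, x, y

Replacing the inner McCormick loop by a single inequality of type (7) per clique of a vertex maximal clique cover yields the clique-based variant compared in Sect. 3.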
As for ϕ+, [17] represented the + and − data under analysis as vertices in a graph and introduced an edge between each pair of + and − vertices that are 1 Hamming distance apart. The analysis of the resulting graph revealed that each star of degree d (≥2) in the graph generates n + d valid inequalities for (PG) that dominate the n × d McCormick inequalities from the d leaf nodes (that is, + data) of the star. Furthermore, we showed that a set of 'neighboring' stars further reduces the number of linear inequalities in the polyhedral relaxation of (PG) while strengthening it, and demonstrated cases when these inequalities are facet-defining of the 0–1 multilinear polytope associated with the McCormick inequalities that they replace.
Recently, [3] studied the facial structure of the polyhedral convex hull of
a 0–1 multilinear function via a hypergraph representation of the multilinear
function and proposed several lifting techniques for generating facet-defining
inequalities. The number of inequalities generated can be exponential in the
number of multilinear terms and/or variables, however. Plus, the hypergraph
representation of (PG) is never free of Berge or γ-cycles. This implies that results
from [4] cannot be (directly) utilized for a tight polyhedral convexification of
(PG). In short, we note (at least for now) that recent mathematical discoveries
from [3,4] do not constitute a viable means for solving (PG).
In this note, we enhance the approach in [17] to study a tight, multi-term
relaxation of ϕ+ by virtue of an improved, more effective graph representation
of data for analysis. More specifically, we discover a new useful ‘neighborhood’
property among data (with respect to generating stronger valid inequalities)
that introduces edges in the graph representation of data in a manner that
allows for the generation of a valid inequality from each maximal clique of the
graph. Once entangled, a set of neighboring data in a maximal clique allows one
to replace the corresponding McCormick inequalities by a single, much stronger
valid inequality. This gives rise to polyhedrally relaxing (PG) by means of a much
smaller number of stronger valid inequalities, in comparison with the methods from the aforementioned references. With regard to the results from [17], the new results in this note subsume our earlier results and thus yield a tighter polyhedral relaxation of (PG) in terms of a smaller number of 0–1 linear inequalities.
As for organization, this note consists of three parts. Following this section of introduction and background, we present the main results in Sect. 2 and follow them by a preliminary numerical study. The numerical study compares our new relaxation method against the McCormick relaxation method in pattern generation experiments with six machine learning benchmark datasets; recall that McCormick provides the strongest lower bound when compared to the two alternatives in [5–7,13]. In short, the performance of the new results is far superior and demonstrates their practical utility in data mining applications well.

2 Main Results

Definition 1. A clique is a simple graph in which every pair of vertices has an


edge. A clique of size k is called a k-clique, where the size means the number of
vertices in a clique. A maximal clique is a clique that is not included in a larger
clique while a maximum clique in a graph is a clique with the largest number of
vertices.

Definition 2. A vertex clique cover is a set of cliques whose union covers all vertices of a graph. A vertex maximal clique cover is a vertex clique cover in which each clique is maximal.

Definition 3. If πx ≤ π0 and μx ≤ μ0 are two valid inequalities for a polytope in the nonnegative orthant, πx ≤ π0 dominates μx ≤ μ0 if there exists u > 0 such that uμ ≤ π and π0 ≤ uμ0, and (π, π0) ≠ (uμ, uμ0).

Finally, let

I_PG := { x ∈ {0, 1}^{2n}, y ∈ [0, 1]^{m+} | (1), ϕ−(x) = 0 }.

For the variables x corresponding to a pair of original and negated Boolean attributes, the requirement of a valid logical statement gives the following complementarity relation, which is of great importance in deriving stronger valid inequalities for I_PG.

Proposition 1 (Proposition 1 in [16]). For j ∈ 𝒩, let j^c = n + j if j ∈ N and j^c = j − n if j ∈ N'. Then, the complementarity cut
x_j + x_{j^c} ≤ 1,    (3)
is valid for I_PG.

In this section, we are interested in generating valid inequalities for I_PG of the form

Σ_{i∈I} yi ≤ 1 − xj,  I ⊂ S+,    (4)

with respect to a specific j ∈ 𝒩. When |I| = 1, (4) simply reduces to a McCormick inequality. It is easy to see that as I grows larger, (4) becomes stronger. Thus, for a tighter polyhedral relaxation of I_PG, we wish to extend I as much as possible while keeping (4) valid. For this purpose, we examine the 0–1 features (variables x) to identify a maximal I. First, we consider a pair of observations in Ij+.

Lemma 1. For j ∈ 𝒩, suppose there are i, k ∈ Ij+ such that j ∈ J := Ji ∩ Jk and J ⊂ J_ℓ for some ℓ ∈ Ij− := {i ∈ S− | j ∈ Ji}. Then, the following inequality

yi + yk ≤ 1 − xj    (5)

is valid for I_PG.

Proof. For A_ℓ, ℓ ∈ Ij−, we have

Π_{ι∈J_ℓ} (1 − x_ι) = 0,    (6)

which needs to be satisfied. Since j ∈ J_ℓ, if xj = 1, (6) is satisfied and we also have yi = yk = 0. Therefore

yi + yk = 0 ≤ 1 − xj.

On the other hand, suppose xj = 0 and consider the variables with indices in J_ℓ \ {j}. It is easy to see that J_ℓ \ {j} = (J_ℓ \ J) ∪ (J \ {j}). If x_ι = 1 for any ι ∈ J \ {j}, then (6) is satisfied and yi = yk = 0. That is,

yi + yk = 0 < 1 − xj.

Assume x_ι = 0, ∀ι ∈ J \ {j}; to satisfy (6), there must exist ι ∈ J_ℓ \ J such that x_ι = 1. Note that

𝒩 = J ∪ J^c ∪ (Ji \ J) ∪ (Jk \ J),

where J^c := {j^c | j ∈ J}. One can see that ι ∉ J^c, since otherwise ι^c ∈ J ⊂ J_ℓ, which contradicts ι ∈ J_ℓ. This indicates either ι ∈ Ji \ J or ι ∈ Jk \ J. Without loss of generality, assume ι ∈ Ji \ J; then yi ≤ 1 − x_ι = 0, and yk ≤ 1 − x_{ι^c} = 1 via (3). Thus

yi + yk ≤ 1 = 1 − xj.

This completes the proof. □

To fully extend the result above, we represent the data under analysis in a graph, as done in [16,17]. The difference in this graph representation is that, while each observation in Ij+ maps to a unique node in the graph, we now introduce an edge between a pair of vertices if the pair satisfies the condition set forth in Lemma 1. The resulting undirected graph is denoted by G_j^+. Now, we have the following result for each clique of G_j^+.

Theorem 1. For j ∈ 𝒩, consider a clique in G_j^+ that contains a set Ω (Ω ⊆ Ij+, |Ω| ≥ 2) of observations. Then, the following inequality

Σ_{i∈Ω} yi ≤ 1 − xj    (7)

is valid for I_PG. Furthermore, the inequality above dominates Σ_{i∈Ω'} yi ≤ 1 − xj for any Ω' ⊂ Ω.

Proof. From the way G_j^+ is created, we have yi + yk ≤ 1 − xj for each pair i, k ∈ Ω via Lemma 1. So if xj = 1, one has yi = 0, ∀i ∈ Ω. Thus

Σ_{i∈Ω} yi = 0 ≤ 1 − xj.

On the other hand, suppose xj = 0; then yi + yk ≤ 1 for each pair i, k ∈ Ω. One can easily verify that at most one yi, i ∈ Ω, can take value 1. That is,

Σ_{i∈Ω} yi ≤ 1 = 1 − xj.

The dominance result is straightforward since (7) becomes stronger as Ω expands. □
To examine the strength of (7), we let

Ξ := { x ∈ {0, 1}^{2n}, y ∈ [0, 1]^{|Ω ∪ I^+_{j^c}|} : (3); Σ_{j∈Ji} xj ≥ 1, i ∈ Ij−; yi ≤ 1 − xj, j ∈ Ji, i ∈ Ω ∪ I^+_{j^c} }.

Theorem 2. For a clique with vertex set Ω in Theorem 1, (7) defines a facet of conv(Ξ).

Proof. First note that (7) is valid via the proof of Theorem 1. Suppose that (7) is not facet-defining. Then, there exists a facet-defining inequality of conv(Ξ) of the form

Σ_{j∈𝒩} αj xj + Σ_{i∈Ω∪I^+_{j^c}} βi yi ≤ γ,    (8)

where (α, β) ≠ (0, 0), such that (7) defines a face of the facet of conv(Ξ) defined by (8). That is:

F := { (x, y) ∈ Ξ : xj + Σ_{i∈Ω} yi = 1 }  ⊆  F' := { (x, y) ∈ Ξ : Σ_{j∈𝒩} αj xj + Σ_{i∈Ω∪I^+_{j^c}} βi yi = γ }.

Consider the following two cases for the solutions in F'.

Case 1. (xj = 1) In this case, x_{j^c} = 0 and yi = 0, ∀i ∈ Ω. Since j ∈ Ji, ∀i ∈ Ij−, a solution with xj = 1 satisfies all the minimal cover inequalities defining Ξ. Such a solution with x_ι = 0, ∀ι ∈ 𝒩 \ {j, j^c}, and yi = 0, ∀i ∈ Ω ∪ I^+_{j^c}, belongs to F and hence to F', which yields αj = γ. For this solution, we can set x_ι = 1 for any ι ∈ 𝒩 \ {j, j^c}, which yields αj + α_ι = γ. Therefore, we have:
αj = γ and α_ι = 0, ∀ι ∈ 𝒩 \ {j, j^c}.
Furthermore, note that j ∉ Ji for i ∈ I^+_{j^c} and that a pattern exists for a contradiction-free dataset. This implies that there exists a 0–1 vector x that yields yi = 1 for each i ∈ I^+_{j^c}, which yields αj + βi = γ, i ∈ I^+_{j^c}, thus

βi = 0, ∀i ∈ I^+_{j^c}.
Case 2. (xj = 0) By the same argument that a pattern exists for a contradiction-free dataset, we have βi = γ, i ∈ Ω, and α_{j^c} + βi = γ, i ∈ Ω, for solutions with x_{j^c} = 0 and x_{j^c} = 1, respectively. These yield
α_{j^c} = 0 and βi = γ, ∀i ∈ Ω.
Summarizing, the two cases above show
α_ι = 0, ∀ι ∈ 𝒩 \ {j}, βi = 0, ∀i ∈ I^+_{j^c}, and αj = βi = γ > 0, ∀i ∈ Ω,
where γ > 0 is from our supposition that (7) is dominated by (8). This shows that (8) is a positive multiple of (7) and completes the proof. □
We close this section with two remarks.

Remark 1. The last statement of Theorem 1 implies that only the maximal cliques of G_j^+ need to be considered. As finding all maximal cliques is time-consuming (e.g., [11]), we recommend instead using a vertex maximal clique cover of G_j^+.
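One simple greedy way to build a vertex maximal clique cover is sketched below in Python (ours; not necessarily the procedure used in the experiments). For G_j^+, the vertices are the observations in I_j^+ and adjacent(i, k) tests the condition of Lemma 1.

def vertex_maximal_clique_cover(vertices, adjacent):
    """Greedy construction of a vertex maximal clique cover: a set of maximal
    cliques whose union covers all vertices. `adjacent(u, v)` tests whether
    u and v share an edge."""
    uncovered = set(vertices)
    cover = []
    while uncovered:
        v = uncovered.pop()
        clique = {v}
        # grow the clique greedily; once no vertex can be added, it is maximal
        for u in vertices:
            if u not in clique and all(adjacent(u, w) for w in clique):
                clique.add(u)
        cover.append(clique)
        uncovered -= clique
    return cover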

Remark 2. We wish to add that two main results – namely, Theorems 1 and 3 –
in [17] are subsumed by Theorem 1 above. Specifically, via Theorem 1, we obtain a set of inequalities that, in large part, dominate those yielded by the aforementioned results from [17], along with a small number of common ones. (We have theorems
and proofs for this but omit those here for reasons of space with respect to the
10 page limitation for the conference proceedings.) This helps greatly improve
the overall efficiency of LAD pattern generation via a more effective multi-term
relaxation of (PG) and its solution, as demonstrated in the following section.

3 A Preliminary Experiment
In this preliminary study, we compare the new results of the previous section
against the standard, McCormick relaxation method. We used six machine learn-
ing benchmark datasets from [9] for this experiment.

For notational simplicity, for • ∈ {+, −}, we denote by •̄ the complementary element of • with respect to the set {+, −}. For generating • patterns by the two methods (for ϕ^•) compared, we used the minimal cover inequalities for linearizing ϕ^{•̄} of (PG). The resulting 0–1 equivalent of (PG) by † ∈ {mccormick, cliques}, where mccormick and cliques denote the McCormick relaxation and the new results in this note, respectively, takes the following form:

(PG)^•_† :  maximize Σ_{i∈S^•} y_i  over x ∈ {0,1}^{2n}, 0 ≤ y ≤ 1,
            subject to (OBJ)^•_†,  Σ_{j∈J_i} x_j ≥ 1, i ∈ S^{•̄}.

For experiments, we implemented a simple pattern generation procedure


below:

procedure pg† (for † ∈ {mccormick,cliques})


1: for • ∈ {+, −} do
2: if † = mccormick then
3: obtain (OBJ)•† via (2)
4: else
5: obtain a graph representation G• of data in S • .
6: for j ∈ N do
7: retrieve subgraph G•j from G• .
8: find a vertex maximal clique cover for G•j .
9: obtain (OBJ)•† via (7).
10: end for
11: end if
12: solve (PG)•† via CPLEX.
13: end for

All MILP instances generated were solved by CPLEX [1] with all solver-generated cuts disabled. For the choice of metric for comparison, we
adopted CPU time for solution and the root relaxation gap, defined as the dif-
ference between root node relaxation value and the optimum. We remind the
reader that the latter is a fair metric in that this value is least affected by an
arsenal of solution options and heuristics featured in a powerful MILP solver
such as CPLEX.
First, Table 1 provides the root node relaxation gap values in the format 'average ± 1 standard deviation' of 30 results for each dataset, followed by the minimum and the maximum values in parentheses. The numbers in the last column
measure the improvement in the root relaxation gap made by cliques, in com-
parison to mccormick.
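As a small illustration (ours, not code from the paper) of the † metric defined in the footnotes of Tables 1 and 2, the per-instance relative improvement can be computed as below and then averaged over the 30 instances of a dataset.

def relative_improvement(value_mccormick, value_cliques):
    # dagger column of Tables 1-2: improvement of cliques over mccormick,
    # normalized by the worse (larger) of the two values, in percent
    return 100.0 * (value_mccormick - value_cliques) / max(value_mccormick, value_cliques)

print(relative_improvement(82.6, 5.0))   # roughly the 'bupa +' row of Table 1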
Table 2 provides the CPU seconds for solving the instances of (PG)•mccormick
and (PG)•cliques in format ‘average ± 1 standard deviation’ of 30 results. We
note that the time for (PG)•cliques includes the time spent in finding graph rep-
resentation of data and vertex maximal clique covers. Again, the last column of

Table 1. Root relaxation gap

Dataset • (PG)•mccormick (PG)•cliques †


bupa + 82.6 ± 3.1 (75.2, 89.3) 5.0 ± 6.5 (0.0, 25.6) 94.1
− 85.0 ± 2.1 (78.7, 88.2) 5.6 ± 4.9 (0.0, 15.4) 93.5
clev + 66.2 ± 6.1 (53.6, 77.0) 3.9 ± 4.3 (0.0, 12.3) 94.4
− 71.6 ± 4.7 (63.0, 79.4) 2.2 ± 2.9 (0.0, 8.7) 97.1
cred + 70.2 ± 4.5 (58.0, 78.9) 5.3 ± 5.1 (0.0, 21.1) 92.7
− 75.3 ± 4.5 (64.1, 81.8) 7.7 ± 5.8 (0.0, 20.1) 90.1
diab + 88.5 ± 1.8 (81.9, 91.4) 16.6 ± 7.4 (0.0, 31.4) 81.3
− 80.4 ± 2.9 (73.9, 86.6) 3.4 ± 4.3 (0.0, 16.6) 95.9
hous + 65.1 ± 6.7 (52.1, 76.9) 2.2 ± 2.7 (0.0, 8.5) 96.7
− 74.2 ± 3.7 (66.1, 79.2) 3.2 ± 3.7 (0.0, 11.9) 95.8
wisc + 59.0 ± 5.2 (51.2, 70.4) 1.5 ± 2.5 (0.0, 9.1) 97.5
− 53.9 ± 4.8 (44.4, 65.3) 1.1 ± 2.3 (0.0, 8.2) 97.9
(PG)^•_mccormick and (PG)^•_cliques are solved without utilizing CPLEX cuts.
All results are in the format 'average ± 1 standard deviation (min, max)'.
†: Measures relative efficacy of (PG)^•_cliques over (PG)^•_mccormick:
[Gap by (PG)^•_mccormick − Gap by (PG)^•_cliques] / [worse of the 2 results], averaged over the 30 values.

Table 2. CPU seconds

Dataset • (PG)•mccormick (PG)•cliques †


bupa + 0.38 ± 0.10 (0.24, 0.61) 0.04 ± 0.04 (0.00, 0.11) 82.88
− 0.73 ± 0.11 (0.44, 0.97) 0.10 ± 0.08 (0.00, 0.24) 80.97
clev + 0.07 ± 0.02 (0.03, 0.10) 0.02 ± 0.01 (0.00, 0.04) 69.26
− 0.10 ± 0.06 (0.04, 0.34) 0.01 ± 0.01 (0.00, 0.04) 77.82
cred + 1.28 ± 0.57 (0.45, 2.39) 0.25 ± 0.22 (0.00, 1.09) 66.79
− 0.96 ± 0.40 (0.39, 2.09) 0.26 ± 0.23 (0.01, 1.10) 61.96
diab + 3.13 ± 0.88 (1.49, 5.08) 0.69 ± 0.39 (0.01, 1.50) 73.39
− 4.93 ± 1.89 (2.68, 12.27) 0.40 ± 0.41 (0.01, 1.80) 85.89
hous + 0.16 ± 0.05 (0.08, 0.31) 0.03 ± 0.03 (0.00, 0.11) 71.61
− 0.24 ± 0.13 (0.10, 0.66) 0.04 ± 0.04 (0.00, 0.14) 73.81
wisc + 0.06 ± 0.02 (0.02, 0.09) 0.01 ± 0.01 (0.00, 0.03) 81.02
− 0.02 ± 0.01 (0.01, 0.03) 0.00 ± 0.01 (0.00, 0.02) 76.67
(PG)^•_mccormick and (PG)^•_cliques are solved without utilizing CPLEX cuts.
Time for (PG)^•_cliques includes time for creating graphs and finding cliques.
†: Measures relative efficacy of (PG)^•_cliques over (PG)^•_mccormick:
[Time for (PG)^•_mccormick − Time for (PG)^•_cliques] / [worse of the 2 results], averaged over the 30 values.

the table provides a measure of the overall improvement in efficiency of pattern


generation by the results of this note.
Briefly summarizing, the numbers in these tables show that the new results of this note provide a tight polyhedral relaxation of (PG) and help achieve a great deal of improvement in the efficiency of pattern generation for LAD. Noting that pattern generation is notorious for being the bottleneck operation in Boolean logical analysis of data, the comparative results in the two tables above demonstrate well the practical utility of the new mathematical results of this note.
Before closing, we refer interested readers to Tables 5 and 6 in [17] to note
that new results of this note not only subsume their predecessors but also provide
a much tighter polyhedral relaxation of (PG) and, as a result, help in much faster
solution of the difficult-to-solve MILP pattern generation instances for Boolean
logical analysis of data.

References
1. IBM Corp.: IBM ILOG CPLEX Optimization Studio CPLEX User’s Manual
Version 12 Release 8 (2017). https://www.ibm.com/support/knowledgecenter/
SSSA5P 12.8.0/ilog.odms.studio.help/pdf/usrcplex.pdf. Accessed 12 Dec 2018
2. Crama, Y.: Concave extensions for nonlinear 0–1 maximization problems. Math.
Program. 61, 53–60 (1993)
3. Del Pia, A., Khajavirad, A.: A polyhedral study of binary polynomial programs.
Math. Oper. Res. 42(2), 389–410 (2017)
4. Del Pia, A., Khajavirad, A.: The multilinear polytope for acyclic hypergraphs.
SIAM J. Optim. 28(2), 1049–1076 (2018)
5. Fortet, R.: L'algèbre de Boole et ses applications en recherche opérationnelle. Cahiers du Centre d'Études de Recherche Opérationnelle 1(4), 5–36 (1959)
6. Fortet, R.: Applications de l'algèbre de Boole en recherche opérationnelle. Revue Française d'Informatique et de Recherche Opérationnelle 4(14), 17–25 (1960)
7. Glover, F., Woolsey, E.: Converting the 0–1 polynomial programming problem to
a 0–1 linear program. Oper. Res. 12(1), 180–182 (1974)
8. Granot, F., Hammer, P.: On the use of boolean functions in 0–1 programming.
Methods Oper. Res. 12, 154–184 (1971)
9. Lichman, M.: UCI Machine Learning Repository (2013). http://archive.ics.uci.
edu/ml. Accessed 12 Dec 2018
10. McCormick, G.: Computability of global solutions to factorable nonconvex pro-
grams: part I-convex underestimating problems. Math. Program. 10, 147–175
(1976)
11. Moon, J.W., Moser, L.: On cliques in graphs. Isr. J. Math. 3(1), 23–28 (1965)
12. Rikun, A.: A convex envelope formula for multilinear functions. J. Glob. Optim.
10, 425–437 (1997)
13. Ryoo, H.S., Jang, I.Y.: MILP approach to pattern generation in logical analysis of
data. Discret. Appl. Math. 157, 749–761 (2009)
14. Ryoo, H.S., Sahinidis, N.: Analysis of bounds for multilinear functions. J. Glob.
Optim. 19(4), 403–424 (2001)
15. Yan, K., Ryoo, H.S.: 0–1 multilinear programming as a unifying theory for LAD
pattern generation. Discret. Appl. Math. 218, 21–39 (2017)

16. Yan, K., Ryoo, H.S.: Strong valid inequalities for Boolean logical pattern genera-
tion. J. Glob. Optim. 69(1), 183–230 (2017)
17. Yan, K., Ryoo, H.S.: A multi-term, polyhedral relaxation of a 0–1 multilinear function for Boolean logical pattern generation. J. Glob. Optim. https://doi.org/10.1007/s10898-018-0680-8 (in press)
Gaining or Losing Perspective

Jon Lee1(B) , Daphne Skipper2 , and Emily Speakman3


1
University of Michigan, Ann Arbor, MI, USA
jonxlee@umich.edu
2
U.S. Naval Academy, Annapolis, MD, USA
skipper@usna.edu
3
Otto-von-Guericke-Universität, Magdeburg, Germany
emily.speakman@ovgu.de

Abstract. We study MINLO (mixed-integer nonlinear optimization)


formulations of the disjunction x ∈ {0} ∪ [l, u], where z is a binary indi-
cator of x ∈ [l, u], and y “captures” xp , for p > 1. This model is useful
when activities have operating ranges, we pay a fixed cost for carrying
out each activity, and costs on the levels of activities are strictly convex.
One well-known concrete application (with p = 2) is mean-variance opti-
mization (in the style of Markowitz).
Using volume as a measure to compare convex bodies, we investigate a
family of relaxations for this model, employing the inequality yz q ≥ xp ,
parameterized by the “lifting exponent” q ∈ [0, p − 1]. These models
are higher-dimensional-power-cone representable, and hence tractable in
theory. We analytically determine the behavior of these relaxations as
functions of l, u, p and q. We validate our results computationally, for
the case of p = 2. Furthermore, for p = 2, we obtain results on asymp-
totic behavior and on optimal branching-point selection.

Keywords: Mixed-integer nonlinear optimization · Volume · Integer ·


Relaxation · Polytope · Perspective · Higher-dimensional power cone

Introduction

Background. Our interest is in studying “perspective reformulations”. This


technique has been used in the presence of indicator variables: when an indicator
is “off”, a vector of decision variables is forced to a specific point, and when it
is “on”, the vector of decision variables must belong to a specific convex set. [6]
studied such a situation where binary variables manage terms in a separable-
quadratic objective function, with each continuous variable x being either 0 or in
a positive interval (also see [4]). The perspective-reformulation approach (see [6]

J. Lee was supported in part by ONR grant N00014-17-1-2296 and LIX, l’École Poly-
technique. D. Skipper was supported in part by ONR grant N00014-18-W-X00709.
E. Speakman was supported by the Deutsche Forschungsgemeinschaft (DFG, German
Research Foundation) - 314838170, GRK 2297 MathCoRe.
c Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 387–397, 2020.
https://doi.org/10.1007/978-3-030-21803-4_39

and the references therein) leads to very strong conic-programming relaxations,


but not all MINLO (mixed-integer nonlinear optimization) solvers are equipped
to handle these. So one of our interests is in determining when a natural and
simpler non conic-programming relaxation may be adequate.
Generally, our view is that MINLO modelers and algorithm/software devel-
opers can usefully factor in analytic comparisons of relaxations in their work.
d-dimensional volume is a natural analytic measure for comparing the size of
a pair of convex bodies in Rd . [8] introduced the idea of using volume as a
measure for comparing relaxations (for fixed-charge, vertex packing, and other
relaxations). [10–13] extended the idea to relaxations of graphs of trilinear mono-
mials on box domains. Following up on work of [7,9,14] compared relaxations of
graphical Boolean-quadric relaxations. [2,3] use volume cut off as a measure for
the strength of cuts.
Our view of the current relevant convex-MINLO software environment is
that it is very unsettled with a lot to come. One of the best algorithmic options
for convex-MINLO is “outer approximation”, but this is not usually appropri-
ate when constraint functions are not convex (even when the feasible region of
the continuous relaxation is a convex set). Even “NLP-based B&B” for convex-
MINLO may not be appropriate when the underlying NLP solver is presented
with a formulation where a constraint qualification does not hold at likely
optima. In some situations (like ours), the relevant convex sets can be repre-
sented as convex cones, thus handling the constraint-qualification issue—but
introducing non-differentiability at likely optima. In this way of thinking, conic
constraints are not well handled by general convex-MINLO software (like Kni-
tro, Ipopt, Bonmin, etc.). The only conic solver that handles integer variables
(via B&B) is MOSEK, and then only quadratic cones, and “as long as they
do not contain both quadratic objective or constraints and conic constraints at
the same time”. So not all of our work can be applied today, within the cur-
rent convex-MINLO software environment, and so we see our work as forward
looking.

Our Contribution and Organization. We study MINLO formulations of


the disjunction x ∈ {0} ∪ [l, u], where z is a binary indicator of x ∈ [l, u], and
y “captures” xp , for p > 1 (see [1], for example). We investigate a family of
relaxations for this model, employing the inequality yz q ≥ xp , parameterized
by the “lifting exponent” q ∈ [0, p − 1]; we make the convention that 00 = 1
(relevant when z = 0 and q = 0). These models are higher-dimensional-power-
cone representable, and hence tractable in theory. We bound our formulations
using the linear inequality up z ≥ y which is always satisfied at optimality (for the
typical application where y replaces xp in a minimization objective). In Sect. 1
we formally define the sets relevant to our study.
For q = 0, we have the most naïve relaxation using y ≥ x^p. For q = 1,
we have the naı̈ve perspective relaxation using yz ≥ xp . For q = p − 1, we get
the true perspective relaxation using yz p−1 ≥ xp , which gives the convex hull.
Interestingly, this last fact seems to be only very-well known when p = 2, in

which case p − 1 = 1 and the naı̈ve perspective relaxation is the true perspective
relaxation. So some might think, even for p > 2, that q = 1 would give the
convex hull—but this naı̈ve perspective relaxation is not the strongest; we need
to use q = p − 1 to get the convex hull.
In Sect. 2, we present a formula for the volumes of all of these relaxations as
a means of comparing them. In doing so, we quantify, in terms of l, u, p, and q,
how much stronger the convex hull is compared to the weaker relaxations, and
when, in terms of l and u, there is much to be gained at all by considering more
than the weakest relaxation. Using our formula, and thinking of the baseline
of q = 1, namely the naı̈ve perspective relaxation, we quantify the impact of
“losing perspective” (e.g., going to q = 0, namely the most naı̈ve relaxation) and
of “gaining perspective” (e.g., going to q = p − 1, namely the true perspective
relaxation). Depending on l and u for a particular x (of which there may be a
great many in a real model), we may adopt different relaxations based on the
differences of the volumes of the various relaxation choices and on the solver
environment. For p = 2, we obtain further results on asymptotic behavior and
on optimal branching-point selection.
Compared to earlier work on volume formulae and related branching-point
selection relevant to comparing convex relaxations, our present results are the
first involving convex sets that are not polytopes. Thus we demonstrate that we
can get meaningful results that do not rely on triangulation of polytopes.
In Sect. 3 we present some computational experiments (for p = 2) which
bear out our theory, as we verify that volume can be used to determine which
variables are more important to handle by perspective relaxation.

Notation. Throughout, we use boldface lower-case for vectors and boldface


upper-case for matrices, vectors are column vectors, ‖·‖ indicates the 2-norm, and for a vector x, its transpose is indicated by x^⊤.

1 Definitions
For real scalars u > l > 0 and p > 1, we define

S_p := { (x, y, z) ∈ R² × {0, 1} : y ≥ x^p, uz ≥ x ≥ lz },

and, for 0 ≤ q ≤ p − 1, the associated relaxations

S_p^q := { (x, y, z) ∈ R³ : y z^q ≥ x^p, uz ≥ x ≥ lz, 1 ≥ z ≥ 0, y ≥ 0 }.

Note that even though xp − yz q is not a convex function for q > 0 (even for
p = 2, q = 1), the set Spq is convex. In fact, the set Spq is higher-dimensional-
power-cone representable, which makes working with it appealing. Still, com-
putationally handling higher-dimensional power cones efficiently is not a trivial
matter, and we should not take it on without considering alternatives.
These sets are unbounded in the increasing-y direction. This is rather incon-
venient because we want to assess relaxations by computing their volumes. But

in applications, y is meant to model/capture xp via the minimization pressure


of an objective function. So for our purposes, we introduce the linear inequality
u^p z ≥ y, which captures that z = 0 implies x^p = 0, and that z = 1 implies u^p ≥ x^p. For convenience, we write S̄ := S ∩ {(x, y, z) ∈ R³ : u^p z ≥ y}, for S ∈ {S_p, S_p^q}. So, S̄_p^q is a relaxation of S̄_p. The following result, part of which
is closely related to results in [1], is easy to establish.

Proposition 1. For p > 1 and q ∈ [0, p − 1], (i) S̄_p ⊆ S̄_p^q, (ii) S̄_p^q is a convex set, (iii) S̄_p^q ⊆ S̄_p^{q'}, for 0 ≤ q' ≤ q, and (iv) conv(S̄_p) = S̄_p^{p−1}.

2 Our Results
2.1 Volumes

Theorem 2. For p > 1 and 0 ≤ q ≤ p − 1,

vol(S̄_p^q) = [ (p² − pq + 3p − q − 1) u^{p+1} + 3 l^{p+1} − (p+1)(p−q+2) l u^p ] / [ 3 (p+1)(p−q+2) ].

Proof. Case 1: 0 < q ≤ p − 1. We proceed using standard integration tech-


niques, and we begin by fixing the variable y and considering the corresponding
2-dimensional slice, Ry , of S̄pq . In the (x, z)-space, Ry is described by:

z ≥ x^{p/q} y^{−1/q}   (1)          z ≤ x/l   (4)
z ≥ x/u                (2)          z ≤ 1     (5)
z ≥ y/u^p              (3)          z ≥ 0     (6)

Inequality (6) is implied by (3) because y ≥ 0. Therefore, for the various choices
of u, l, and y, the tight inequalities for Ry are among (1), (2), (3), (4), and (5).
In fact, the region will always be described by either the entire set of inequalities
(if y > lp ), or (1), (2), (3), and (4) (if y ≤ lp ). For an illustration of these two
cases with p = 5 and q = 3, see Figs. 1, 2.
To understand why these two cases suffice, observe that together (2) and (4)
create a ‘wedge’ in the positive orthant. Ry is composed of this wedge intersected
with {(x, z) ∈ R² : z ≥ x^{p/q} y^{−1/q}}, for y/u^p ≤ z ≤ 1. With a slight abuse of
notation, based on context we use (k), for k = 1, 2, . . . , 5, to refer both to the
inequality defined above and to the 1-d boundary of the region it describes.
Now consider the set of points formed by the wedge and the inequality z ≥ x^{p/q} y^{−1/q}. Curves (1) and (4) intersect at (0, 0) and at a = (x_a, z_a) := ((y/l^q)^{1/(p−q)}, (y/l^p)^{1/(p−q)}). Curves (1) and (2) intersect at (0, 0) and at b = (x_b, z_b) := ((y/u^q)^{1/(p−q)}, (y/u^p)^{1/(p−q)}). To understand the area that we are seeking to compute, we need to ascertain where (0, 0), a, and b fall relative to (3)

Fig. 1. 0 < q ≤ p − 1, y < lp

Fig. 2. 0 < q ≤ p − 1, y > lp

and (5), which bound the region y/u^p ≤ z ≤ 1. Note that the origin falls on or below (3), and because u > l, a is always above b (in the sense of a higher value of z).
We show that b must fall between lines (3) and (5). This is equivalent to y/u^p ≤ (y/u^p)^{1/(p−q)} = z_b ≤ 1. Now, we know y ≤ u^p, which implies y/u^p ≤ 1. From our assumptions on p and q, we also have 0 < 1/(p−q) ≤ 1. From this we can immediately conclude y/u^p ≤ (y/u^p)^{1/(p−q)} = z_b ≤ 1.
Furthermore, given that a must be above b, we now have our two cases: a is
either above (5) (if y > lp ), or on or below (5) (if y ≤ lp ). Using the observations
made above, we can now calculate the area of R_y via integration. We integrate over z, and the limits of integration depend on the value of y. If y ≤ l^p, then the area is given by the expression:

∫_{y/u^p}^{z_b} (uz − lz) dz + ∫_{z_b}^{z_a} ( (y z^q)^{1/p} − lz ) dz.

If y ≥ l^p, then the area is given by the expression:

∫_{y/u^p}^{z_b} (uz − lz) dz + ∫_{z_b}^{1} ( (y z^q)^{1/p} − lz ) dz.

Note that when y = lp , these quantities are equal. Furthermore, when q = p − 1


(and we have the hull), the first integral in each sum is equal to zero.
Integrating over y, we compute the volume of S̄_p^q as follows:

vol(S̄_p^q) = ∫_0^{l^p} [ ∫_{y/u^p}^{(y/u^p)^{1/(p−q)}} (uz − lz) dz + ∫_{(y/u^p)^{1/(p−q)}}^{(y/l^p)^{1/(p−q)}} ( (y z^q)^{1/p} − lz ) dz ] dy
            + ∫_{l^p}^{u^p} [ ∫_{y/u^p}^{(y/u^p)^{1/(p−q)}} (uz − lz) dz + ∫_{(y/u^p)^{1/(p−q)}}^{1} ( (y z^q)^{1/p} − lz ) dz ] dy
          = [ (p² − pq + 3p − q − 1) u^{p+1} + 3 l^{p+1} − (p+1)(p−q+2) l u^p ] / [ 3 (p+1)(p−q+2) ].

Case 2: q = 0. This case is similar to Case 1. However, here we need to ensure


that we avoid division by zero. Consider a slice of the set S̄p0 for fixed y, again
denoted Ry . In the (x, z)-space, Ry is described by inequalities (2)–(6) and the
following inequality (which replaces (1)):

x ≤ y^{1/p}.   (1′)

Similarly to before, we have that for the various choices of u, l, and y, R_y is described by either the entire set of inequalities (if y > l^p), or (1′), (2), (3), and (4) (if y ≤ l^p). The reason it suffices to consider the two cases is very similar to the case of q > 0: consider the triangle formed by the 'wedge' inequalities, i.e., (2) and (4), and the inequality x ≤ y^{1/p}. The vertices of this triangle are (0, 0), a = (x_a, z_a) := (y^{1/p}, y^{1/p}/l), and b = (x_b, z_b) := (y^{1/p}, y^{1/p}/u). Again, we want to understand where a and b fall in relation to the lines (3) and (5) describing the region y/u^p ≤ z ≤ 1, and we know that the origin always falls below or on the bottom line. Furthermore, we again have that a is above b (because u > l).
We show that b must fall between the two lines (3) and (5). We know y ≤ u^p, which implies y/u^p ≤ 1. We also know 0 < 1/p < 1. From this we can conclude y/u^p ≤ (y/u^p)^{1/p} = z_b ≤ 1. We have the same situation as when q > 0: a will either be above both lines, or between the two; b will always be between the two lines.

Similarly to before, we can use this information to compute the volume of S̄_p^0:

vol(S̄_p^0) = ∫_0^{l^p} [ ∫_{y/u^p}^{y^{1/p}/u} (uz − lz) dz + ∫_{y^{1/p}/u}^{y^{1/p}/l} ( y^{1/p} − lz ) dz ] dy
            + ∫_{l^p}^{u^p} [ ∫_{y/u^p}^{y^{1/p}/u} (uz − lz) dz + ∫_{y^{1/p}/u}^{1} ( y^{1/p} − lz ) dz ] dy
          = [ (p² + 3p − 1) u^{p+1} + 3 l^{p+1} − (p+1)(p+2) l u^p ] / [ 3 (p+1)(p+2) ].

Substituting q = 0 into the theorem statement gives this last expression. 
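As an independent sanity check of Theorem 2 (our own sketch, not part of the paper), the closed form can be compared against plain Monte Carlo integration over the box [0, u] × [0, u^p] × [0, 1], which contains S̄_p^q.

import random

def vol_closed_form(p, q, l, u):
    # closed-form volume of the bounded relaxation, Theorem 2
    num = (p**2 - p*q + 3*p - q - 1) * u**(p + 1) + 3 * l**(p + 1) \
          - (p + 1) * (p - q + 2) * l * u**p
    return num / (3 * (p + 1) * (p - q + 2))

def vol_monte_carlo(p, q, l, u, samples=200000, seed=0):
    # sample the bounding box and count points lying in the relaxation
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y, z = rng.uniform(0, u), rng.uniform(0, u**p), rng.uniform(0, 1)
        if y * z**q >= x**p and u * z >= x >= l * z and u**p * z >= y:
            hits += 1
    return (hits / samples) * (u * u**p)

print(vol_closed_form(2, 1, 0.5, 1.0), vol_monte_carlo(2, 1, 0.5, 1.0))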

We can now precisely quantify how much better the convex-hull perspective
relaxation (q = p − 1) is compared to the most naı̈ve relaxation (q = 0):
Corollary 3. For p > 1,

vol(S̄_p^0) − vol(S̄_p^{p−1}) = (p − 1)(u^{p+1} − l^{p+1}) / [3(p+1)(p+2)],  which equals (u³ − l³)/36 for p = 2.

We can also precisely quantify how much better the convex-hull perspective
relaxation (q = p − 1) is compared to the naı̈ve perspective relaxation (q = 1):
Corollary 4. For p ≥ 2,

vol(S̄_p^1) − vol(S̄_p^{p−1}) = (p − 2)(u^{p+1} − l^{p+1}) / [3(p+1)²].

2.2 Asymptotics: The Case of p = 2

It is a direct consequence of our volume formula for p = 2, that in a natural


asymptotic regime, the excess volume of the most naı̈ve relaxation, above the
volume of the true perspective relaxation, is not a vanishingly small part of the
volume of the most naı̈ve relaxation.

Corollary 5. For l = ku, with constant k ∈ [0, 1),

lim_{u→∞} [ vol(S̄_2^0) − vol(S̄_2^1) ] / vol(S̄_2^0) = (1 + k + k²) / [3(3 − k − k²)] ≥ 1/9.

2.3 Branching-Point Selection: The Case of p = 2

“Branching-point selection” is a key algorithmic issue in sBB (spatial branch-


and-bound), the main algorithm used in global optimization of “factorable for-
mulations”. [11] introduced the investigation of “branching-point selection” using
volume as a measure. Their idea is to determine the point at which we can split

the domain of a variable, so that re-convexifying the two child relaxations yields
the least volume.
For S ∈ {S̄21 , S̄20 }, let vS (x̂) be the sum of the volumes of the two pieces
of S created by branching on x at x̂ ∈ [l, u]. Interestingly, the branching-point
behavior of S̄_2^1 and S̄_2^0 is identical.

Theorem 6. For S ∈ {S̄_2^1, S̄_2^0}, v_S is strictly convex on [l, u], and its minimum is at x̂ = (l + √(l² + 3u²))/3.

Additionally, this suggests biasing branching-point selection up from the com-


mon choice of mid-point branching:

Corollary 7. For S ∈ {S̄_2^1, S̄_2^0}, the optimal branching point is at least u/√3 ≈ 0.57735 u, which is achieved if and only if l = 0.
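In an sBB implementation, Theorem 6 and Corollary 7 can be used directly by evaluating the minimizer in closed form; a minimal sketch (ours), with l and u the current bounds of the variable being branched on:

from math import sqrt

def optimal_branching_point(l, u):
    # minimizer of the total child volume from Theorem 6 (p = 2);
    # by Corollary 7 it is never below u / sqrt(3)
    return (l + sqrt(l**2 + 3 * u**2)) / 3

print(optimal_branching_point(0.0, 1.0))   # u / sqrt(3) ~ 0.57735
print(optimal_branching_point(0.5, 1.0))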

3 Computational Experiments: The Case of p = 2


We carried out experiments on a 16-core machine (running Windows Server
2012 R2): two Intel Xeon CPU E5-2667 v4 processors running at 3.20GHz, with
8 cores each, and 128 GB of memory. We used the conic solver SDPT3 4.0 ([15])
under the Matlab “disciplined convex optimization” package CVX ([5]).

3.1 Separable Quadratic-Cost Knapsack Covering


Our first experiment is based on the following model, which we think of as
a relaxation of the identical model having the constraints zi ∈ {0, 1} for i =
1, 2, . . . , n. The data c, f , a, l, u ∈ Rn and b ∈ R are all positive. The idea is that
we have costs ci on x2i , and xi is either 0 or in the “operating range” [li , ui ]. We
pay a cost fi when xi is nonzero.

min  c^⊤ y + f^⊤ z
subject to:
  a^⊤ x ≥ b ;
  u_i z_i ≥ x_i ≥ l_i z_i,  i = 1, . . . , n ;
  u_i² z_i ≥ y_i ≥ x_i²,   i = 1, . . . , n ;
  1 ≥ z_i ≥ 0,             i = 1, . . . , n .

For some of the i, we conceptually replace yi ≥ x2i with its perspective


tightening yi zi ≥ x2i , yi ≥ 0; really, we are using a conic solver, so we instead
employ an SOCP representation. We do this for the choices of i that are the
k highest according to a ranking of all i, 1 ≤ i ≤ n. We let k = n(j/15),
with j = 0, 1, 2, . . . , 15. Denoting the polytope with no tightening by Q and
with tightening by P , we looked at three different rankings: descending values of
vol(Q) − vol(P) = (u_i³ − l_i³)/36, ascending values of vol(Q) − vol(P), and random.
For n = 30, 000, we present our results in Fig. 3. As a baseline, we can see that if

we only want to apply the perspective relaxation for some pre-specified fraction
of the i’s, we get the best improvement in the objective value (thinking of it as a
lower bound for the true problem with the constraints zi ∈ {0, 1}) by preferring i
with the largest value of u_i³ − l_i³. Moreover, most of the benefit is already achieved
at much lower values of k than for the other rankings.
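The ranking used above is straightforward to reproduce; the following sketch (our own illustration, with hypothetical names) selects the prescribed fraction of indices with the largest volume gap (u_i³ − l_i³)/36 to receive the perspective tightening.

def rank_for_perspective(l, u, fraction):
    # score each variable by vol(Q_i) - vol(P_i) = (u_i^3 - l_i^3)/36 and
    # return the indices of the top `fraction` share of them
    scores = [(ui**3 - li**3) / 36.0 for li, ui in zip(l, u)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = round(fraction * len(order))
    return set(order[:k])

# example: tighten the third of the variables with the largest gap
chosen = rank_for_perspective([0.1, 0.5, 0.2], [1.0, 0.6, 2.0], fraction=1/3)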
[8,12] suggested that for a pair of relaxations P, Q ⊂ R^d, a good measure for evaluating Q relative to P might be vol(Q)^{1/d} − vol(P)^{1/d} (in our present setting, we have d = 3). We did experiments ranking by this, rather than the simpler vol(Q) − vol(P), and we found no significant difference in our results. This can be explained by the fact that ranking by either of these choices is very similar for our test set.

3.2 Mean-Variance Optimization

Next, we conducted a similar experiment on a richer model, though at a smaller


scale. Our model is for a (Markowitz-style) mean-variance optimization problem
(see [4,6]). We have n investment vehicles. The vector a contains the expected
returns for the portfolio/holdings x. The scalar b is our requirement for the
minimum expected return of our portfolio. Asset i has a possible range [li , ui ],
and we limit the number of assets that we hold to κ.
Variance is measured, as usual, via a quadratic which is commonly taken
to have the form: x (Q + Diag(c)) x, where Q is positive definite and c is all
positive (see [4,6] for details on why this form is used in the application). Taking
the Cholesky factorization Q = M M^⊤, we define w := M^⊤ x, and introduce the
scalar variable v. In this way, we arrive at the model:

Fig. 3. Separable-quadratic knapsack covering, n = 30, 000



min  v + c^⊤ y
subject to:
  a^⊤ x ≥ b ;
  e^⊤ z ≤ κ ;
  w − M^⊤ x = 0 ;
  v ≥ ‖w‖² ;
  u_i z_i ≥ x_i ≥ l_i z_i,  i = 1, . . . , n ;
  u_i² z_i ≥ y_i ≥ x_i²,   i = 1, . . . , n ;
  1 ≥ z_i ≥ 0,             i = 1, . . . , n ;
  w_i unrestricted,        i = 1, . . . , n .

Note: The inequality v ≥ ‖w‖² is correct; there is a typo in [6], where it is written as v ≥ ‖w‖. The inequality v ≥ ‖w‖², while not directly formulating a Lorentz (second-order) cone, may be re-formulated as an affine slice of a rotated Lorentz cone, or not, depending on the solver employed.
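For concreteness, one standard way to express v ≥ ‖w‖² as an affine slice of a Lorentz cone is ‖(2w, v − 1)‖₂ ≤ v + 1 (equivalently, (v, 1, w) lies in a rotated Lorentz cone). A small numerical check of this equivalence (our own sketch, not taken from [6]):

import numpy as np

def in_epigraph(w, v, tol=1e-9):
    # v >= ||w||^2  iff  ||(2w, v - 1)||_2 <= v + 1
    lhs = np.linalg.norm(np.concatenate([2 * np.asarray(w, dtype=float), [v - 1.0]]))
    return lhs <= v + 1.0 + tol

print(in_epigraph([0.3, -0.4], 0.25))   # ||w||^2 = 0.25: boundary point, True
print(in_epigraph([0.3, -0.4], 0.20))   # False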
Our results followed the same general trend seen in Fig. 3.

References
1. Aktürk, M.S., Atamtürk, A., Gürel, S.: A strong conic quadratic reformulation for
machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3),
187–191 (2009)
2. Basu, A., Conforti, M., Di Summa, M., Zambelli, G.: Optimal cutting planes from
the group relaxations. arXiv:abs/1710.07672 (2018)
3. Dey, S., Molinaro, M.: Theoretical challenges towards cutting-plane selection.
arXiv:abs/1805.02782 (2018)
4. Frangioni, A., Gentile, C.: Perspective cuts for a class of convex 0–1 mixed integer
programs. Math. Program. 106(2), 225–236 (2006)
5. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming,
version 2.1, build 1123. http://cvxr.com/cvx (2017)
6. Günlük, O., Linderoth, J.: Perspective reformulations of mixed integer nonlinear
programs with indicator variables. Math. Program. Ser. B 124, 183–205 (2010)
7. Ko, C.W., Lee, J., Steingrı́msson, E.: The volume of relaxed Boolean-quadric and
cut polytopes. Discret. Math. 163(1–3), 293–298 (1997)
8. Lee, J., Morris Jr., W.D.: Geometric comparison of combinatorial polytopes. Dis-
cret. Appl. Math. 55(2), 163–182 (1994)
9. Lee, J., Skipper, D.: Volume computation for sparse boolean quadric relaxations.
Discret. Appl. Math. (2017). https://doi.org/10.1016/j.dam.2018.10.038.
10. Speakman, E., Lee, J.: Quantifying double McCormick. Math. Oper. Res. 42(4),
1230–1253 (2017)
11. Speakman, E., Lee, J.: On branching-point selection for trilinear monomials in
spatial branch-and-bound: the hull relaxation. J. Glob. Optim. (2018). https://
doi.org/10.1016/j.dam.2018.10.038.
12. Speakman, E., Yu, H., Lee, J.: Experimental validation of volume-based compar-
ison for double-McCormick relaxations. In: Salvagnin, D., Lombardi, M. (eds.)
CPAIOR 2017, pp. 229–243. Springer (2017)

13. Speakman, E.E.: Volumetric guidance for handling triple products in spatial
branch-and-bound. Ph.D., University of Michigan (2017)
14. Steingrı́msson, E.: A decomposition of 2-weak vertex-packing polytopes. Discret.
Comput. Geom. 12(4), 465–479 (1994)
15. Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3-a MATLAB software package for
semidefinite programming. Optim. Methods Softw. 11, 545–581 (1998)
Game Equilibria and Transition Dynamics
with Networks Unification

Alexei Korolev(&) and Ilia Garmashov

National Research University “Higher School of Economics” Saint-Petersburg,


Saint-Petersburg, Russia
danitschi@gmail.com, iagarmashov@edu.hse.ru

Abstract. In this paper, we consider the following problem: what affects the Nash equilibrium amount of investment in knowledge when one complete graph joins another complete one? The solution of this problem will allow us to understand exactly how game agents will behave when deciding whether to enter the other network, what conditions and externalities affect this decision, and how the level of the future equilibrium amount of investment in knowledge can be predicted.

Keywords: Network  Network game  Nash equilibrium  Externality 


Productivity  Innovation cluster

1 Introduction

The processes of globalization, post-industrial development and digitalization of the


economy make studying of the role of innovative firms in the world economic
development extremely significant. In papers [1, 3] mathematical models of the
international innovative economy are constructed, on the basis of which the behavior of
innovative firms is analyzed. In particular, authors of this article consider an important
topic: how do firms realize their investment strategy in the development of knowledge,
including outside their own region or country. The behavior of agents is determined by
various externalities, which can have a completely different nature. Description of
secondary effects is one of the most important directions in network game theory that
authors of different articles try to analyze (for example, [5, 6]).
There is also another aspect of the question: how to structure and organize their
behavior in the best way in constantly changing economic and social conditions. In [2],
the authors of the article try to take a new look at the system of organizing the actions
of agent-innovators. It is important to take into account the impact (externalities) exerted on agents by the environment, including other network entities. The article [4] shows the necessity of creating regional innovation systems based on clusters. From this follows the relevance of a model description of the process of creating more extensive innovation clusters based on existing ones.
In addition, there is a need to model the process of changing the Nash equilibrium investment values, as well as the search for new interior or corner equilibria.

© Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 398–406, 2020.
https://doi.org/10.1007/978-3-030-21803-4_40

This article continues the study of Nash equilibria and their changes in the process of unification of complete graphs. However, this paper contains a number of new elements in comparison with previous studies.
To begin with, we study the dynamic behavior of agents, not only by generalizing the simple two-period model of endogenous growth of Romer with the production and externalities of knowledge (as in paper [7]), but also by using difference equations. Moreover, we assume that our agents are innovative companies that are interested in knowledge investments.
The main content of the article is focused on the analysis of changes in the Nash equilibrium investment values, as well as the description of corner solutions. We study necessary and sufficient conditions, and possible limitations for the appearance of new
different equilibria.
The main problem of this research is to study the differences in agents' behavior during the unification of networks of different sizes, and the relations between the amount of actors' knowledge investments and their productivity.
To achieve this goal, the following objectives should be fulfilled:
(1) to create a model which can describe agents' decisions on the amount of knowledge investments;
(2) to find the equilibrium condition of this model that shows the optimal choices of each network agent;
(3) to outline the relations between an agent's productivity, the network size and the value of knowledge investments.

2 Model Description

There is a network (undirected graph) with n nodes, i = 1, 2, ..., n; each node represents an agent. In period 1 each agent i possesses an initial endowment of good, e, and uses it partially for consumption in the first period of life, c_{i1}, and partially for investment into knowledge, k_i:

c_{i1} + k_i = e,  i = 1, 2, . . . , n.   (1)

Investment immediately transforms one-to-one into knowledge which is used in production of good for consumption in the second period, c_{i2}.
Preferences of agent i are described by the quadratic utility function:

U_i(c_{i1}, c_{i2}) = c_{i1}(e − a c_{i1}) + b_i c_{i2},   (2)

where b_i > 0; a is a satiation coefficient, and b_i is a parameter characterizing the value of comfort and health in the second period of life compared to consumption in the first period. It is assumed that c_{i1} ∈ [0, e], the utility increases in c_{i1}, and is concave with respect to c_{i1}. These assumptions are equivalent to the condition 0 < a < 1/2.
Production in node i is described by production function:

F(k_i, K_i) = B_i k_i K_i,  B_i > 0,   (3)

which depends on the state of knowledge in the i-th node, k_i, and on the environment, K_i; B_i is a technological coefficient. The environment is the sum of the investments by the agent herself and her neighbors:

K_i = k_i + K̃_i,   K̃_i = Σ_{j∈N(i)} k_j,   (4)

where N(i) is the set of neighboring nodes of node i.


We will denote the product b_i B_i by A_i and assume that a < A_i. Since an increase of either of the parameters b_i, B_i promotes an increase of second-period consumption, we will call A_i "productivity". We will assume that A_i ≠ 2a, i = 1, 2, . . . , n. If A_i > 2a, we will say that the i-th agent is productive, and if A_i < 2a, we will say that the i-th agent is unproductive.
Three ways of behavior are possible: agent i is called passive if she makes zero investment, k_i = 0 (i.e. consumes the whole endowment in period 1); active if 0 < k_i < e; hyperactive if she makes the maximal possible investment e (i.e. consumes nothing in period 1).
Let us consider the following game. Players are the agents i = 1, 2, . . . , n. Possible actions (strategies) of player i are values of investment k_i from the segment [0, e]. A Nash equilibrium with externalities (for shortness, equilibrium) is a profile of knowledge levels (investments) (k_1*, k_2*, . . . , k_n*), such that each k_i* is a solution of the following problem P(K_i) of maximization of the i-th player's utility given environment K_i:

U_i(c_{i1}, c_{i2}) → max over c_{i1}, c_{i2}, k_i
subject to  c_{i1} ≤ e − k_i,  c_{i2} ≤ F(k_i, K_i),  c_{i1} ≥ 0, c_{i2} ≥ 0, k_i ≥ 0,   (5)

where the environment K_i is defined by the profile (k_1*, k_2*, . . . , k_n*):

K_i = k_i + Σ_{j∈N(i)} k_j*.   (6)

The first two constraints of problem P(K_i) are evidently satisfied as equalities at the optimum point. Substituting into the objective function, we obtain a new function (payoff function):

V_i(k_i, K_i) = e²(1 − a) − k_i e(1 − 2a) − a k_i² + A_i k_i K_i.   (7)

If all players' solutions are internal (0 < k_i < e, i = 1, 2, . . . , n), i.e. all players are active, the equilibrium will be referred to as an inner equilibrium. Clearly, the inner equilibrium (if it exists for given values of parameters) is defined by the system

D_1 V_i(k_i, K_i) = 0,  i = 1, 2, . . . , n   (8)

or

D_1 V_i(k_i, K_i) = e(2a − 1) − 2a k_i + A_i K_i = 0,  i = 1, 2, . . . , n.   (9)

We introduce the following notation. Regardless of the agent's type of behavior, the root of the equation

D_1 V_i(k_i, K_i) = (A_i − 2a) k_i + A_i K̃_i − e(1 − 2a) = 0   (10)

will be denoted by k̃_i^s. Thus,

k̃_i^s = [ e(2a − 1) + A_i K̃_i ] / (2a − A_i),   (11)

where K̃_i is the pure externality of agent i. It is obvious that if agent i is active, then her investment will equal k̃_i^s in equilibrium. To analyze equilibria we need the following statement.
Proposition 1. ([5], Lemma 2.1 and Corollary 2.1) A set of agents' investment values (k_1, k_2, . . . , k_n) can be an equilibrium only if for each i = 1, 2, . . . , n it is true that
1. if k_i = 0, then K̃_i ≤ e(1 − 2a)/A_i;
2. if 0 < k_i < e, then k_i = k̃_i^s;
3. if k_i = e, then K̃_i ≥ e(1 − A_i)/A_i.

3 Unification of Innovative Nets

Let us consider the following situation. There are two cliques (complete graphs) with n_1 and n_2 nodes, respectively, with the same agents' productivity A.
There are three types of agents in the united network. The agents of the first type are all the agents of the first network except the agent of the first network that is connected to the agents of the second network. Agents of the second type are all agents of the second network. The third type consists of only one agent: the agent of the first network that is connected to all agents of the second network. Since all agents of the same type have the same environment, they behave in the same way, not only in equilibrium, but also in dynamics. Therefore, the investment of each agent of type i will be denoted k_i, and the environment of each agent of type i will be denoted K_i.
Both cliques are initially in inner equilibrium. It follows immediately from (9) that the initial investments of the agents are

k_1^0 = k_3^0 = e(1 − 2a)/(n_1 A − 2a),   k_2^0 = e(1 − 2a)/(n_2 A − 2a).   (12)

The system (9) for the inner equilibrium in the joined network is

((n_1 − 1)A − 2a) k_1 + A k_3 = e(1 − 2a),
(n_2 A − 2a) k_2 + A k_3 = e(1 − 2a),   (13)
(n_1 − 1)A k_1 + n_2 A k_2 + (A − 2a) k_3 = e(1 − 2a).

Solving this system by Cramer's rule, we obtain the new equilibrium investment amounts for the inner equilibrium:

k_1 = 2a e(1 − 2a)(n_2 A − 2a) / D,   (14)
k_2 = 2a e(1 − 2a)[(n_1 − 1)A − 2a] / D,   (15)
k_3 = e(1 − 2a)[(n_1 − 1) n_2 A² − 4a²] / D,   (16)

where D = (n_1 − 1) n_2 A³ + 2a (n_1 − 1) n_2 A² + 8a³ − 4a²(n_1 + n_2) A.
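Numerically, the inner equilibrium (14)–(16) is simply the solution of the 3 × 3 linear system (13); a quick sketch (ours, with illustrative parameter values satisfying 0 < a < 1/2 and A ≠ 2a):

import numpy as np

def inner_equilibrium(n1, n2, A, a, e):
    # coefficient matrix of system (13); rows: type-1, type-2, type-3 agents
    M = np.array([
        [(n1 - 1) * A - 2 * a, 0.0,            A],
        [0.0,                  n2 * A - 2 * a, A],
        [(n1 - 1) * A,         n2 * A,         A - 2 * a],
    ])
    rhs = np.full(3, e * (1 - 2 * a))
    return np.linalg.solve(M, rhs)   # (k1*, k2*, k3*), cf. (14)-(16)

print(inner_equilibrium(n1=4, n2=3, A=0.5, a=0.1, e=1.0))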

4 Adjusting Dynamics in Networks and Dynamic Stability


of Equilibria

After the description of the two-stage model equilibria, we can turn to adjustment dynamics. We introduce adjustment dynamics which may start after a small deviation from equilibrium or after the junction of networks, each of which was initially in equilibrium. We model the adjustment dynamics in the following way.
Definition 1. Each agent maximizes her utility by choosing a level of investment; at the moment of decision-making she considers her environment as exogenously given. Correspondingly, if k_i^t = 0 and D_1 V_i(k_i, K_i)|_{k_i=0} ≤ 0, then k_i^{t+1} = 0, and if k_i^t = e and D_1 V_i(k_i, K_i)|_{k_i=e} ≥ 0, then k_i^{t+1} = e; in all other cases, k_i^{t+1} solves the difference equation:

2a k_i^{t+1} − A_i K_i^t + e(1 − 2a) = 0.   (17)

Definition 1 implies that the dynamics in the model under consideration is described by the system of difference equations:

k_1^{t+1} = [(n_1 − 1)A/(2a)] k_1^t + [A/(2a)] k_3^t + e(2a − 1)/(2a),
k_2^{t+1} = [n_2 A/(2a)] k_2^t + [A/(2a)] k_3^t + e(2a − 1)/(2a),   (18)
k_3^{t+1} = [(n_1 − 1)A/(2a)] k_1^t + [n_2 A/(2a)] k_2^t + [A/(2a)] k_3^t + e(2a − 1)/(2a),

where t = 0, 1, 2, . . .
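Before turning to the characteristic equation, one can also iterate (18) directly. The sketch below (our own illustration; the clipping to [0, e] is a simplified stand-in for the corner rules of Definition 1) shows how the joined network drifts toward a corner after unification.

def simulate_adjustment(n1, n2, A, a, e, k0, steps=200):
    # best-response dynamics (18), with investments kept in [0, e]
    k1, k2, k3 = k0
    for _ in range(steps):
        k1n = ((n1 - 1) * A * k1 + A * k3 + e * (2 * a - 1)) / (2 * a)
        k2n = (n2 * A * k2 + A * k3 + e * (2 * a - 1)) / (2 * a)
        k3n = ((n1 - 1) * A * k1 + n2 * A * k2 + A * k3 + e * (2 * a - 1)) / (2 * a)
        k1, k2, k3 = (min(max(v, 0.0), e) for v in (k1n, k2n, k3n))
    return k1, k2, k3

# start from the pre-unification equilibria (12) and see where the joined network settles
n1, n2, A, a, e = 4, 3, 0.5, 0.1, 1.0
k0 = (e * (1 - 2 * a) / (n1 * A - 2 * a),
      e * (1 - 2 * a) / (n2 * A - 2 * a),
      e * (1 - 2 * a) / (n1 * A - 2 * a))
print(simulate_adjustment(n1, n2, A, a, e, k0))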
Characteristic equation for this system is

λ³ − [(n_1 + n_2)A/(2a)] λ² + [(n_1 − 1) n_2 A²/(4a²)] λ + (n_1 − 1) n_2 A³/(8a³) = 0.   (19)

Definition 2. The equilibrium is called dynamically stable if, after a small deviation of one of the agents from the equilibrium, a dynamic process starts which returns the system back to the initial state. In the opposite case the equilibrium is called dynamically unstable.

5 Network Dynamics Model of Net Unification

To find the eigenvalues of the system of difference equations (18) in closed form, we impose the restriction n_2 = n_1 − 1 = n. Then Eq. (19) takes the form
| nA/(2a) − λ      0               A/(2a)      |
| 0                nA/(2a) − λ     A/(2a)      |  =  ( nA/(2a) − λ ) ( λ² − ((n + 1)A/(2a)) λ − nA²/(4a²) )  =  0,   (20)
| nA/(2a)          nA/(2a)         A/(2a) − λ  |

hence

λ_{1,2} = (n + 1)A/(4a) ∓ √( (n + 1)²A²/(16a²) + nA²/(4a²) ) = (A/(4a)) ( n + 1 ∓ √(n² + 6n + 1) ),   λ_3 = nA/(2a).   (21)

In this case the inner equilibrium in the joined network will be

k_1* = k_2* = 2a e(1 − 2a) / [nA² + 2a(n + 1)A − 4a²],   k_3* = e(1 − 2a)(nA + 2a) / [nA² + 2a(n + 1)A − 4a²].   (22)

Let us find the eigenvectors:

(nA/(2a) − λ) x_1 + (A/(2a)) x_3 = 0,
(nA/(2a) − λ) x_2 + (A/(2a)) x_3 = 0,   (23)
(nA/(2a)) x_1 + (nA/(2a)) x_2 + (A/(2a) − λ) x_3 = 0.

Thus, if λ ≠ nA/(2a), then x_1 = x_2 = x and x_3 = (λ − nA/(2a)) x / (A/(2a)), and if λ = nA/(2a), then x_3 = 0 and x_1 = −x_2.
Hence we may choose

e_1 = ( 1, 1, ½(1 − n − √(n² + 6n + 1)) )^⊤,   (24)

corresponding to λ_1 = (A/(4a))(n + 1 − √(n² + 6n + 1)); indeed, supposing x = x_1 = x_2 = 1,

x_3 = [ (A/(4a))(n + 1 − √(n² + 6n + 1)) − nA/(2a) ] / (A/(2a)) = ½(n + 1 − √(n² + 6n + 1)) − n = ½(1 − n − √(n² + 6n + 1));

e_2 = ( 1, 1, ½(1 − n + √(n² + 6n + 1)) )^⊤,   (25)

corresponding to λ_2 = (A/(4a))(n + 1 + √(n² + 6n + 1)), and

e_3 = ( −1, 1, 0 )^⊤,   (26)

corresponding to λ_3 = nA/(2a).
Thus, the dynamics in the joined network is described by the following vector equation:

(k_1^t, k_2^t, k_3^t)^⊤ = C_1 λ_1^t e_1 + C_2 λ_2^t e_2 + C_3 λ_3^t e_3 + (k_1*, k_2*, k_3*)^⊤.   (27)

The constants C_1, C_2, C_3 can be found from the initial conditions. Before unification both networks were in symmetric inner equilibria:

k_1^0 = k_3^0 = e(1 − 2a)/((n + 1)A − 2a),   k_2^0 = e(1 − 2a)/(nA − 2a).   (28)

Hence at t = 0 we obtain the following equations:

e(1 − 2a)/((n + 1)A − 2a) = C_1 + C_2 − C_3 + 2a e(1 − 2a)/[nA² + 2a(n + 1)A − 4a²],
e(1 − 2a)/(nA − 2a) = C_1 + C_2 + C_3 + 2a e(1 − 2a)/[nA² + 2a(n + 1)A − 4a²],   (29)
e(1 − 2a)/((n + 1)A − 2a) = ½(1 − n − √(n² + 6n + 1)) C_1 + ½(1 − n + √(n² + 6n + 1)) C_2 + e(1 − 2a)(nA + 2a)/[nA² + 2a(n + 1)A − 4a²].

We add the first two equations of the previous system and obtain the following system of two equations to determine C_1 and C_2:

C_1 + C_2 = e(1 − 2a)[(2n + 1)A − 4a] / [2((n + 1)A − 2a)(nA − 2a)] − 2a e(1 − 2a)/[nA² + 2a(n + 1)A − 4a²] > 0,
½(1 − n − √(n² + 6n + 1)) C_1 + ½(1 − n + √(n² + 6n + 1)) C_2 = e(1 − 2a)/((n + 1)A − 2a) − e(1 − 2a)(nA + 2a)/[nA² + 2a(n + 1)A − 4a²] < 0.   (30)

It is easy to check that the right-hand side of the first equation is positive, and the right-hand side of the second equation is negative. It is clear that λ_2 is the largest eigenvalue in absolute value and it is positive, as are all the components of the eigenvector e_2. Hence, the nature of the transitional process is determined by the sign of the constant C_2. Further, the sign of C_2 is defined by the sign of the following expression:

D̃_2 = 2(k_3^0 − k_3*) + (√(n² + 6n + 1) + n − 1)(k_1^0 + k_2^0 − k_1* − k_2*),   (31)

where k_1^0, k_2^0, k_3^0 are the initial investment values.


If D̃_2 is positive, then after the transition process the network passes to the corner equilibrium where all the agents are hyperactive (k_i = e). If D̃_2 is negative, then after the transition process the network passes to the corner equilibrium where all the agents are passive (k_i = 0). It is easy to see that in the case under consideration, when both networks were in inner equilibria, for any parameters with n > 1 we have D̃_2 > 0, and the network passes to the corner equilibrium where all the agents are hyperactive. If n = 1 (a dyad connects to a single agent), then the nature of the network behavior in the transitional process depends on the ratio of A and a. Namely, if A/a ≥ √(96 + 32√2) − 4, then the network passes to the corner equilibrium where all the agents are hyperactive. If A/a < √(96 + 32√2) − 4, then the network passes to the corner equilibrium where all the agents are passive.
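A compact way to apply (31) and the dyad threshold in practice (our own sketch; the function and argument names are illustrative):

from math import sqrt

def d_tilde_2(n, k0, k_star):
    # sign of (31): positive -> all-hyperactive corner, negative -> all-passive corner
    k10, k20, k30 = k0
    k1s, k2s, k3s = k_star
    return 2 * (k30 - k3s) + (sqrt(n * n + 6 * n + 1) + n - 1) * (k10 + k20 - k1s - k2s)

# threshold on A/a for the dyad case n = 1 quoted above
threshold = sqrt(96 + 32 * sqrt(2)) - 4   # approximately 7.89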
So it is obvious (by Cramer's formulas) that C_1 > 0. Comparing the first equation of system (29) and the first equation of system (30), it is obvious that C_3 > 0.
Let us check that the corner solution where k_1 = k_2 = k_3 = e is an equilibrium:
1. D_1 V_1(k_1, K_1)|_{k_1=k_2=k_3=e} = −2ae + Ane − e(1 − 2a) = Ane − e ≥ 0, if A ≥ 1/n,
2. D_1 V_2(k_2, K_2)|_{k_1=k_2=k_3=e} = −2ae + Ane − e(1 − 2a) = Ane − e ≥ 0, if A ≥ 1/n,
3. D_1 V_3(k_3, K_3)|_{k_1=k_2=k_3=e} = −2ae + 2Ane − e(1 − 2a) = 2Ane − e ≥ 0, if A ≥ 1/(2n),
corresponding to Corollary 2.4 in [5].

6 Conclusion

In this paper, we have described the process of change in game equilibrium during graph unification using a dynamics model. We have highlighted the significance of the productivity role that influences the agents' behavior. Moreover, we determined the importance of network sizes, which also affect the agents' decisions taken during the unification process.

We believe that this article offers a base model of game equilibrium change that can be improved by increasing the number of parameters and by modifying the graph types to incomplete networks or non-oriented graphs.

Acknowledgement. The research is supported by the Russian Foundation for Basic Research
(project 17-06-00618).

References
1. Alcácer, J., Chung, W.: Location strategies and knowledge spillovers. Manage. Sci. 53(5), 760–776 (2007)
2. Breschi, S., Lissoni, F.: Knowledge spillovers and local innovation systems: a critical survey. Ind. Corp. Change 10(4), 975–1005 (2001)
3. Chung, W., Alcácer, J.: Knowledge seeking and location choice of foreign direct investment in the United States. Manage. Sci. 48(12), 1534–1554 (2002)
4. Cooke, P.: Regional innovation systems, clusters, and the knowledge economy. Ind. Corp. Change 10(4), 945–974 (2001)
5. Jaffe, A.B., Trajtenberg, M., Henderson, R.: Geographic localization of knowledge spillovers as evidenced by patent citations. Q. J. Econ. 108(3), 577–598 (1993)
6. Katz, M.L., Shapiro, C.: Network externalities, competition, and compatibility. Am. Econ. Rev. 75(3), 424–440 (1985)
7. Matveenko, V.D., Korolev, A.V.: Network game with production and knowledge externalities. Contrib. Game Theory Manag. 8, 199–222 (2015)
8. Matveenko, V.D., Korolev, A.V.: Knowledge externalities and production in network: game equilibria, types of nodes, network formation. Int. J. Comput. Econ. Econ. 7(4), 323–358 (2017)
9. Matveenko, V., Korolev, A., Zhdanova, M.: Game equilibria and unification dynamics in networks with heterogeneous agents. Int. J. Eng. Bus. Manag. 9, 1–17 (2017)
Local Search Approaches with Different
Problem-Specific Steps for Sensor
Network Coverage Optimization

Krzysztof Trojanowski and Artur Mikitiuk(B)

Cardinal Stefan Wyszyński University, Warsaw, Poland


{k.trojanowski,a.mikitiuk}@uksw.edu.pl

Abstract. In this paper, we study relative performance of local search


methods used for the Maximum Lifetime Coverage Problem (MLCP)
solving. We consider nine algorithms obtained by swapping problem-
specific major steps between three local search algorithms we proposed
earlier: LSHMA , LSCAIA , and LSRFTA . A large set of tests carried out with
the benchmark data set SCP1 showed that the algorithm based on the
hypergraph model approach (HMA) is the most effective. The results of the other algorithms divide them into two groups: effective ones and weak ones. The findings expose the strengths and weaknesses of the
problem-specific steps applied in the local search methods.

Keywords: Maximum lifetime coverage problem · Local search ·


Perturbation operators

1 Introduction
Wireless sensor networks are subject to many research projects where an impor-
tant issue is a maximization of time when the system fulfills its tasks, namely
the network lifetime. When a set of immobile sensors with a limited battery
capacity is randomly distributed over an area to monitor a set of points of inter-
est (POI) and the number of sensors is significant, for the majority of sensors,
their monitoring ranges overlap. Moreover, not all POIs must be monitored all
the time. In many applications, it is sufficient to control at any given time 80 or
90% of POIs. This percentage of POIs is called the required level of coverage.
Thus, not all sensors must be active all the time. Turning off some of them saves
their energy and allows to extend the network lifetime.
In this paper, we study relative performance of local search strategies used
to solve the problem of the network lifetime maximization. Such an approach
consists of three major problem–specific steps: finding any possible schedule,
that is, an initial problem solution, using a perturbation procedure to obtain
its neighbor, i.e., a solution close to the original one, and refining this neighbor
solution. If the refined neighbor is better than its ancestor, it takes the place

c Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 407–416, 2020.
https://doi.org/10.1007/978-3-030-21803-4_41

of the current best–found schedule. The algorithm repeats the steps of neighbor
generation and replacement until some termination condition is satisfied.
Earlier, in [8–10] we have proposed three local search algorithms to solve
the problem in question. Each of these algorithms employs a different method
to obtain an initial solution and a different perturbation procedure to get a
neighbor solution. Moreover, each of these algorithms refines the neighbor with
the method used to get the initial solution. Due to the regular and universal
structure of the three optimization algorithms, one can easily create new ones
by swapping selected problem–specific steps between them. In this paper, we
construct a group of local search algorithms based on the three proposed earlier.
We also evaluate their performance experimentally.
The paper is organized as follows. Related work is briefly discussed in Sect. 2.
Section 3 defines the Maximum Lifetime Coverage Problem (MLCP) formally.
The local search approach is introduced in Sect. 4. Section 5 describes our exper-
iments with local search algorithms for MLCP. Our conclusions are given in
Sect. 6.

2 Related Work

The problem of maximization of the sensor network lifetime has been intensively
studied for the last two decades. There are many variants of this problem, and
various strategies have been employed to solve them. More about these strategies
can be found in, e.g., the monograph [11] on the subject. More recently, modern
heuristic techniques have also been applied to this problem. One can find, e.g.,
papers on schedule optimization based on evolutionary algorithms [1,4], local
search with problem–specific perturbation operators [8–10], simulated anneal-
ing [5,7], particle swarm optimization [13], whale algorithm [12], graph cellular
automata [6] or other problem–specific heuristics [2,3,14].

3 Maximum Lifetime Coverage Problem (MLCP)

We assume that NS immobile sensors with a limited battery capacity are ran-
domly deployed over an area to monitor NP points of interest (POI). All sensors
have the same sensing range rsens and the same fully charged batteries. We use
a discrete-time model where a sensor is either active or asleep during every time
slot. Every sensor consumes one unit of energy per time unit for its activity while
in a sleeping state the energy consumption is negligible. Battery capacity allows
the sensor to be active during Tbatt time steps (consecutive, or not). The assump-
tions mentioned above give a simplified model, of course. In real life, effective
battery capacity depends on various factors, such as temperature, and is hard
to predict. Frequent turnings on and off the battery shorten its lifetime. In this
research, we omit such problems and assume that neither the temperature nor
the sensor activity schedule influences the battery parameters. An active sensor
monitors all POIs within its sensing range. We assume that every POI can be

monitored by at least one sensor. For effective monitoring, it is sufficient to con-


trol just a percentage cov of all POIs (usually 80–90%). To avoid redundancy in
coverage of POIs, we want the percentage of POIs being monitored at any given
time not to exceed cov by more than a tolerance factor δ (usually 2–5%).
In the real world, sensors also need to communicate with each other about their findings, and turning off some sensors can affect this process. A sensor whose monitoring area overlaps the areas of its neighboring sensors may still be necessary to ensure connectivity of the communication graph. However, in the model
under consideration, communication between sensors is not a problem – energy
consumption for communication is negligible and active sensors are always able
to communicate, even if some sensors are in a sleeping state.
Summing up, we aim to provide a schedule of sensor activity giving a sufficient
level of coverage in every time step as long as possible. Problems from this class
are called the Maximum Lifetime Coverage Problems (MLCP).

4 General Scheme of the Local Search


In [8–10] three algorithms have been proposed: LSHMA , LSCAIA and LSRFTA .
Each of them is based on the same schema given in Algorithm 1. They differ
from each other in three steps: the initialization step (line 1), and two steps
being main parts of the neighborhood function (lines 3 and 4). This function
perturbs the current schedule to obtain its neighbor (line 3) and then refines the
neighbor (line 4) hoping to get a better schedule than the current one.

Algorithm 1. Local Search (for the schedule maximization context)


1: Initialize x ∈ D  Step #1: generate an initial solution x
2: repeat
3: x = modify(x)  Step #2: create a neighbor – modification of x
4: x” = refine(x )  Step #3: create a neighbor – improvement or repair of x
5: if F (x”) > F (x) then
6: x = x”  The current solution is replaced by its neighbor
7: until termination condition met
8: return x

When step #1 is over, and the initialization procedure returns a new sched-
ule, in almost every case a small set of sensors retains a little energy in their
batteries. Even if we turn them all on, they will not provide a sufficient level
of POI coverage. Therefore, no feasible slot can be created using these sensors.
Perturbation operators make use of this set.
A perturbation operator builds a neighbor schedule in two steps (lines 3 and
4). First, the operator modifies the input schedule to make the set of available
working sensors larger. In the second step, it builds slots based on these sensors.
Eventually, the new list of slots should be longer or at least as long as the list
in the input schedule.

The first step of the perturbation operator may follow two opposite strategies.
In the first one, the operator turns off selected sensors in the schedule. In [8] a
single slot is chosen randomly and removed entirely from the schedule. Thus, all
the active sensors from this slot recover one portion of energy. In [10], for each
of the slots the sensors to be off are chosen randomly with a given probability.
Simulations show that even for a minimal probability like, for example, 0.0005
the number of slots with unsatisfied coverage level is much larger than one when
the procedure is over. Therefore, this perturbation is much stronger than the
previous one because all such invalid slots are removed immediately from the
schedule.
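A rough sketch of these two turn-off variants is given below; the schedule representation (a list of slots, each a set of sensor ids), the battery dictionary and the is_valid callback are hypothetical, and the energy bookkeeping for removed slots is an assumption of this sketch.

import random

def remove_random_slot(schedule, battery):
    # Perturbation in the spirit of [8]: one randomly chosen slot is removed
    # entirely; its sensors recover one portion of energy each.
    slot = schedule.pop(random.randrange(len(schedule)))
    for s in slot:
        battery[s] += 1
    return schedule

def switch_off_with_probability(schedule, battery, p=0.0005, is_valid=lambda slot: True):
    # Perturbation in the spirit of [10]: every sensor in every slot is switched
    # off with probability p; slots whose coverage level is no longer satisfied
    # are removed immediately from the schedule.
    kept = []
    for slot in schedule:
        for s in list(slot):
            if random.random() < p:
                slot.discard(s)
                battery[s] += 1
        if is_valid(slot):
            kept.append(slot)
        else:
            for s in slot:          # sensors of a removed slot regain their energy
                battery[s] += 1
    return kept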
The second strategy [9] starts with activation of sensors from the pool of
the remaining working sensors in random slots of the schedule. For each of the
selected slots we draw a sensor from the pool randomly, but for faster improve-
ment, we activate only sensors which increase the level of coverage in the slot.
Precisely, for selected slots, we choose a sensor randomly from the pool and then
check if its coverage of POI is fully redundant with any of sensors already active
in this slot. If yes, activation of this sensor is pointless because the slot coverage
level does not change. In this case, the selected sensor goes back to the pool,
and we try the same procedure of sensor selection and activation with the next
slot. When the pool is empty, that is, all the remaining working sensors have
been activated, in the modified slots we fit the sets of active sensors to have
the coverage level just a bit above the requested threshold. Saved sensors retain
energy and participate in a new set of working ones.
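The sketch below gives a rough picture of this activation strategy; poi_set(s) (the POIs within range of sensor s) and covered_pois(slot) (the POIs monitored by a slot's active sensors) are hypothetical helpers, and the final trimming of over-covered slots is omitted.

import random

def activate_from_pool(schedule, pool, poi_set, covered_pois, max_attempts=10000):
    # Activation-based modification in the spirit of [9]: remaining working
    # sensors are inserted into randomly chosen slots, but a sensor is
    # activated only where it covers at least one POI not yet monitored there.
    pool = list(pool)
    attempts = 0
    while pool and attempts < max_attempts:
        attempts += 1
        sensor = random.choice(pool)
        slot = random.choice(schedule)
        if poi_set(sensor) - covered_pois(slot):   # not fully redundant in this slot
            slot.add(sensor)
            pool.remove(sensor)
        # otherwise the sensor stays in the pool and another slot is tried later
    return schedule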
In the second step of the perturbation operator, we assume that the new set
of working sensors is large enough to provide a satisfying level of coverage, so
we apply the initialization procedure from step #1. The procedure creates slots
one by one and decreases the energy in batteries at the same time according to
the sensor activities defined in subsequent new slots. This scheme is the same in
HMA, RFTA, and CAIA (albeit, they differ in details). Hence, the non-empty
schedule and the set of working sensors may successfully represent input data
for the initialization procedure called in the second step of the perturbation
operator.
Eventually, we get three versions of the procedure for each of the three steps.
When we swap these versions between LS algorithms, we can obtain twenty-
seven versions of LS. Let us name these versions of LS according to the origin of
the three steps. For example, the notation [HM A, HM A, HM A] represents a
Local Search algorithm, where all the three steps are like in LSHMA , that is, it
is the original, unmodified version of LSHMA . [HM A, RF T A, HM A] represents
the case where the initialization and the refine steps come from LSHMA , but the
modification – from LSRFTA .

5 Experiments
The experimental part of the research consists of experiments with new versions
of LS. For fair comparisons, all the tested versions of LS should start with the
same initial solutions. Low quality of the initial schedules creates an opportunity
to show the efficiency of the compared optimization algorithms. HMA returns
the longest schedules, which are hard to improve; therefore it is not taken
into account. From the remaining two procedures, we selected CAIA to generate
initial schedules for each of the problem instances. CAIA represents the initial-
ization step of compared LS versions in every case, and what is more important,
the main loops of the algorithms begin optimization from the same starting
points assigned to the instances. Thus, just the main loop, that is, precisely the
perturbation operator of the algorithm may vary in the subsequent versions of
LS. So, in the further text, we label the LS versions according to the construc-
tion of just the perturbation operator, and the symbol for the method used in
the initialization procedure is omitted. The full list of considered versions of LS
is as follows: [HM A, HM A], [HM A, RF T A], [HM A, CAIA], [RF T A, HM A],
[RF T A, RF T A], [RF T A, CAIA], [CAIA, HM A], [CAIA, RF T A], and [CAIA,
CAIA]. In every case, the loop has a limit of 500 iterations.
For fair evaluation of the algorithm efficiency, we should compare lengths of
obtained schedules with the optimal schedules for each of the instances. Unfor-
tunately, optimal solutions of the instances are unknown, and the complexity of
these problems makes them impossible to solve by an exhaustive search in a rea-
sonable time. Therefore, to obtain sub-optimal schedules, we did a set of experi-
ments with different versions of LS for all instances. All these versions employed
HMA as the initialization step hoping that in this way we maximize chances to
get solutions in close vicinity of the optimum. Lengths of best-obtained sched-
ules represent reference values in further evaluations of the percentage quality
of schedules.

5.1 Benchmark SCP1

For our experiments, we used a set of eight test cases SCP1 proposed earlier [8–
10]. In all cases, there are 2000 sensors with the sensing range rsens equal to one unit
(this is an abstract unit, not one of the standard units of length). In these
test cases, POIs form nodes of a rectangular or a triangular grid. The area under
consideration is a square with possible side sizes: 13, 16, 19, 22, 25, and 28 units.
The distance between POIs grows together with the side size of the area. This
gives us similar numbers of POIs in all test cases. The POI distribution should
not be regular; therefore, about 20% of the grid nodes do not
get a POI. A grid node gets a POI only if a randomly generated value from
the range [0, 1] is less than 0.8. Thus, instances of the same test case differ in
the number of POIs from 199 to 240 for the triangular grid and from 166 to
221 for the rectangular grid. Either a random generator or a Halton generator is
the source of the sensor localization coordinates. For every test case, a set of 40
instances has been generated. The reader is referred to [8–10] for a more detailed
description of the benchmark SCP1. In our experiments, we have assumed cov =
80% and δ = 5%.
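A small generator in the spirit of this description is sketched below; it covers only the rectangular-grid variant with uniformly random sensor positions (the triangular grid and the Halton sequence are left out), and the parameter names are illustrative.

import random

def generate_scp1_like_instance(side=13, grid_step=1.0, n_sensors=2000,
                                poi_prob=0.8, seed=None):
    # POIs sit on the nodes of a rectangular grid; a node gets a POI only if a
    # random value from [0, 1] is below poi_prob, so about 20% of nodes stay empty.
    rng = random.Random(seed)
    steps = int(side / grid_step)
    pois = [(i * grid_step, j * grid_step)
            for i in range(steps + 1) for j in range(steps + 1)
            if rng.random() < poi_prob]
    # Sensors are deployed uniformly at random over the square area.
    sensors = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_sensors)]
    return pois, sensors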

5.2 Overall Mean Percentage Quality


Schedule length is the primary output parameter of the experiments. However,
the optimal schedule lengths may differ for subsequent instances, so, the straight
comparisons of the schedule lengths may be misleading. Therefore, for each of
the schedules returned by LS, we calculated its percentage quality relative to
the best-known schedule lengths. This kind of normalization allows comparing
the efficiency of LS versions over different classes of problems. Table 1 shows
mean, min and max percentage qualities of the best-found schedules returned
by the LS algorithms for each of the five values of Tbatt from 10 to 30.
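The normalization itself is a one-line computation; a sketch with illustrative names is given below.

def percentage_quality(schedule_length, best_known_length):
    # Quality of a schedule relative to the best-known length for its instance.
    return 100.0 * schedule_length / best_known_length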

Table 1. Mean, min and max percentage qualities of the best-found schedules returned
by the LS algorithms for each of the five values of Tbatt from 10 to 30. Codes in
column headers: C – CAIA, H – HMA, R – RFTA, e.g., HR represents the version
[HM A, RF T A]; init – qualities of the initial schedules generated by CAIA

Tbatt Init HH HR CH HC RH CR RR CC RC
10 Mean 53.94 96.75 94.55 94.51 93.56 57.37 55.18 54.70 54.55 54.49
Min 52.18 94.21 92.26 91.51 89.59 54.72 53.36 52.87 52.79 52.80
Max 56.04 98.72 96.73 98.10 96.23 61.35 57.07 56.74 56.60 56.60
15 Mean 53.99 97.01 94.77 94.31 93.84 56.87 55.44 54.70 54.50 54.46
Min 52.35 95.01 92.56 91.26 90.72 54.41 53.49 52.98 52.83 52.79
Max 55.96 99.02 96.93 97.37 96.42 59.79 57.28 56.65 56.47 56.38
20 Mean 53.89 96.95 94.76 94.08 93.83 56.30 55.60 54.56 54.32 54.28
Min 52.37 94.60 92.70 91.01 91.15 54.16 53.92 53.07 52.83 52.73
Max 56.04 98.91 96.77 97.14 96.06 59.04 57.80 56.68 56.40 56.37
25 Mean 54.10 97.13 94.92 94.22 94.06 56.34 56.05 54.78 54.48 54.44
Min 52.37 94.98 92.76 91.13 91.08 54.18 54.14 53.01 52.75 52.73
Max 55.87 98.95 96.92 97.03 96.14 58.62 58.13 56.65 56.22 56.20
30 Mean 53.86 97.02 94.73 93.69 93.80 55.86 56.07 54.55 54.21 54.17
Min 52.29 95.19 92.77 90.91 91.31 53.68 54.29 52.93 52.58 52.55
Max 55.85 98.99 97.00 96.77 96.19 58.33 58.17 56.59 56.24 56.20

Concerning effectiveness, one can divide the LS versions into three groups:
weak ones, effective ones, and the master approach which is [HM A, HM A].
The group of effective ones consists of [HM A, RF T A], [CAIA, HM A] and
[HM A, CAIA]. The remaining LS versions belong to the group of weak ones.
Thus, all three approaches using the perturbation method from HMA result in
obtaining a relatively good schedule, no matter what algorithm is used in the
last step to repair or refine the schedule.
One could ask why [CAIA, HM A] is an effective approach while
[RF T A, HM A] is a weak one. The probable reason is that the perturbation
operator used in CAIA removes from the original schedule many more slots
than the one used in RFTA. Thus, the number of sensors available again is
higher, and they have more energy in batteries than in the case of RFTA per-
turbation operator. Having a broader set of available sensors (or even the same
set but with more energy), the efficient algorithm HMA can extend a shorter
input schedule much more than in the opposite case, when it receives on input a
longer schedule and a smaller set of available sensors obtained from the
perturbation used by RFTA.

5.3 Lengths of Schedules

Mean, min and max lengths of schedules returned by the best representatives of
the three groups are presented in Tables 2 – [HM A, HM A], 3 – [HM A, RF T A],
4 – [RF T A, HM A]. One can see that the individual results in Table 2 are often
even 5–7% better than the corresponding results in Table 3. However, in some
cases results in Table 3 are slightly (less than 1%) better than those in Table 2.
The individual results in Table 4 are always much worse than the corresponding

Table 2. Mean, min and max lengths of schedules returned by the version
[HM A, HM A] for each of the eight test cases in SCP1 and for five values of Tbatt
from 10 to 30

No Tbatt Mean Min Max No Tbatt Mean Min Max


1 10 357.02 348 366 5 10 173.20 167 180
15 537.92 527 551 15 261.68 250 271
20 718.58 703 734 20 350.55 340 361
25 899.67 880 923 25 438.10 425 453
30 1080.38 1037 1103 30 529.73 510 548
2 10 372.70 366 379 6 10 139.80 136 142
15 561.77 551 576 15 211.40 208 215
20 749.17 735 766 20 282.65 272 287
25 937.00 919 956 25 354.10 346 361
30 1127.63 1111 1150 30 425.93 419 433
3 10 245.97 242 251 7 10 101.50 97 108
15 370.52 364 376 15 154.05 149 163
20 494.77 486 502 20 206.63 198 218
25 619.40 606 630 25 258.73 249 274
30 744.48 730 755 30 311.20 299 328
4 10 162.00 150 171 8 10 90.15 89 93
15 244.75 236 257 15 135.95 134 140
20 328.52 300 346 20 182.60 177 187
25 409.60 375 432 25 227.93 222 234
30 494.65 477 518 30 274.52 269 281

Table 3. Mean, min and max lengths of schedules returned by the version
[HM A, RF T A] for each of the eight test cases in SCP1 and for five values of Tbatt
from 10 to 30

No Tbatt Mean Min Max No Tbatt Mean Min Max


1 10 336.43 325 344 5 10 171.57 167 176
15 506.68 494 524 15 259.55 250 267
20 676.67 661 691 20 347.05 340 355
25 846.23 819 870 25 436.15 425 448
30 1014.00 980 1043 30 525.33 510 539
2 10 349.50 337 360 6 10 138.15 135 141
15 524.92 502 542 15 208.55 206 213
20 700.23 667 718 20 279.25 274 283
25 874.40 852 901 25 349.90 345 355
30 1049.83 1023 1091 30 420.27 414 428
3 10 234.32 226 239 7 10 101.85 99 108
15 353.05 341 360 15 154.95 149 163
20 472.00 464 480 20 207.75 199 219
25 590.20 578 598 25 261.40 249 275
30 708.75 689 722 30 317.43 299 330
4 10 161.82 150 169 8 10 90.58 89 93
15 244.50 225 254 15 136.43 134 140
20 329.85 319 343 20 182.70 179 187
25 414.35 388 432 25 228.57 224 234
30 499.55 450 519 30 275.13 269 281

results from the previous two tables. Thus, comparison of absolute results for
individual test cases confirms observations from Sect. 5.2 concerning overall mean
relative quality of the schedules generated by particular approaches.

6 Conclusions

In this paper, we study the relative performance of local search algorithms used to
solve MLCP. The local search method has two problem–specific steps: generation
of the initial solution and perturbation of a solution applied for generation of its
neighbor. In our case, the perturbation step consists of two substeps: obtaining a
neighbor schedule, and refining or repairing this neighbor schedule. Eventually,
the three problem–specific steps are necessary to adapt the general scheme of
local search to the Maximum Lifetime Coverage Problem space.
The starting point of our research was a set of three LS algorithms we have proposed
earlier: LSHMA , LSCAIA , and LSRFTA . Each of them contained its own versions of
the three problem–specific steps. We swapped the steps between the algorithms

Table 4. Mean, min and max lengths of schedules returned by the version
[RF T A, HM A] for each of the eight test cases in SCP1 and for five values of Tbatt
from 10 to 30

No Tbatt Mean Min Max No Tbatt Mean Min Max


1 10 211.03 204 219 5 10 101.20 95 105
15 315.98 307 324 15 150.80 144 156
20 419.77 406 430 20 200.00 193 209
25 524.67 510 542 25 249.43 242 261
30 629.45 608 647 30 298.27 287 311
2 10 218.95 212 226 6 10 81.88 78 86
15 326.55 318 336 15 121.17 115 126
20 435.45 427 444 20 160.60 157 166
25 543.80 532 555 25 199.85 196 207
30 651.25 640 668 30 239.28 230 247
3 10 143.97 139 148 7 10 61.90 57 66
15 214.38 209 221 15 92.10 87 99
20 285.13 278 291 20 121.50 115 129
25 356.18 347 363 25 151.57 143 157
30 426.15 416 437 30 181.40 172 190
4 10 95.53 89 108 8 10 54.50 52 58
15 141.82 136 151 15 81.38 78 86
20 187.53 178 199 20 107.03 103 111
25 233.82 220 245 25 133.40 130 137
30 278.57 266 292 30 159.22 154 166

and this way we obtained new versions of the local search algorithms. In the set
of experiments, we compared the efficiency of these new versions.
In our experiments, we generated an initial schedule using CAIA, and we tried
perturbation methods and refinement/repair methods from all three approaches.
We used benchmark data set SCP1 which we proposed in our previous papers.
Our experiments have shown that the best pair of perturbation and refine-
ment/repair methods is the one used in LSHMA , i.e., [HM A, HM A]. Approaches
[HM A, RF T A], [CAIA, HM A], and [HM A, CAIA] are also effective while the
remaining combinations give much worse results.

References
1. Gil, J.M., Han, Y.H.: A target coverage scheduling scheme based on genetic algo-
rithms in directional sensor networks. Sensors (Basel, Switzerland) 11(2), 1888–
1906 (2011). https://doi.org/10.3390/s110201888

2. Keskin, M.E., Altinel, I.K., Aras, N., Ersoy, C.: Wireless sensor network lifetime
maximization by optimal sensor deployment, activity scheduling, data routing and
sink mobility. Ad Hoc Netw. 17, 18–36 (2014). https://doi.org/10.1016/j.adhoc.
2014.01.003
3. Roselin, J., Latha, P., Benitta, S.: Maximizing the wireless sensor networks life-
time through energy efficient connected coverage. Ad Hoc Netw. 62, 1–10 (2017).
https://doi.org/10.1016/j.adhoc.2017.04.001
4. Tretyakova, A., Seredynski, F.: Application of evolutionary algorithms to maximum
lifetime coverage problem in wireless sensor networks. In: IPDPS Workshops, pp.
445–453. IEEE (2013). https://doi.org/10.1109/IPDPSW.2013.96
5. Tretyakova, A., Seredynski, F.: Simulated annealing application to maximum life-
time coverage problem in wireless sensor networks. In: Global Conference on Arti-
ficial Intelligence, GCAI, vol. 36, pp. 296–311. EasyChair (2015)
6. Tretyakova, A., Seredynski, F., Bouvry, P.: Graph cellular automata approach to
the maximum lifetime coverage problem in wireless sensor networks. Simulation
92(2), 153–164 (2016). https://doi.org/10.1177/0037549715612579
7. Tretyakova, A., Seredynski, F., Guinand, F.: Heuristic and meta-heuristic
approaches for energy-efficient coverage-preserving protocols in wireless sensor net-
works. In: Proceedings of the 13th ACM Symposium on QoS and Security for Wire-
less and Mobile Networks, Q2SWinet’17, pp. 51–58. ACM (2017). https://doi.org/
10.1145/3132114.3132119
8. Trojanowski, K., Mikitiuk, A., Guinand, F., Wypych, M.: Heuristic optimization
of a sensor network lifetime under coverage constraint. In: Computational Col-
lective Intelligence: 9th International Conference, ICCCI 2017, Nicosia, Cyprus,
27–29 Sept 2017, Proceedings, Part I, LNCS, vol. 10448, pp. 422–432. Springer
International Publishing (2017). https://doi.org/10.1007/978-3-319-67074-4_41
9. Trojanowski, K., Mikitiuk, A., Kowalczyk, M.: Sensor network coverage problem: a
hypergraph model approach. In: Computational Collective Intelligence: 9th Inter-
national Conference, ICCCI 2017, Nicosia, Cyprus, 27–29 Sept 2017, Proceedings,
Part I, LNCS, vol. 10448, pp. 411–421. Springer International Publishing (2017).
https://doi.org/10.1007/978-3-319-67074-4_40
10. Trojanowski, K., Mikitiuk, A., Napiorkowski, K.J.M.: Application of local search
with perturbation inspired by cellular automata for heuristic optimization of sen-
sor network coverage problem. In: Parallel Processing and Applied Mathematics,
LNCS, vol. 10778, pp. 425–435. Springer International Publishing (2018). https://
doi.org/10.1007/978-3-319-78054-2_40
11. Wang, B.: Coverage Control in Sensor Networks. Computer Communications and
Networks. Springer (2010). https://doi.org/10.1007/978-1-84800-328-6
12. Wang, L., Wu, W., Qi, J., Jia, Z.: Wireless sensor network coverage optimization
based on whale group algorithm. Comput. Sci. Inf. Syst. 15(3), 569–583 (2018).
https://doi.org/10.2298/CSIS180103023W
13. Yile, W.U., Qing, H.E., Tongwei, X.U.: Application of improved adaptive parti-
cle swarm optimization algorithm in WSN coverage optimization. Chin. J. Sens.
Actuators (2016)
14. Zorbas, D., Glynos, D., Kotzanikolaou, P., Douligeris, C.: BGOP: an adaptive cov-
erage algorithm for wireless sensor networks. In: Proceedings of the 13th European
Wireless Conference, EW07 (2007)
Modelling Dynamic Programming-Based
Global Constraints in Constraint
Programming

Andrea Visentin1(B), Steven D. Prestwich1, Roberto Rossi2, and Armagan Tarim3

1 Insight Centre for Data Analytics, University College Cork, Cork, Ireland
andrea.visentin@insight-centre.org, s.prestwich@cs.ucc.ie
2 University of Edinburgh Business School, Edinburgh, UK
Roberto.Rossi@ed.ac.uk
3 Cork University Business School, University College Cork, Cork, Ireland
armagan.tarim@ucc.ie

Abstract. Dynamic Programming (DP) can solve many complex prob-
lems in polynomial or pseudo-polynomial time, and it is widely used in
Constraint Programming (CP) to implement powerful global constraints.
Implementing such constraints is a nontrivial task beyond the capability
of most CP users, who must rely on their CP solver to provide an appro-
priate global constraint library. This also limits the usefulness of generic
CP languages, some or all of whose solvers might not provide the required
constraints. A technique was recently introduced for directly modelling
DP in CP, which provides a way around this problem. However, no com-
parison of the technique with other approaches was made, and it was
missing a clear formalisation. In this paper we formalise the approach
and compare it with existing techniques on MiniZinc benchmark prob-
lems, including the flow formulation of DP in Integer Programming. We
further show how it can be improved by state reduction methods.

Keywords: Constraint programming · Dynamic programming · MIP · Encoding

1 Introduction

Constraint Programming (CP) is one of the most active fields in Artificial Intel-
ligence (AI). Designed to solve optimisation and decision problems, it provides
expressive modelling languages, development tools and global constraints. An
overview of the current status of CP and its challenges can be found in [9].
The Dynamic Programming (DP) approach builds an optimal solution by
breaking the problem down into subproblems and solving each to optimality in
a recursive manner, achieving great efficiency by solving each subproblem once
only.

There are several interesting connections between CP and DP:

– DP has been used to implement several efficient global constraints within CP
systems, for example [10,17]. A tag for DP approaches in CP is available in
the global constraint catalogue [1].
– [8] used CP to model a DP relaxed version of the TSP after a state space
reduction.
– The DP feature of solving each subproblem once only has been emulated in
CP [5], in Constraint Logic Programming languages including Picat [19], by
remembering the results of subtree searches via the technique of memoization
(or tabling) which remembers results so that they need not be recomputed.
This can improve search performance by several orders of magnitude.
– DP approaches are widely used in binary decision diagrams and multi-value
decision diagrams [3].

Until recently there was no standard procedure to encode a DP model into CP.
If part of a problem required a DP-based constraint that is not provided by the
solver being used, the modeller was forced either to write the global constraint
manually, or to change solver. This restricts the usefulness of DP in CP.
However, a new connection between CP and DP was recently defined. [16]
introduced a technique that allows DP to be seamlessly integrated within CP:
given a DP model, states are mapped to CP variables while seed values and
recurrence equations are mapped to constraints. The resulting model is called a
dynamic program encoding (DPE). Using a DPE, a DP model can be solved by
pure constraint propagation without search. DPEs can form part of a larger CP
model, and provide a general way for CP users to implement DP-based global
constraints. In this paper we explore DPEs further:

– We provide a formalization of the DPE that allows a one-to-one correspon-
dence with a generic DP approach.
– We compare the DPE with a widely known variable redefinition technique
for modelling DP in Mixed Integer Programming (MIP) [13], and show its
superior performance.
– We show that the performance of a DPE can be further improved by the
application of state reduction techniques.
– We show that is possible to utilize it to model some DP-based constraints in
MiniZinc.

The paper is organised as follows. Section 2 formalises the DPE technique to
allow a one-to-one mapping of the DP approaches. Section 3 applies DPE to the
shortest path problem in MiniZinc. We study its application to the knapsack
problem and show how it can strongly improve the way we represent DP in CP,
and how to use state reduction techniques to improve performance. Section 4
concludes the paper and discusses when this technique should be used.

2 Method
In this section we formalize the DPE. As mentioned above, it models every DP
state with a CP variable, and the seed values and recurrence relations with
constraints. [16] introduces the technique informally, and here we give a more
formal description based on the definition of DP given in [4].
Many problems can be solved with a DP approach that can be modelled as a
shortest path on a DAG, for example the knapsack problem [11] or the lot sizing
problem [7]. We decided to directly use the shortest path problem, which was
already used as a benchmark for the MiniZinc challenge [18]. One of the most
famous DP-like algorithms is used to solve this problem: Dijkstra’s algorithm.
We will use Fig. 1 to help the visualization of the problem.

Fig. 1. Graph relative to a generic shortest path problem.

Most DPs can be described by their three most important characteristics:
stages, states and recursive optimization. The fundamental feature of the DP
approach is the structuring of optimization problems in stages, which are solved
sequentially. The solution of each stage helps to define the characteristics of the
next stage problem. In Fig. 1 the stages are represented in grey. In the DPE the
stages are simply represented as groups of states, and it is important that each
stage depends only on the next stage's states. Since the graph is acyclic, we
can divide the stages into groups of nodes that cannot access the nodes of the
previous stages, and cannot be accessed by the nodes of the next stages.
To each stage of the problem are associated one or more states. These contain
enough information to make future decisions, without regard to how the process
reached the current state. In the DPE the states are represented by CP variables
and they contain the optimal value for the subproblem represented by that state.
In the graphical representation they are the nodes of the graph. In Dijkstra’s
algorithm these variables contain the length of the shortest path from that node
to the sink. We can identify two particular types of states: the initial state, which
contains the optimal solution for the whole problem and on whose value no other state's
solution is based; and the base cases or final states, which are the solutions of the
smallest problems. The solutions for these problems do not depend on any other
state. In Fig. 1 they are represented by the source and the sink of the graph.
The last general characteristic of the DP approach is the recursive optimiza-
tion procedure. The goal of this procedure is to build the overall solution by
solving one stage at time and linking the optimal solution to each state to other
states of subsequent stages optimal solution. This procedure is generally based
on a backward induction process. The procedure has several components, which
can generally be modelled as a functional equation (or Bellman equation [2]) and a
recursion order.
In the DPE the functional equation is not a single equation, but is applied to
every state via a constraint. This constraint contains an equality that binds the
optimal value obtainable at that state to the values of the next stage's states involved.
For every state we have a set of feasible decisions that can lead to a state of
the next stage, which in the graph are represented by the edges leaving the
associated node: if used in the shortest path it means that decision is taken. In
the constraint is included also the immediate cost associated with each decision,
which is the value that is added to or subtracted from the next stage state
variables. In Fig. 1 these costs are represented by the weights of the involved
edges. In the shortest path problem, the constraint applied to each (non-sink)
state assigns to the node’s CP variable the minimum of the reachable with one
edge node’s CP variables, plus the edge cost.
The important difference between the encodings is the order in which the
states are explored and resolved. In DP they are ordered in such a way that each
state is evaluated only when all the subsequent stages are solved, while in the
encodings the ordering is delegated to the solvers. In the MIP flow formulation it
is completely replaced by a search on the binary variables, while in the DPE it is
done by constraint propagation, which depends on the CP solver implementation
of the propagators. This approach is more robust than search, which in the worst
case can spend a significant amount of time exploring a search subtree. The optimality of
the solution is guaranteed by the correctness of the DP formulation.

3 Computational Results

We aim to make all our results replicable by other researchers. Our code is avail-
able online at: https://github.com/andvise/dpincp. We also decided to use only
open source libraries. In the first experiment we used MiniZincIDE 2.1.7, while
the second part is coded in Java 10. We used 3 CP solvers: Gecode 6.0.1, Google
OR-Tools 6.7.2 and Choco Solver 4.0.8. We used as MIP solvers: COIN-OR
branch-and-cut solver, IBM ILOG CPLEX 12.8 and Gurobi 8.0.1. All exper-
iments are executed on an Ubuntu system with an Intel i7-3610QM, 8 GB of
RAM and 15 GB of swap memory.

3.1 Shortest Path in MiniZinc

MiniZinc is a standard modelling language for CP. It provides a set of standard
constraints, with ways of decomposing all of them to be solved by a wide vari-
ety of solvers. To achieve standardization the MiniZinc constraints catalogue is
very limited. Only constraints that are available in all its solvers, or that can be
decomposed into simpler constraints, are included. The decomposition is gen-
erally done in a naive way, causing poor performance. This is true of all the
DP-based constraints in particular.
In this section we focus on applications of the DPE in MiniZinc. We aim
to apply this new technique to the shortest path problem and solve it with the
DPE of the Dijkstra algorithm we used as an example in the method section.
The shortest path was one of the benchmarks of the MiniZinc challenge [18]. The
current reduction is based on a flow formulation on the nodes of the graph, which
regulates the flow over each node and requires a binary variable for each edge
indicating whether an edge is used or not. This is the same encoding proposed
by [13].
Our implementation is based on Dijkstra’s algorithm: every decision variable
contains the shortest distance to the sink node. The formulation is shorter and
more intuitive than the previous one.
We compared the methods on the 10 available benchmark instances. We
used the MiniZincIDE and Gecode as solver, with 20 min as a time limit. Table 1
shows the results of the computations. When the flow formulation finds a good
or optimal solution quickly, the DPE is approximately twice as fast. However,
the flow formulation requires search that can take exponential time, and it is
unable to find a solution before timeout occurs. The most interesting result is
that, by using only constraint propagation, DPE performance is robust and only
marginally affected by the structure of the instances. In some cases, for example
instance 7, the flow formulation finds an optimal solution but takes a long time
to prove optimality, in which case the DPE is more than 4 orders of magnitude
faster.

Table 1. Time required to complete the computation of the 10 benchmark instances
in Gecode. ‘-’ represents a timeout.

CP solver          0      1      2      3       4      5      6      7        8         9
Dijkstra           23 ms  19 ms  18 ms  17 ms   24 ms  20 ms  25 ms  23 ms    20 ms     29 ms
Flow formulation   -      50 ms  60 ms  571 ms  46 ms  -      47 ms  1 182 s  4 504 ms  -

The DPE requires a smaller number of variables, since it requires only one
for each node. On the contrary, the flow formulation requires a variable for
each edge. This is without taking into account the number of additional variables
created during the decomposition.

The DPE cannot rival a state-of-the-art shortest path solver in terms of
performance, except in the case of parameterised shortest path problems, in which the
costs of the edges are influenced by other constraints. However, the DPE allows
a more flexible model than a specific global constraint and a more efficient model
in MiniZinc.
We repeated the above experiment using a MIP solver instead of CP. Table 2
contains the results of the 10 instances solved using COIN-OR branch-and-cut
solver. Interestingly, the situation is inverted: the flow formulation performs
efficiently while the DPE fails to find an optimal solution in many cases. This
is due to the high number of auxiliary discrete variables needed by the MIP
decomposition of the min constraint. Because of this the DPE loses one of its
main strengths: DP computation by pure constraint propagation. Moreover the
MIP can take advantage of the unimodularity of the matrix, as mentioned before.
We therefore recommend the usual flow-based formulation for MIP and the DPE
for CP.

Table 2. Time required to complete the computation of the 10 benchmark instances
in CBC COIN. ‘-’ represents a timeout.

MIP solver         0      1      2      3      4      5          6      7      8       9
Dijkstra           375 s  64 ms  -      -      -      20 667 ms  61 ms  -      138 ms  303 ms
Flow formulation   31 ms  39 ms  34 ms  40 ms  46 ms  35 ms      36 ms  40 ms  37 ms   53 ms

3.2 Knapsack Problem


We now apply the DPE to the knapsack problem [11] because it is a widely
known NP-hard problem, it has numerous extensions and applications, there is
a reduction in MiniZinc for this constraint, and it can be modelled with the
technique proposed by [13]. We consider the most common version in which
every item can be packed at most once, also known as the 0–1 knapsack problem
[15]. Research on this problem is particularly active [12] with many applications
and different approaches.
The problem consists of a set of items I with volumes v and profits p. The
items can be packed in a knapsack of capacity C. The objective is to maximize the
total profit of the packed items without exceeding the capacity of the knapsack.
The binary variables x represent the packing scheme, xi is equal to 1 if the item
is packed, 0 otherwise. The model is:
max x · p (1a)
s.t. x·v ≤C (1b)
x ∈ {0, 1} (1c)
This model can be directly implemented in CP or in MIP. To solve the binary
knapsack problem we use the well-known DP-like algorithm described in [11]; we refer
to the source for the full description of the algorithm. Following the structure of
a DP approach given in the previous section, J[i, j] with i ∈ I and j ∈ [0, C] are the
states of our DP. Each J[i, j] contains the optimal profit of packing the subset
of items Ii = (i, . . . , n) in a knapsack of volume j.
The formulation can be represented by a rooted DAG, in this case a tree with
node J[1, C] as root and nodes J[n, j], j ∈ [0, C] as leaves. For every internal
node J[i, j] the leaving arcs represent the action of packing the item i, and their
weight is the profit obtained by packing the i-th item. A path from the root to
a leaf is equivalent to a feasible packing, and the longest path of this graph is
the optimal solution. If we encode this model using a DPE, creating all the CP
variables representing the nodes of the graph, then it is solved by pure constraint
propagation with no backtracking.
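The recurrence behind these states is the classical one; the plain-Python sketch below simply evaluates it (the DPE instead creates one CP variable per entry J[i, j] and turns each maximum into a constraint).

def knapsack_dp(volumes, profits, capacity):
    # J[i][j] = best profit obtainable by packing items i..n-1 into residual volume j.
    n = len(volumes)
    J = [[0] * (capacity + 1) for _ in range(n + 1)]   # row n: no items left, profit 0
    for i in range(n - 1, -1, -1):
        for j in range(capacity + 1):
            skip = J[i + 1][j]                          # leave item i out
            take = 0
            if volumes[i] <= j:                         # pack item i if it still fits
                take = J[i + 1][j - volumes[i]] + profits[i]
            J[i][j] = max(skip, take)
    return J[0][capacity]                               # root state J[1, C] in the paper's notation

print(knapsack_dp([3, 4, 2], [30, 50, 15], capacity=6))  # prints 65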
We use this problem to show the potential for speeding up computational
times. With the DPE implementation we can use simple and well known tech-
niques to reduce the state space without compromising the optimality. For exam-
ple, if at state J[i, j] volume j is large enough to contain all items from i to n
(all the items I might pack in the next stages) then we know that the optimal
solution of J[i, j] will contain all of them, as their profit is a positive number.
This pruning can be made more effective by sorting the items in decreasing order
of size, so the pruning will occur closer to the root and further reduce the size
of the search space.
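Assuming the same J[i, j] indexing as above, the reduction test and the preprocessing sort can be written as follows (again an illustrative sketch, not the authors' code).

def reducible_state_value(i, j, volumes, profits):
    # If the residual volume j fits every remaining item i..n-1, the optimal value
    # of J[i][j] is simply the sum of their (positive) profits, so the state can be
    # fixed to a constant instead of being modelled by a CP variable.
    if sum(volumes[i:]) <= j:
        return sum(profits[i:])
    return None  # the state still has to be modelled / explored

def sort_items_by_decreasing_volume(volumes, profits):
    # Sorting items by decreasing volume makes the test above succeed closer to
    # the root of the state graph, pruning a larger part of it.
    order = sorted(range(len(volumes)), key=lambda k: volumes[k], reverse=True)
    return [volumes[k] for k in order], [profits[k] for k in order]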
We test the DPE on different types of instances of increasing item set sizes,
and compare its performance with several other decompositions of the constraint:
– A CP model that uses the simple scalar product of the model (1a)–(1b)
(Naive CP). The MiniZinc encoding of the knapsack constraint uses the
same structure.
– The knapsack global constraint available in Choco (Global constraint).
This constraint is implemented with scalar products. The propagator uses
Dantzig-Wolfe relaxation [6].
– A CP formulation of the encoding proposed in this paper (DPE) solved using
Google OR.
– A DPE with state space reduction technique introduced before (DPE + sr).
– A DPE with the state space reduction technique and the items sorted (DPE +
sr + sorting).
– The MIP flow formulation proposed by [13] (Flow model Solver) which we
tested with 3 different MIP solvers: COIN CBC, CPLEX and Gurobi.
To make the plots readable we decided to show only these solutions, but others
are available in our code.
As a benchmark we decided to use Pisinger's instances described in [14]. We did not
use the code available online because it does not allow setting a seed to make the
experiments replicable. Four different types of instances are defined, in order of decreasing
correlation between item weight and profit: subsetsum, strongly corre-
lated, weakly correlated and uncorrelated. Due to space limitations we refer the
reader to the original paper for the details of the instances. In our experi-
ments we tested all the types, and we kept the same configuration as the first
set of experiments of [14]. We increased the size of the instances until none of the DP encod-
ings could find the optimal solution before the time limit. A time limit
of 10 min was imposed on the MIP and CP solvers, including variable creation
overhead.

Fig. 2. Average computational time for: (a) subsetsum instances; and, (b) uncorrelated
instances.

Figure 2 shows the computational time in relation to the instance size. Due
to space limitations we had to limit the number of plots. We can see that DPE
clearly outperforms the naive formulation in CP or the previous encoding (flow
formulation) solved with an open source solver, CBC. Normal DPE solved with
an open source solver is computationally comparable to the flow formulation
implemented in CPLEX, and outperforms the one solved by Gurobi in instances
where the correlation between weight and profit of the items is lower, even if
the commercial MIP solvers use parallel computations. The DPE outperforms
the variable redefinition technique in MIP, because of the absence of search. It
is also clearly better than a simple CP model of the problem definition, which
is the same model used for the MiniZinc constraint. The Choco constraint with
ad-hoc propagator outperforms the DPE in most of the cases, confirming that
a global constraint is faster than a DPE. A particular situation is the test on
the strongly correlated instances: in this case the global constraint fails to find
the optimal solution in many test instances even with a small number of items;
probably the particular structure of the problem makes the search get stuck
in some non-optimal branches.
It is interesting to note the speed up from the space reduction technique. The
basic DPE can solve instances up to 200 items but it has a memory problem:
the state space grows so rapidly that heavy usage of swap memory is
needed. However, this effect is less marked when a state reduction technique is
applied. This effect is stronger when the correlation between item profits and
volumes is stronger. The reduction technique improves considerably when we
increase the number of items needed to fill the bin, since the pruning occurs
earlier in the search tree: see Fig. 3.

Fig. 3. Computational time in the subsetsum instances with volume per item reduced

In the case that the constraint has to be called multiple times during the
solving of a bigger model, the DPE can outperform the pure constraint since
the overhead to create all the variables is not repeated. This experiment demon-
strates the potential of the DPE with state space reduction: even with a simple
and intuitive reduction technique we can solve instances 10 times bigger than
with a simple CP model. We can see that the behaviour of DPE is stable regard-
less of the type of the instance; on the contrary, the performance of the space
reduction technique strongly depends on the instance type and the volume of
the knapsack.
Of course we cannot outperform a pure DP implementation, even if our
solution involves a similar number of operations. This is mainly due to the time
and space overhead of creating CP variables. In fact the DPE requires more time
to create the CP variables than to propagate the constraints.

4 Conclusions

In this paper we have analysed a recently proposed technique for mapping DP
into CP, called the dynamic program encoding (DPE), which takes advantage of
the optimality substructure of the DP. With a DPE the DP execution is achieved
by pure constraint propagation (without search or backtracking).
We provided a standard way to model a DP into DPE. We have demonstrated
the potential of the DPE in constraint modelling in several ways: we compared it
with another DP-encoding technique using CP and MIP solvers; we showed how
to use state reduction techniques to improve its performance; we showed that
it outperforms standard DP encoding techniques in the literature, and greatly
outperforms non-DP-based CP approaches to the knapsack problem; and we
applied the DPE to MiniZinc benchmarks, showing how its performance is faster
and more robust than existing CP techniques. We also showed a negative result:
the DPE is unsuitable for use in MIP, where standard methods are much better.
To recap the potential applications of the DPE, it can be used when: a DP-
based constraint is needed but other constraints can also affect states inside the
DP; when the respective DP global constraint is not implemented in the specific
solver; and when DP approaches are needed in MiniZinc as a starting approach to
decompose more complex problems into simpler instructions.

Acknowledgments. This publication has emanated from research supported in part


by a research grant from Science Foundation Ireland (SFI) under Grant Number
SFI/12/RC/2289 which is co-funded under the European Regional Development Fund.

References
1. Beldiceanu, N., Carlsson, M., Rampon, J.X.: Global constraint catalog, (revision
a) (2012)
2. Bellman, R.: The theory of dynamic programming. Technical report, RAND Corp
Santa Monica CA (1954)
3. Bergman, D., Cire, A.A., van Hoeve, W.J., Hooker, J.N.: Discrete optimization
with decision diagrams. INFORMS J. Comput. 28(1), 47–66 (2016)
4. Bradley, S.P., Hax, A.C., Magnanti, T.L.: Applied Mathematical Programming.
Addison Wesley (1977)
5. Chu, G., Stuckey, P.J.: Minimizing the maximum number of open stacks by cus-
tomer search. In: International Conference on Principles and Practice of Constraint
Programming, pp. 242–257. Springer (2009)
6. Dantzig, G.B., Wolfe, P.: Decomposition principle for linear programs. Oper. Res.
8(1), 101–111 (1960)
7. Eppen, G.D., Martin, R.K.: Solving multi-item capacitated lot-sizing problems
using variable redefinition. Oper. Res. 35(6), 832–848 (1987)
8. Focacci, F., Milano, M.: Connections and integrations of dynamic programming
and constraint programming. In: CPAIOR 2001 (2001)
9. Freuder, E.C.: Progress towards the holy grail. Constraints 23(2), 158–171 (2018)
10. Malitsky, Y., Sellmann, M., van Hoeve, W.J.: Length-lex bounds consistency for
knapsack constraints. In: International Conference on Principles and Practice of
Constraint Programming, pp. 266–281. Springer (2008)
11. Martello, S.: Knapsack Problems: Algorithms and Computer Implementations.
Wiley-Interscience Series in Discrete Mathematics and Optimization (1990)
12. Martello, S., Pisinger, D., Toth, P.: New trends in exact algorithms for the 0–1
knapsack problem. Eur. J. Oper. Res. 123(2), 325–332 (2000)
13. Martin, R.K.: Generating alternative mixed-integer programming models using
variable redefinition. Oper. Res. 35(6), 820–831 (1987)
14. Pisinger, D.: A minimal algorithm for the 0–1 knapsack problem. Oper. Res. 45(5),
758–767 (1997)

15. Plateau, G., Nagih, A.: 0–1 knapsack problems. In: Paradigms of Combinatorial
Optimization: Problems and New Approaches, vol. 2, pp. 215–242 (2013)
16. Prestwich, S.D., Rossi, R., Tarim, S.A., Visentin, A.: Towards a closer integration
of dynamic programming and constraint programming. In: 4th Global Conference
on Artificial Intelligence (2018)
17. Quimper, C.G., Walsh, T.: Global grammar constraints. In: International Confer-
ence on Principles and Practice of Constraint Programming, pp. 751–755. Springer
(2006)
18. Stuckey, P.J., Feydy, T., Schutt, A., Tack, G., Fischer, J.: The minizinc challenge
2008–2013. AI Mag. 35(2), 55–60 (2014)
19. Zhou, N.F., Kjellerstrand, H., Fruhman, J.: Constraint Solving and Planning with
Picat. Springer (2015)
Modified Extended Cutting Plane
Algorithm for Mixed Integer Nonlinear
Programming

Wendel Melo1(B), Marcia Fampa2, and Fernanda Raupp3

1 College of Computer Science, Federal University of Uberlandia, Uberlândia, Brazil
wendelmelo@ufu.br
2 Institute of Mathematics and COPPE, Federal University of Rio de Janeiro,
Rio de Janeiro, Brazil
fampa@cos.ufrj.br
3 National Laboratory for Scientific Computing (LNCC) of the Ministry of Science,
Technology and Innovation, Petrópolis, Brazil
fernanda@lncc.br

Abstract. In this work, we propose a modification of the Extended
Cutting Plane algorithm (ECP) that solves convex mixed integer non-
linear programming problems. Our approach, called Modified Extended
Cutting Plane (MECP), is inspired on the strategy of updating the set of
linearization points in the Outer Approximation algorithm (OA). Com-
putational results over a set of 343 test instances show the effectiveness of
the proposed method MECP, which outperforms ECP and is competitive
to OA.

Keywords: Mixed integer nonlinear programming · Extended cutting plane · Outer approximation

1 Introduction

In this work, we address the following convex Mixed Integer Nonlinear Program-
ming (MINLP) problem:

(P ) min f (x, y)
x, y
s. t. g(x, y) ≤ 0, (1)
x ∈ X, y ∈ Y ∩ Zny ,

where X and Y are polyhedral subsets of Rnx and Rny , respectively, Y is
bounded, f : Rnx +ny → R and g : Rnx +ny → Rm are convex and continuously
differentiable functions. We assume that problem (P ) has an optimal solution.
The difficulty involved in the solution of problem (P ), as well as its appli-
cability in diverse situations, justify the search for efficient algorithms for its
resolution. In this context, several approaches have been proposed to solve
MINLP problems. (The interested reader can find a good bibliographic review
in [2,5,8,13].) Among these different approaches, the algorithms which belong
to the class of linear approximation deserve special emphasis. This class of algo-
rithms solve convex MINLP problems approximating them by a sequence of
Mixed Integer Linear Programming (MILP) problems, whose solutions provide
lower bounds (in the minimization case) for the original problem addressed. Such
approximations are obtained through first order derivatives and are based on the
convexity of the functions in (P ). A good characteristic of these algorithms is
that they take advantage of the maturity achieved in the MILP area, with the
use of sophisticated computational packages.
A well known algorithm from the literature in the class of linear approx-
imation is the Outer Approximation (OA) algorithm [6,7]. At each iteration,
OA solves a MILP problem and one or two continuous Nonlinear Programming
(NLP) problems. Such scheme was developed to guarantee the convergence of OA
in a finite number of iterations and has shown to be efficient in several practical
situations. Another well known algorithm in the class of linear approximation is
the Extended Cutting Plane (ECP) algorithm [14]. The main difference between
OA and ECP is that ECP does not solve any NLP problem during its execution,
restricting itself to the solution of MILP problems. Although, at first, this seems
to be an advantage of ECP, this strategy leads to no guaranteed convergence of
the algorithm in a finite number of iterations. We have observed that, in many
cases, the ECP algorithm demands a bigger number of iterations than OA to
converge to the optimal solution of problem (P ), and thus, ECP requires more
computational effort than OA. Nevertheless, as the ECP algorithm does not need
to solve NLP problems, it has the advantage of not requiring the computation
of second order derivatives, or any approximation of them.
In this work, we propose an algorithm based on ECP. Our main contribu-
tion consists in a small modification in the ECP algorithm, making it more
similar to OA, hoping to reduce the number of necessary iterations to con-
verge, but keeping the friendly characteristic of ECP, of not solving any kind of
NLP problem, and so avoiding the computation of second order derivatives (or
approximations of them). Despite modest, our contribution, which we call Mod-
ified Extended Cutting Plane (MECP) algorithm, has shown promising results
for a set of 343 MINLP test instances, indicating that the proposed method is
competitive with OA.
This paper is organized as follows: Sect. 2 discusses about the ECP algo-
rithm, while algorithm OA is presented in Sect. 3. Our MECP approach is then
introduced in Sect. 4. Finally, Sect. 5 presents computational results comparing
MECP to OA and ECP, pointing some conclusions about the work developed.

2 The Extended Cutting Plane Algorithm


The Extended Cutting Plane algorithm was proposed in [14] and is based on the
approximation of (P ) by a Mixed Integer Linear Programming (MILP) problem,
known as the master problem. To facilitate the understanding of the master
problem, we note that (P ) can be reformulated as a problem with linear objective
function with the use of an additional auxiliary variable α:

(P̄ ) min α
α,x,y
s. t. f (x, y) ≤ α (2)
g(x, y) ≤ 0
x ∈ X, y ∈ Y ∩ Zny .

Once the constraints of (P̄ ) are convex, when we linearize them by Taylor
series about any given point (x̄, ȳ) ∈ X × Y , we obtain the following valid
inequalities for (P̄ ):
 
∇f(x̄, ȳ)^T (x − x̄; y − ȳ) + f(x̄, ȳ) ≤ α        (3)

∇g(x̄, ȳ)^T (x − x̄; y − ȳ) + g(x̄, ȳ) ≤ 0        (4)

Therefore, we generate the master problem from a set L = {(x^1, y^1), . . . ,
(x^k, y^k)} with k linearization points, which is a relaxation of the original problem
(P):

(M^L)  min_{α,x,y}  α
       s. t.  ∇f(x^j, y^j)^T (x − x^j; y − y^j) + f(x^j, y^j) ≤ α,   ∀(x^j, y^j) ∈ L
              ∇g(x^j, y^j)^T (x − x^j; y − y^j) + g(x^j, y^j) ≤ 0,   ∀(x^j, y^j) ∈ L       (5)
              x ∈ X, y ∈ Y ∩ Z^ny.
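For concreteness, the coefficients of the cuts added for a new linearization point can be evaluated as below (a plain NumPy sketch, independent of any particular MILP interface; the oracles returning f, ∇f, g and the Jacobian of g at the point are assumed to be supplied by the user).

import numpy as np

def linearization_cuts(z_bar, f_val, grad_f, g_val, jac_g):
    # Cut (3) at z_bar = (x_bar, y_bar):  grad_f . z - alpha <= grad_f . z_bar - f(z_bar)
    # Row i of cut (4):                   jac_g[i] . z <= jac_g[i] . z_bar - g_i(z_bar)
    # The left-hand-side coefficients are the gradients themselves; this helper
    # only computes the right-hand sides of the linear constraints to be added.
    z_bar = np.asarray(z_bar, dtype=float)
    grad_f = np.asarray(grad_f, dtype=float)
    jac_g = np.asarray(jac_g, dtype=float)      # shape (m, nx + ny)
    g_val = np.asarray(g_val, dtype=float)
    rhs_objective_cut = grad_f @ z_bar - f_val
    rhs_constraint_cuts = jac_g @ z_bar - g_val
    return rhs_objective_cut, rhs_constraint_cuts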

Let (α̂, x̂, ŷ) be an optimal solution of (M L ). We emphasize that the value α̂
is a lower bound for (P̄ ) and (P ). If (α̂, x̂, ŷ) is feasible for (P̄ ), then the value
α̂ is also an upper bound for (P̄ ) and (P ). In this case, as (α̂, x̂, ŷ) gives the
same value as a lower and an upper bound for (P̄ ), this solution is optimal for
(P̄ ). Therefore, (x̂, ŷ) is also an optimal solution for (P ). On the other side, if
(α̂, x̂, ŷ) is not feasible for (P̄ ), it is necessary to add valid inequalities to (M L ),
to cut this solution out of its feasible set, strengthening the relaxation given by
this problem. To reach this goal, the ECP algorithm uses the strategy of adding
the solution (x̂, ŷ) to the set L.
The ECP algorithm is presented as Algorithm 1. We point out that ECP does
not require the solution of any NLP problem and does not use any information
from second order derivatives or approximations of them. This characteristic
can be advantageous in some cases, especially when the computation of second
order derivatives is hard or cannot be accomplished for some reason. We also
emphasize that the strategy of adding the solution (x̂k , ŷ k ) to the set L at the
end of each iteration (line 13) does not ensure that ECP has finite convergence.

Input: (P ): MINLP problem, c : convergence tolerance.


Output: (x∗ , y ∗ ): optimal solution for (P ).
1 z l = −∞;
2 z u = +∞;
3 Choose an initial linearization point (x0 , y 0 );
4 L = {(x0 , y 0 )};
5 k = 1;
6 Let (M L ) be the master problem constructed from (P ) over the points in L;
7 while z u − z l > c do
8 Let (α̂k , x̂k , ŷ k ) be an optimal solution of (M L );
9 z l = α̂k ;
10 if (x̂k , ŷ k ) is feasible for (P ) and f (x̂k , ŷ k ) < z u then
11 z u = f (x̂k , ŷ k );
12 (x∗ , y ∗ ) = (x̂k , ŷ k );
13 L = L ∪ (x̂k , ŷ k );
14 k = k + 1;

Algorithm 1. Extended Cutting Plane (ECP) algorithm.

We have observed that the new cuts generated are usually weak, which makes
the algorithm to require a large number of iterations to converge to an optimal
solution.

3 The Outer Approximation Algorithm


The Outer Approximation (OA) algorithm was proposed in [6,7]. Similarly to
ECP, the main foundation of OA is to adopt the master problem (M L ), and,
at each iteration, to add a new linearization point to L until the lower bound
given by (M L ) becomes sufficiently close to the best known upper bound for
(P ). Let (α̂k , x̂k , ŷ k ) be an optimal solution for (M L ) at iteration k. An attempt
to obtain an upper bound for (P ) is to solve problem (Pŷk ), which is the NLP
problem obtained from (P ) by fixing y at the value ŷ k :
(Pŷk ) min f (x, ŷ k )
x
s. t. g(x, ŷ k ) ≤ 0 (6)
x ∈ X.
If problem (Pŷk ) is feasible, let (x̃k , ŷ k ) be an optimal solution for the prob-
lem. In this case, f (x̃k , ŷ k ) is an upper bound for (P ) and (P̄ ), and the point
(x̃k , ŷ k ) is added to the set L. In case (Pŷk ) is infeasible, then the following
feasibility problem is solved:
(P^F_{ŷ^k})  min_{u,x}  Σ_{i=1}^{m} u_i
             s. t.  g(x, ŷ^k) ≤ u                          (7)
                    u ≥ 0,
                    x ∈ X, u ∈ R^m.

Let (ǔk , x̌k ) be an optimal solution for (PŷFk ). The point (x̌k , ŷ k ) is then added
to the set L. After the update of L, with the addition of (x̃k , ŷ k ), if problem (Pŷk )
is feasible, or with the addition of (x̌k , ŷ k ) otherwise, the algorithm starts a new
iteration, using as a stopping criterion a maximum tolerance for the difference
between the best lower and upper bounds obtained. As shown in [7], assuming
that the KKT conditions are satisfied at the solutions of (Pŷk ) and (PŷFk ), the
strategy used to update L ensures that a given solution ŷ k for the integer variable
y is not visited more than once by the algorithm, except in case it is part of the
optimal solution of (P ) (in this case the solution may be visited at most twice).
As the number of integer solutions is finite by hypothesis, since Y is bounded, the
algorithm is guaranteed to find an optimal solution of (P ) in a finite number of
iterations. Thus, in comparison with the ECP algorithm, OA tends to spend a
smaller number of iterations, with the overhead of needing to solve one or two
NLP problems at each iteration. Algorithm 2 presents the OA algorithm.

Input: (P ): MINLP problem, c : convergence tolerance.


Output: (x∗ , y ∗ ): optimal solution for (P ).
1 z l = −∞;
2 z u = +∞;
3 Choose an initial linearization point (x0 , y 0 ), (usually the optimal solution of
the continuous relaxation of (P ));
4 L = {(x0 , y 0 )};
5 k = 1;
 
6 Let M L be the master problem constructed from (P ) over the points in L;
7 Let (Pŷk ) be the NLP problem obtained by fixing the variable y of (P ) at ŷ k ;
8 Let (PŷFk ) be the feasibility NLP problem obtained from (Pŷk );
9 while z u − z l > c do  
10 Let (α̂k , x̂k , ŷ k ) be an optimal solution of M L ;
11 z l = α̂k ;
12 if (Pŷk ) is feasible then
13 Let xk be an optimal solution of (Pŷk );
14 if f (xk , ŷ k ) < z u then
15 z u = f (xk , ŷ k );
16 (x∗ , y ∗ ) = (xk , ŷ k );
17 else
18 Let (uk , xk ) be an optimal solution of (PŷFk );
19 L = L ∪ {(xk , ŷ k )};
20 k = k + 1;

Algorithm 2. Outer Approximation (OA) algorithm.



4 Our Modified Extended Cutting Plane Algorithm


In this section, we present our approach based on the ECP algorithm, which
we call Modified Extended Cutting Plane (MECP). Our main motivation is to
improve the performance of ECP, turning it more similar to OA, while keeping
the nice characteristic of ECP of being a first order method. With this pur-
pose, instead of considering problems (Pŷk ) and (PŷFk ), MECP considers a linear
approximation for problem (Pŷk ), built over the same set of linearization points
L as problem (M L ). We denote then this new problem by (MŷLk ), as defined in
the following:
 
(M^L_{ŷ^k})  min_{α,x}  α
             s. t.  ∇f(x^j, y^j)^T (x − x^j; ŷ^k − y^j) + f(x^j, y^j) ≤ α,   ∀(x^j, y^j) ∈ L
                    ∇g(x^j, y^j)^T (x − x^j; ŷ^k − y^j) + g(x^j, y^j) ≤ 0,   ∀(x^j, y^j) ∈ L       (8)
                    x ∈ X.

Note that (MŷLk ) can be obtained from (M L ) simply by fixing the variable y at
the value ŷ k . Thus, when considering problem (MŷLk ), we expect to obtain good
feasible solutions sooner when comparing with the traditional ECP algorithm.
These solutions are used for a possible update of the known upper bound z u and
to strengthen the relaxation given by the master problem through their inclusion
in the set L.
The MECP algorithm is presented as Algorithm 3. Comparing to the ECP
algorithm, the novelty is the introduction of lines 7, 12–17. We point out that, at
each iteration, between the solution of (M L ) (line 9) and the solution of (MŷLk )
(lines 12–13), the solution (x̂k , ŷ k ) is added to set L (line 11), to strengthen the
linear relaxation built with the points of this set. For this reason, it is possible
that the optimal solution xk of (MŷLk ) is different from x̂k . With this strategy, we
expect that the MECP algorithm will find feasible solutions sooner, and, there-
fore, can close the integrality gap with less computational effort, when compared
to ECP. We still note that the solution obtained when solving (MŷLk ) is also added
to L (line 17), in case the problem is feasible. As (MŷLk ) is a linear programming
problem, its resolution does not add significant computational burden, when
compared to the resolution of (M L ).
It is important to point out that the convergence of MECP to an optimal
solution of (P ) is easily verified from the convergence of ECP, as MECP con-
siders, during its iterations, all linearization points considered by ECP (with
some additional points). We also emphasize that, in the context of this discus-
sion, the approximation of the feasibility problem (PŷFk ) by a linear programming
problem would make no sense, because in case (MŷLk ) is infeasible, the value ŷ k
cannot represent a solution to y in any problem (M L ) from the current iteration,
and, therefore, the resolution of a feasibility problem is not necessary to cut out
solutions with value ŷ k from (M L ).
Input: (P): MINLP problem, εc: convergence tolerance.
Output: (x∗, y∗): optimal solution for (P).
1 z l = −∞;
2 z u = +∞;
3 Choose an initial linearization point (x0 , y 0 );
4 L = {(x0 , y 0 )};
5 k = 1;
6 Let (M L ) be the master problem constructed from (P ) over the points in L;
7 Let (MŷLk ) be the problem obtained by fixing the variable y of (M L ) at ŷ k ;
8 while z u − z l > c do
9 Let (α̂k , x̂k , ŷ k ) be an optimal solution of (M L );
10 z l = α̂k ;
11 L = L ∪ (x̂k , ŷ k );
12 if (MŷLk ) is feasible then
13 Let (αk , xk ) be an optimal solution of (MŷLk );
14 if (xk , ŷ k ) is feasible for (P ) AND f (xk , ŷ k ) < z u then
15 (x∗ , y ∗ ) = (xk , ŷ k );
16 z u = f (xk , ŷ k );
17 L = L ∪ (xk , ŷ k ) ;
18 k = k + 1;

Algorithm 3. Modified Extended Cutting Plane (MECP) algorithm.

Finally, we note that (M^L_{ŷ^k}) is an attempt to approximate problem (P_{ŷ^k})
using linear programming. Thus, MECP is still a first-order method, which can
be implemented with no routines to solve NLP problems. This fact can represent
a practical advantage, because, besides the reasons already mentioned about the
difficulty in the computation of second order derivatives in some applications, in
many cases, even the best computational NLP routines may fail to converge to
the optimal solution of the addressed problem, even when the problem is convex
and continuously differentiable. In some cases, these routines fail even to find
a feasible solution of problems with nonempty feasible set, incorrectly declaring
infeasibility.
Thus, because it does not depend on NLP routines, the MECP algorithm
appears as a robust approach for solving several problems of convex MINLP.

5 Computational Results

We now present computational results obtained with the application of ECP,


OA, and our approach MECP, on a set of 343 convex MINLP test instances from
the libraries [3,9,15]. Table 1 presents statistics about the test problems. Note
the generally high percentage of linear constraints in the instances.
Table 1. Statistics on test problems.

                           Min    Max      Mean     Median
Variables                  2      107222   975.34   114
Integer variables (%)      0.00   1.00     0.52     0.40
Constraints                0      108217   1184.65  211
Linear constraints (%)     0.00   1.00     0.89     0.95
The algorithms were implemented in C++11 and compiled with ICPC 16.0.0.
To solve the MILP problems, we use the solver Cplex 12.6.0 [4], and to solve the
NLP problems, we use the solver Mosek 7.1.0 [1]. The tests were run on a computer
with an Intel Core i7-4790 processor (3.6 GHz) under the operating system
openSUSE Linux 13.1. All algorithms were configured to run on a single processing
thread, so each used only one processor core at a time on the test machine. The
CPU time of each algorithm on each test instance was limited to 4 hours. Values of
10^−6 and 10^−3 were adopted as the absolute and relative convergence tolerances,
respectively, for all algorithms.
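As a small illustration (the exact formula is an assumption, since the paper does not spell it out), a combined absolute/relative stopping test on the bounds z^u and z^l could look as follows.

```python
# Assumed form of the stopping test combining the absolute (1e-6) and
# relative (1e-3) convergence tolerances mentioned above.
def converged(z_lower, z_upper, eps_abs=1e-6, eps_rel=1e-3):
    gap = z_upper - z_lower
    return gap <= eps_abs or gap <= eps_rel * abs(z_upper)

print(converged(99.95, 100.0))   # relative gap of 0.05% -> True
```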

Fig. 1. Relative comparison of CPU time for the algorithms.

Figure 1 presents a relative comparison between the algorithms with respect to
the CPU time spent on the set of test instances considered. Note that the data
are normalized with respect to the best result obtained over all approaches for
each instance. On the horizontal axis, the abscissa indicates how many times
greater the computational time of an algorithm was than the best time among
all algorithms. On the vertical axis, the ordinate indicates the percentage of
instances covered by each approach.
More specifically, if the curve of a given algorithm passes through the point
(α, τ ), this indicates that for τ % of the instances, the result obtained by the
algorithm in question is less than or equal to α times the best computational time
among all algorithms.
Note, for example, that the OA curve passes through the point (1, 57%).
This means that, for 57% of the test instances considered, OA achieves the best
result with respect to computational time (a ratio of 1 relative to the best
result). Next, the curve passes through the point (1.2, 63%), indicating that OA
was able to solve 63% of the instances spending up to 20% more time than the
best algorithm on each instance (a ratio of 1.2 relative to the best result). Thus, roughly
speaking, we can say that, the more the curve of an algorithm is above the
curves of the other algorithms in the graph, the better the algorithm did when
compared to the others, with respect to the characteristic analyzed in the graph.
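The following short sketch (with made-up timing data) shows how such a curve is computed: each solver's time is divided by the best time on that instance, and the curve value at a ratio α is the fraction of instances whose ratio is at most α.

```python
# Sketch with hypothetical data: computing the performance-profile curves
# described above.  times[i][s] is the CPU time of solver s on instance i; the
# curve value at ratio alpha is the fraction of instances solved within
# alpha times the best time.
import numpy as np

times = np.array([[10.0,  12.0,  15.0],     # hypothetical instance 1
                  [ 5.0,   4.0,  20.0],     # hypothetical instance 2
                  [ 1.0,   1.5,   1.2],
                  [30.0, 120.0,  40.0]])
best = times.min(axis=1)                    # best time per instance
ratios = times / best[:, None]              # "number of times the best result"

def profile(ratios, alpha):
    """Fraction of instances for which each solver is within alpha * best time."""
    return (ratios <= alpha).mean(axis=0)

for alpha in (1.0, 1.2, 2.0):
    print(alpha, profile(ratios, alpha))    # e.g. a point such as (1, 57%) in Fig. 1
```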
Analyzing Fig. 1, we can observe that the performance of OA dominates
the performance of ECP. It is also possible to note that our MECP algorithm
presents substantially better results than ECP, completely dominating its per-
formance and even becoming competitive in relation to the OA algorithm. It is
worth noting that the MECP curve dominates the OA curve for results greater
than or equal to 2.2 times the best result. All algorithms were able to solve about
90% of the test instances in the maximum running time stipulated.
Finally, we note that the implementations of all the algorithms considered in
this study, ECP, MECP and OA, together with heuristics [11] are available in
our MINLP solver Muriqui [10,12].

References
1. The MOSEK optimization software. Software. http://www.mosek.com/
2. Bonami, P., Kilinç, M., Linderoth, J.: Algorithms and software for convex mixed
integer nonlinear programs. Technical Report 1664, Computer Sciences Depart-
ment, University of Wisconsin-Madison (2009)
3. CMU-IBM: Open source MINLP project (2012). http://egon.cheme.cmu.edu/ibm/
page.htm
4. IBM Corporation: IBM ILOG CPLEX V12.6 User's Manual for CPLEX (2015).
https://www.ibm.com/support/knowledgecenter/en/SSSA5P_12.6.0
5. D’Ambrosio, C., Lodi, A.: Mixed integer nonlinear programming tools: a practical
overview. 4OR 9(4), 329–349 (2011). https://doi.org/10.1007/s10288-011-0181-9
6. Duran, M., Grossmann, I.: An outer-approximation algorithm for a class of mixed-
integer nonlinear programs. Math. Program. 36, 307–339 (1986). https://doi.org/
10.1007/BF02592064
7. Fletcher, R., Leyffer, S.: Solving mixed integer nonlinear programs by outer
approximation. Math. Program. 66, 327–349 (1994). https://doi.org/10.1007/
BF01581153
8. Hemmecke, R., Köppe, M., Lee, J., Weismantel, R.: Nonlinear integer program-
ming. In: Jünger, M., Liebling, T.M., Naddef, D., Nemhauser, G.L., Pulleyblank,
W.R., Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.) 50 Years of Integer Programming
1958–2008, pp. 561–618. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-540-68279-0_15
9. Leyffer, S.: MacMINLP: Test problems for mixed integer nonlinear programming
(2003). https://wiki.mcs.anl.gov/leyffer/index.php/MacMINLP
10. Melo, W., Fampa, M., Raupp, F.: Integrating nonlinear branch-and-bound and
outer approximation for convex mixed integer nonlinear programming. J. Glob.
Optim. 60(2), 373–389 (2014). https://doi.org/10.1007/s10898-014-0217-8
11. Melo, W., Fampa, M., Raupp, F.: Integrality gap minimization heuristics for binary
mixed integer nonlinear programming. J. Glob. Optim. 71(3), 593–612 (2018).
https://doi.org/10.1007/s10898-018-0623-4
12. Melo, W., Fampa, M., Raupp, F.: An overview of MINLP algorithms and their
implementation in muriqui optimizer. Ann. Oper. Res. (2018). https://doi.org/10.
1007/s10479-018-2872-5
13. Trespalacios, F., Grossmann, I.E.: Review of mixed-integer nonlinear and general-
ized disjunctive programming methods. Chem. Ing. Tech. 86(7), 991–1012 (2014).
https://doi.org/10.1002/cite.201400037
14. Westerlund, T., Pettersson, F.: An extended cutting plane method for solving con-
vex MINLP problems. Comput. Chem. Eng. 19(Supplement 1), 131–136 (1995).
https://doi.org/10.1016/0098-1354(95)87027-X. European Symposium on Com-
puter Aided Process Engineering
15. GAMS World: MINLP Library 2 (2014). http://www.gamsworld.org/minlp/minlplib2/html/
On Proximity for k-Regular
Mixed-Integer Linear Optimization

Luze Xu and Jon Lee(B)

University of Michigan, Ann Arbor, MI, USA


{xuluze,jonxlee}@umich.edu

Abstract. Putting a finer structure on a constraint matrix than is


afforded by subdeterminant bounds, we give sharpened proximity results
for the setting of k-regular mixed-integer linear optimization.

Keywords: Mixed-integer linear optimization · Proximity · k-regular

1 Introduction
We study the standard form MILO (mixed-integer linear optimization) problem

min{c x : Ax = b; x ≥ 0; xi ∈ Z for all i ∈ I}, (I-MIP)

with full row-rank A ∈ Zm×n , b ∈ Qm , and I ⊆ [n] := {1, 2, . . . , n}. The main
issue that we are interested in is: for distinct I, J ⊆ [n], and optimal solution
x∗(I) to I-MIP, find a good upper bound on ‖x∗(I) − x∗(J)‖∞ for some optimal
x∗ (J ) to J -MIP. Mostly we are considering the ∞-norm, though it is nice to
have results using the 1-norm. A key special-case of interest is I = ∅ and J = [n],
where we are asking for a bound on how far components of an optimal solution
of a pure MILO problem may be from components of a solution of its continuous
relaxation—a quantity that is very relevant to the issue of rounding and local
search starting from a relaxation solution. In some situations we add further
natural conditions (e.g., b ∈ Zm , x∗ (∅) is a basic solution, etc.).
Even in dimension n = 2, it is easy to construct examples where the solution
of a pure MILO problem is far from the solution of its continuous relaxation.
Choose p1 < p2 to be a pair of large, relatively-prime positive integers. Consider
the integer standard-form problem

min{x1 : p2 x1 − p1 x2 = p1 − p2 ; x1 , x2 ≥ 0; xi ∈ Z for all i ∈ I}, (I-P)

By the equation in I-P, every feasible solution (x̂1 , x̂2 ) satisfies (x̂1 +1)/(x̂2 +1) =
p1 /p2 . Because p1 and p2 are relatively prime, there cannot be a feasible solution
to {1, 2}-P with x̂1 smaller than p1 − 1. So the optimal solution to {1, 2}-P is
(z1∗ , z2∗ ) := (p1 − 1, p2 − 1) . But it is very easy to see that the (unique and basic)
optimal solution to its continuous relaxation ∅-P is (x∗1 , x∗2 ) := (0, −1 + p2 /p1 ),
quite far from the optimal solution to {1, 2}-P. With such a small example, it is
not obvious exactly what drives this behavior, and at a high level our goal is to
control and investigate this.
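A quick numerical check of this example, for the illustrative choice p1 = 7 and p2 = 11 (any relatively prime pair works), confirms the gap between the integer optimum and the relaxation optimum; the specific values and the brute-force enumeration are assumptions made only for this illustration.

```python
# Numerical check of the example above with the assumed values p1 = 7, p2 = 11.
p1, p2 = 7, 11                              # relatively prime, p1 < p2

# pure integer problem {1,2}-P: enumerate feasible (x1, x2) with x1, x2 >= 0
feasible = [(x1, x2) for x1 in range(200) for x2 in range(200)
            if p2 * x1 - p1 * x2 == p1 - p2]
print(min(feasible))                        # -> (6, 10) = (p1 - 1, p2 - 1)

# continuous relaxation (∅-P): the basic optimal solution has x1 = 0
x1_star, x2_star = 0.0, -1.0 + p2 / p1
print((x1_star, x2_star))                   # far from (p1 - 1, p2 - 1) as p1, p2 grow
```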

1.1 Literature Review and Outline


Many of the results in this area focus on the general-form MILO problem

max{c x : Ax ≤ b, xi ∈ Z for all i ∈ I},


where A ∈ Z^{m×n}, b ∈ Q^m, and ∅ ≠ I ⊆ [n] := {1, 2, . . . , n} (see [2,4,5,9,13]).
[2] gives a bound nΔ(A) on the ∞-norm distance between optimal solutions
to the general-form pure MILO problem and its continuous relaxation, where
Δ(A) is the maximum of the absolute values of the determinants of the square
submatrices of A. Note that if b is not restricted to be integer, then this bound
nΔ(A) is best possible (see [11]). But if we assume that b ∈ Zm , then it is not
known whether this bound is optimal or not. [4] generalizes the objective function
from linear functions to convex separable quadratic functions; [5,13] generalize
further to convex separable functions. Recently, [9] considers proximity between
optimal solutions of general MILO problems that differ only in the sets of indices
of integer variables, and obtains a bound of |I1 ∪ I2 | Δ(A), where I1 and I2 are
the sets of the indices of the integer variables.
Of course we can move from the standard-form I-MIP to the general form
by recasting Ax = b, x ≥ 0 as
[A; −A; −I] x ≤ [b; −b; 0].

Thus we could directly apply the theorems in the general form to the standard
form using the simple fact that Δ([A; −A; −I]) = Δ(A). However, the special
structure of the resulting general-form matrix could imply a better bound in some
cases. For example, making a clever argument employing the “Steinitz Lemma”,
[3] establishes ‖x̄ − z∗‖_1 ≤ m(2mU + 1)^m, where x̄ is a basic optimal solution of
∅-MIP, z∗ is some optimal solution of [n]-MIP, and U = max_{ij}{|a_{ij}|}. [1]
gives an optimal bound Δ(A) − 1 on the ∞-norm distance between basic optimal
solutions and feasible integer solutions for standard-form knapsack polyhedra,
i.e., the case when m = 1. [6,7] establishes a bound of k dim(n.s.(A)) on the
∞-norm distance between optimal solutions for “k-regular” ∅-MIP and [n]-MIP,
where k-regular means that the elementary vectors (i.e., the nonzero vectors with
minimal support) in the null space of the constraint matrix A can be scaled to
have all entries in {0, ±1, ±2, . . . , ±k} (see [6–8]). Note that k = 1 or regular
is equivalent to A be equivalent to a totally-unimodular matrix. A nice family
of examples with k = 2 is when A is the vertex-edge incidence matrix (or its
transpose) of a mixed-graph.
In what follows, we focus on k-regular I-MIP. In Sect. 1.2, we review some


needed fundamentals. In Sect. 2.1, we establish a proximity result for k-regular
I-MIP. In Sect. 2.2, we improve the bound for a 2-regular pure MILO problem
relative to a basic optimal solution of its continuous relaxation. In Sect. 3, we
consider a special 2-regular case, where A is the vertex-edge incidence matrix of
a mixed graph G, and we give a sufficient condition on G such that the ∞-norm
distance between any basic optimal solution and some feasible integer solution
is at most 1.

1.2 Fundamentals
Let F be an arbitrary field. For any x ∈ F^n, the support of x, denoted supp(x), is
the set of coordinates with nonzero entries, i.e., supp(x) := {i ∈ [n] : x_i ≠ 0}. Let V
be a vector subspace of F^n. A vector x ∈ V is an elementary vector of V if x ≠ 0
and x has minimal support in V \ {0}; i.e., x ∈ V \ {0} and there is no y ∈ V \ {0}
with supp(y) ⊊ supp(x). The set of elementary vectors of V is denoted by F(V).
Assume now that F is ordered. A vector y ∈ F^n conforms to x ∈ F^n if
x_i y_i > 0 for all i ∈ supp(y). The following result of Rockafellar is fundamental: every
nonzero x ∈ V can be expressed as a conformal sum of at most min{dim(V), |supp(x)|}
elementary vectors from F(V).

Theorem 1 ([10], Theorem 1). Let V be a subspace of F^n, where F is an
ordered field. For every x ∈ V \ {0}, there exist elementary vectors v^1, . . . , v^t ∈
V such that x = Σ_{i=1}^t v^i, where each v^i conforms to x, none has its support
contained in the union of the supports of the others, and t ≤ min{dim(V), |supp(x)|}.

Definition 2. Let V be a subspace of R^n. The subspace V is k-regular if
for every x ∈ F(V) there exists λ ∈ R \ {0} such that λx_i ∈ {±1, ±2, . . . , ±k} for all i ∈ supp(x).
We also refer to a standard-form problem as k-regular when the null space of
its constraint matrix is k-regular. We have the following simple property for
2-regular standard-form problems.
Proposition 3. Let P := {x : Ax = b, x ≥ 0}, where A ∈ Zm×n and
rank(A) = m, and suppose that P has an integer feasible solution (this implies
that b ∈ Zm ). If V := n.s.(A) is 2-regular, then every basic solution (feasible or
not) x̄ of P satisfies 2x̄ ∈ Zn .
Proof. Rearranging columns, we may assume that the first m columns of A form
a basis matrix A_β corresponding to x̄; that is, A = [A_β, A_η], x̄_β = A_β^{-1} b, and
x̄_η = 0. Multiplying both sides of Ax = b by A_β^{-1}, we get [I, M]x = A_β^{-1} b. Also
V = r.s.([−M^T, I]) (r.s.(B) denotes the row space of B), and each row of [−M^T, I]
is in F(V). Because each row has an entry of 1, and each row can be scaled by
a nonzero to have all entries in {0, ±1, ±2}, it follows that all entries of M are
in {0, ±1/2, ±1, ±2}. So Ax = b is equivalent to [2I, 2M]x = 2A_β^{-1} b,
where now [2I, 2M] is an all-integer matrix. Plugging a feasible integer solution
x^0 into [2I, 2M]x = 2A_β^{-1} b, we conclude that 2x̄_β = 2A_β^{-1} b ∈ Z^m, and so
2x̄ ∈ Z^n. □


2 Proximity for k-Regular MILO


2.1 k-Regular Mixed-Integer Linear Optimization

[9] considers the question of bounding the ∞-norm distance between optimal
solutions of mixed-integer linear problems that only differ in the sets of indices
of integer variables. And by using the properties of so-called bimodular systems
in [12], they manage to give a tighter bound Δ(A) − 1 for the special case when
Δ(A) ≤ 2 and I1 , I2 ∈ {∅, [n]}, which is just in terms of Δ(A) and not relative to
|I1 ∪ I2 | = n. However, Δ(A) ≤ 2 is a very strong assumption. Of course totally-
unimodular A have this property, so we have vertex-edge incidence matrices of
digraphs, for example. But we do not know further broad families of examples
for Δ(A) ≤ 2. It is natural to think about vertex-edge incidence matrices of
mixed graphs. But in general these have subdeterminants that are ±2^k, k ∈ Z_+.
For example, if G is a collection of k disjoint undirected triangles, then the
square vertex-edge incidence matrix has determinant 2^k. But interestingly, the
null space of the vertex-edge incidence matrix of every mixed graph is 2-regular
(see [7], for example), so there is an opportunity to get a better proximity bound
than afforded by only considering Δ(A). Generally, for integer matrices A, we
have k ≤ Δ(A) (see [7]), and so the idea of k-regularity gives a more refined
view that can be exploited.
We consider the standard-form I-MIP, where we assume that V := n.s.(A)
is k-regular and dim(V ) = r. Note that [n]-MIP is the k-regular pure-integer
problem, while ∅-MIP is its continuous relaxation.
Theorem 4. ([7]). If [n]-MIP is feasible, then for each optimal solution x∗ to
the corresponding continuous relaxation ∅-MIP, there exists an optimal solution
z∗ to [n]-MIP with ‖z∗ − x∗‖∞ ≤ kr.
We are going to generalize Theorem 4, using the technique developed in [9].
Toward this, we restate the main lemma used in [9].
Lemma 5 ([9], Lemma 1). Let d, t ∈ Z_{≥1}, g^1, . . . , g^t ∈ Z^d, and α_1, . . . , α_t ≥ 0.
If Σ_{i=1}^t α_i ≥ d, then there exist β_i ∈ [0, α_i] for i ∈ [t] such that not all
β_1, . . . , β_t are zero and Σ_{i=1}^t β_i g^i ∈ Z^d.
We use a mild generalization of [3, Lemma 5] (from the case I = [n], J = ∅):

Lemma 6. Assume that I ∪ J = [d], where d ∈ [n]. Let x∗(I) and x∗(J)
be an optimal solution of I-MIP and J -MIP, respectively. If there is a vector
w ∈ Zd × Rn−d satisfying Aw = 0, w conforms to x∗ (I) − x∗ (J ), and |wi | ≤
|x∗ (I)i − x∗ (J )i | for i ∈ [n], then x∗ (I) − w and x∗ (J ) + w are also an optimal
solution to I-MIP and J -MIP, respectively.

Proof. First we claim that x∗ (J ) + w is also feasible to J -MIP. To see this,


because the coordinates of x∗ (J ) and w indexed by J are integer, we observe
that the coordinates of x∗ (J ) + w indexed by J are also integer. Furthermore,
because Aw = 0, we see that A(x∗ (J ) + w) = b. Also because w conforms to

x∗ (I) − x∗ (J ) and |wi | ≤ |x∗ (I)i − x∗ (J )i |, we have x∗ (J ) + w ≥ 0. Similarly,


x∗ (I) − w is also feasible to I-MIP. Because of the optimality of x∗ (I), we
have c^T x∗(I) ≤ c^T (x∗(I) − w), i.e., c^T w ≤ 0, thus c^T (x∗(J) + w) ≤ c^T x∗(J).
Therefore x∗(J) + w is also an optimal solution of J-MIP and c^T w = 0. This
also implies that x∗ (I) − w is also an optimal solution of I-MIP. 
Next, we generalize Theorem 4, which is the special case when I = ∅, J = [n].
Also this theorem gives a better bound than [9] for the k-regular case (because
k ≤ Δ(A)). The proof is mainly based on Theorem 1, Lemmas 5 and 6.
Theorem 7. Suppose that V := n.s.(A) is k-regular and dim(V ) = r. Let
I, J ⊆ [n] with I = J such that J -MIP has an optimal solution. For every
optimal x∗ (I) of I-MIP, there exists an optimal x∗ (J ) of J -MIP such that

‖x∗(I) − x∗(J)‖∞ ≤ k min{r, |I ∪ J|}.

Proof. Without loss of generality, assume that I ∪ J = [d], with d ∈ [n]. Let
x∗(I) ∈ R^n be optimal for I-MIP, and let z̃(J) ∈ R^n be any optimal solution of
J-MIP. By Theorem 1, y := x∗(I) − z̃(J) ∈ V can be expressed as a conformal
sum of at most r vectors in F(V), i.e., y = Σ_{i=1}^t v^i, t ≤ r, where each v^i ∈ F(V)
conforms to y. For each summand v^i, because V is k-regular, there exists a
positive scalar λ_i such that (1/λ_i) v^i is {0, ±1, . . . , ±k}-valued. So we have

    y = Σ_{i=1}^t λ_i (1/λ_i) v^i =: Σ_{i=1}^t λ_i g^i,

where g^i := (1/λ_i) v^i is an integer vector with ‖g^i‖∞ ≤ k and A g^i = 0; g^i
also conforms to y. Next, consider the set

    S := {(γ̄_1, . . . , γ̄_t) : γ̄_i ∈ [0, λ_i] for all i ∈ [t], Σ_{i=1}^t γ̄_i g^i ∈ Z^d × R^{n−d}},

which is non-empty (it contains 0) and compact. Hence, there exists some
(γ_1, . . . , γ_t) ∈ S maximizing Σ_{i=1}^t γ̄_i over S. Let w := Σ_{i=1}^t γ_i g^i. Because
g^i conforms to y = x∗(I) − z̃(J), we know that w also conforms to y and
Σ_{i=1}^t γ_i |g^i_j| = |w_j|, Σ_{i=1}^t λ_i |g^i_j| = |y_j| for j ∈ [n]. Thus |w_j| ≤ |y_j| for j ∈ [n]
because γ_i ∈ [0, λ_i]. Along with Aw = Σ_{i=1}^t γ_i A g^i = 0, by Lemma 6, we know
that x∗(J) := z̃(J) + w is also an optimal solution to J-MIP. The distance
from x∗(J) to x∗(I) can be bounded as follows:

    ‖x∗(I) − x∗(J)‖∞ = ‖Σ_{i=1}^t (λ_i − γ_i) g^i‖∞ ≤ Σ_{i=1}^t (λ_i − γ_i) ‖g^i‖∞ ≤ Σ_{i=1}^t (λ_i − γ_i) k.

It remains to argue that Σ_{i=1}^t (λ_i − γ_i) ≤ min{d, r}. Since (⌊λ_1⌋, . . . , ⌊λ_t⌋) ∈ S,
we have Σ_{i=1}^t γ_i ≥ Σ_{i=1}^t ⌊λ_i⌋, thus

    Σ_{i=1}^t (λ_i − γ_i) ≤ Σ_{i=1}^t (λ_i − ⌊λ_i⌋) ≤ t ≤ r.

Now we only need to argue that Σ_{i=1}^t (λ_i − γ_i) ≤ d. Let α_i := λ_i − γ_i ≥ 0
for i ∈ [t]. Suppose, for the sake of contradiction, that this inequality does not
hold, i.e., Σ_{i=1}^t α_i > d. Letting h^i ∈ Z^d be the projection of g^i onto the
first d coordinates, we can apply Lemma 5 to the α_i, h^i and obtain β_1, . . . , β_t with
β_i ∈ [0, α_i] such that not all β_i's are zero and Σ_{i=1}^t β_i h^i ∈ Z^d. Hence
Σ_{i=1}^t β_i g^i ∈ Z^d × R^{n−d}. Now consider γ'_i := γ_i + β_i ≥ 0. Note that γ'_i ≤ γ_i + α_i = λ_i, and

    Σ_{i=1}^t γ'_i g^i = Σ_{i=1}^t γ_i g^i + Σ_{i=1}^t β_i g^i ∈ Z^d × R^{n−d}.

So (γ'_1, . . . , γ'_t) ∈ S. However, because not all β_i's are zero, we have Σ_{i=1}^t γ'_i >
Σ_{i=1}^t γ_i, which contradicts the maximality of (γ_1, . . . , γ_t). □
2.2 2-Regular Pure-Integer Linear Optimization


To improve the bound of Theorem 7, we focus on the important special case of
k = 2 and b ∈ Zm , for two important subcases: (i) I = ∅, J = [n] when the
optimal solution of ∅-MIP is basic, and (ii) I = [n], J = ∅. In this case, the
bound of Theorem 7 is 2r, which we now improve to 3r/2.

Theorem 8. Suppose that A has full row rank, V := n.s.(A) is 2-regular, b ∈
Z^m, [n]-MIP is feasible, and ∅-MIP has an optimal solution.
(1) For each basic optimal solution x̄∗ to ∅-MIP, there exists an optimal solution
z∗ to [n]-MIP with ‖x̄∗ − z∗‖∞ ≤ (3/2)r;
(2) For each optimal solution z∗ to [n]-MIP, there exists an optimal solution x∗
to ∅-MIP with ‖z∗ − x∗‖∞ ≤ (3/2)r.

Proof. (1): The proof is similar to that of Theorem 7 with some extra care using
Theorem 1 and Proposition 3. Let x̄∗ be a basic optimal solution of ∅-MIP. If
x̄∗ ∈ Z^n, then z∗ := x̄∗ satisfies the conclusion, so we just consider x̄∗ ∉ Z^n.
Because V is 2-regular, by Proposition 3, we have 2x̄∗ ∈ Z^n. Let z̃ be optimal for
[n]-MIP. By Theorem 1, y := x̄∗ − z̃ ∈ V can be expressed as a conformal sum
of at most r vectors in F(V), i.e., y = Σ_{j=1}^t v^j, t ≤ r, and none of the v^j has its
support contained in the union of the supports of the others, which means that
for each v^j there exists an index i_j ∈ {1, . . . , n} such that v^j_{i_j} ≠ 0 and v^l_{i_j} = 0
for l ≠ j.
For each summand v^j, because V is 2-regular, there exists a positive scalar
λ_j such that (1/λ_j) v^j is {0, ±1, ±2}-valued. So we can write

    y = Σ_{j=1}^t λ_j (1/λ_j) v^j =: Σ_{j=1}^t λ_j g^j,

where g^j := (1/λ_j) v^j is an integer vector with g^j_i ∈ {0, ±1, ±2}.
Let w := Σ_{j=1}^t ⌊λ_j⌋ g^j ∈ Z^n. By Lemma 6, we know that z∗ := z̃ + w is also
an optimal solution to [n]-MIP. For this optimal solution, we have

    x̄∗ − z∗ = Σ_{j=1}^t (λ_j − ⌊λ_j⌋) g^j.

Without loss of generality, we can assume that μ_j := λ_j − ⌊λ_j⌋ ∈ (0, 1).
Because 2x̄∗ ∈ Z^n, we have Σ_{j=1}^t 2μ_j g^j ∈ Z^n. For s ∈ {1, . . . , t}, Σ_{j=1}^t 2μ_j g^j_{i_s} =
2μ_s g^s_{i_s} ∈ Z \ {0}; when |g^s_{i_s}| = 1, μ_s = 1/2, and when |g^s_{i_s}| = 2, μ_s ∈ {1/4, 1/2, 3/4}.
Therefore,

    ‖x̄∗ − z∗‖∞ = ‖Σ_{j=1}^t (λ_j − ⌊λ_j⌋) g^j‖∞ ≤ Σ_{j=1}^t (λ_j − ⌊λ_j⌋) ‖g^j‖∞ ≤ (3/4)·2t ≤ (3/2)r.

(2): The proof is similar to that of (1) by choosing a basic optimal solution x̄
first, and then letting x∗ := x̄ − w. □
Remark 9. For the mixed-integer case, we do not have a result like Proposition 3
for the optimal solution; so we cannot generalize Theorem 8 in such a direction.
Next we give an example to demonstrate that for Theorem 8, Part (1), the
bound of (3/2)r cannot be improved to better than r.
Example 10. Let

    G = [1 0 1; 1 1 0; 0 1 1],      G^{-1} = (1/2)[1 1 −1; −1 1 1; 1 −1 1],
    e_1 = (1, 0, 0)^T,              h = G^{-1} e_1 = (1/2, −1/2, 1/2)^T,

    Ḡ = [1 0 0 0; 1 1 0 1; 0 1 1 0; 0 0 1 1],
    Ḡ^{-1} = (1/2)[2 0 0 0; −1 1 1 −1; 1 −1 1 1; −1 1 −1 1],
    ē_1 = (1, 0, 0, 0)^T,           h̄ = Ḡ^{-1} ē_1 = (1, −1/2, 1/2, −1/2)^T,

and let

    A = [Ḡ 0 ⋯ 0 ē_1 ⋯ ē_1; 0 G ⋯ 0 e_1 ⋯ 0; ⋮ ⋱ ⋮; 0 0 ⋯ G 0 ⋯ e_1] ∈ Z^{(4+3p)×(4+4p)},    b = β·1 ∈ Z^{4+3p}.

Let B = diag{Ḡ, G, . . . , G}, and consider the basic feasible solution u = B^{-1}b =
(β, 0, β, 0, (β/2)·1^T)^T. For each integer solution x in P ∩ Z^{4+4p}, left-multiplying
Ax = b by B^{-1} gives

    [I_4 0 ⋯ 0 h̄ ⋯ h̄; 0 I_3 ⋯ 0 h ⋯ 0; ⋮ ⋱ ⋮; 0 0 ⋯ I_3 0 ⋯ h] x = u,

i.e.,

    (x_1, x_2, x_3, x_4)^T = (β, 0, β, 0)^T − Σ_{i=1}^p x_{3p+4+i} (1, −1/2, 1/2, −1/2)^T ∈ Z^4,
    (x_{3i+2}, x_{3i+3}, x_{3i+4})^T = (β/2)·1 − x_{3p+4+i} (1/2, −1/2, 1/2)^T ∈ Z^3,   i = 1, . . . , p.

Now, let p be an even integer, and let β be a large enough (> p) odd integer.
Then x_{3p+4+i} is odd for i = 1, . . . , p, because of the integrality of x_{3i+2}, which
implies x_{3p+4+i} ≥ 1. In this case, ‖x − u‖∞ ≥ |x_1 − u_1| = Σ_{i=1}^p x_{3p+4+i} ≥ p
for every feasible integer solution x. Note that because A has full row rank, r =
n − m = (4 + 4p) − (4 + 3p) = p.
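The small script below (an illustration added here, using the assumed values p = 2 and β = 5) numerically reproduces the quantities appearing in Example 10, namely h, h̄ and the basic solution u = B^{-1}b.

```python
# Numerical check of the matrices in Example 10 for the assumed small case
# p = 2, beta = 5; it verifies h, hbar and the basic solution u = B^{-1} b.
import numpy as np
from scipy.linalg import block_diag

G    = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)
Gbar = np.array([[1, 0, 0, 0], [1, 1, 0, 1], [0, 1, 1, 0], [0, 0, 1, 1]], dtype=float)
e1, ebar1 = np.eye(3)[:, 0], np.eye(4)[:, 0]

print(np.linalg.solve(G, e1))          # h    = ( 1/2, -1/2,  1/2)
print(np.linalg.solve(Gbar, ebar1))    # hbar = (1, -1/2, 1/2, -1/2)

p, beta = 2, 5
B = block_diag(Gbar, *([G] * p))       # basis matrix diag{Gbar, G, ..., G}
b = beta * np.ones(4 + 3 * p)
u = np.linalg.solve(B, b)
print(u)                               # (beta, 0, beta, 0, beta/2, ..., beta/2)
```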
3 Special Case: The Incidence Matrix of a Mixed Graph

Inspired by the proximity result for bimodular matrices (i.e., Δ(A) ≤ 2) in [9,12],
we consider a special case where A is the incidence matrix of a mixed graph.
Such an A has a 2-regular null space, but it is not generally bimodular (see [7]).
A mixed graph G = G(V, E+ , E− , A) has vertices V, positive edges E+ , negative
edges E− , and arcs A. An edge with identical endpoints is a (positive or negative)
loop. An arc may have one void endpoint, in which case it is a half arc. The
incidence matrix A of G has a row for each vertex and a column for each edge
and arc. For each positive (resp., negative) loop e = (v, v), Av,e = +2(−2). For
all other positive (resp., negative) edges e = (v, w), Av,e = Aw,e = +1(−1). For
each half arc a = (v, ∅) (respectively, a = (∅, w)), Av,a = +1 (Aw,a = −1). For
each arc a = (v, w), Av,a = −Aw,a = +1. All unspecified entries of A are zero.
Mixed graphs, and their incidence matrices, have been studied previously under
the names bidirected graphs and signed graphs (see [14]).
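As a concrete illustration of these rules (the specific small graph is an assumption made for the example), the incidence matrix of a mixed graph containing one edge or arc of each kind can be written down directly:

```python
# Illustration on an assumed small graph: the incidence matrix A of a mixed
# graph on vertices {1, 2, 3} following the rules above: a positive edge {1,2},
# a negative edge {2,3}, an arc (1,3), a half arc at vertex 2, and a positive
# loop at vertex 3.
import numpy as np

A = np.zeros((3, 5))
A[[0, 1], 0] = +1            # positive edge e = (1, 2):  A[1,e] = A[2,e] = +1
A[[1, 2], 1] = -1            # negative edge e = (2, 3):  A[2,e] = A[3,e] = -1
A[0, 2], A[2, 2] = +1, -1    # arc a = (1, 3):            A[1,a] = +1, A[3,a] = -1
A[1, 3] = +1                 # half arc a = (2, ∅):       A[2,a] = +1
A[2, 4] = +2                 # positive loop e = (3, 3):  A[3,e] = +2
print(A)
```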
For a mixed graph G(V, E+ , E− , A), we can construct an oriented signed graph
which has the same incidence matrix. The signed graph Σ consists of an unsigned
graph (V, {E+ , E− , A}), and an arc labelling σ, which maps E+ and E− to −1 and
maps A (except half arcs) to +1. Because there are in general two possibilities
for each column of the incidence matrix of a signed graph, the orientation is
chosen to make the incidence matrix the same as the mixed graph (see [14]).
For a cycle C = e1 e2 . . . ek not containing a half arc, if the product of the arc
labelling of the cycle σ(e1 )σ(e2 ) . . . σ(ek ) is +1, then the cycle is balanced, and
otherwise it is unbalanced. For C with some orientation, the incidence matrix is
    C = [  1         0        0     ⋯   −σ(e_k)
         −σ(e_1)     1        0     ⋯     0
           0      −σ(e_2)     1     ⋯     0                       (1)
           ⋮         ⋮        ⋱      ⋱     ⋮
           0         0        ⋯  −σ(e_{k−1})  1 ],

up to permutations of rows and columns. Clearly, det C = 1 − σ(e1 ) . . . σ(ek ).


For a balanced cycle, det C = 0, and the incidence matrix is not of full rank. For
an unbalanced cycle, det C = ±2, and the incidence matrix has full rank. The
following is easy to see via Cramer’s rule.
Proposition 11. Suppose that C is the incidence matrix of an unbalanced cycle,
then every element of C^{-1} is ±1/2.

Lemma 12. If full row-rank A is the incidence matrix of a mixed graph, and B
is a basis of A, then up to row/column rearrangement, B is block diagonal, with
each block being the incidence matrix of a quasitree, consisting of a spanning
tree, plus a half arc or an arc forming an unbalanced cycle (including the case
that the arc is a loop).

Lemma 12 follows from Theorem 5.1 (g) of [14], but [14] has another case for
a block of B—that it represents a spanning tree. But, in that case det(B) = 0.

Theorem 13. Let A be the incidence matrix of a mixed graph G such that for
every set S of vertex-disjoint unbalanced cycles in G, there is a partition S_1 ⊔ S_2 of
S such that each unbalanced cycle in S_1 has a half arc in G incident to the cycle
and S_2 has a perfect matching in G pairing these unbalanced cycles. Suppose that
P := {x : Ax = b, x ≥ 0} and P ∩ Z^n ≠ ∅. Then for each vertex u of P, there
exists y ∈ P ∩ Z^n satisfying ‖y − u‖∞ ≤ 1.

Proof. If u ∈ Z^n, then y = u ∈ P ∩ Z^n and the theorem holds. So we assume that
u ∉ Z^n and that B is a basis of A satisfying Bu_B = b, where u = [u_B; u_N] and
u_N = 0. By Lemma 12, with some rearrangement of the columns, we can assume
that B is a block-diagonal matrix with blocks B_1, . . . , B_s, where B_i represents
a quasitree containing an unbalanced cycle, i.e.,

    B_i = [C_i E_i; 0 D_i] ∈ {0, ±1}^{n_i×n_i},

where C_i ∈ {0, ±1}^{p_i×p_i} represents an unbalanced cycle, or B_i represents a
quasitree containing a half arc, i.e., |det B_i| = 1. Note that we also use B_i to denote
the column indices of each block. We have

    u_B = B^{-1}b = [B_1^{-1}b_1; B_2^{-1}b_2; . . . ; B_s^{-1}b_s] = [u_1; u_2; . . . ; u_s].

If u_i ∉ Z^{n_i}, then |det B_i| = 2, and there is an unbalanced cycle C_i in this block.
Similarly to the proof of Theorem 2 in [12], the lattice L generated by the columns
of B_i^{-1} can be divided into two classes: Z^{n_i} and u_i + Z^{n_i}. For any j ∈ [n_i], we have
r_j = B_i^{-1}e_j ∉ Z^{n_i}; otherwise 1^T B_i r_j = 1^T e_j, and the left-hand side is even because
1^T B_i ≡ 0 (mod 2), while the right-hand side is odd, a contradiction.
Therefore r_j ∈ u_i + Z^{n_i}. Also, for j ∈ [p_i], r_j = B_i^{-1}e_j = [C_i^{-1}(:, j); 0], implying
that the first p_i entries of r_j are 1/2 or −1/2 by Proposition 11.
Consider the set S of unbalanced cycles in the blocks for which u_i is not integer.
By the assumption, there is a partition S_1 ⊔ S_2 of S such that each unbalanced
cycle in S_1 has a half arc in G incident to the cycle and S_2 has a perfect matching
in G pairing these unbalanced cycles.
For each unbalanced cycle in S_1, assume that the cycle is in block B_i, and
that the half arc incident to it has a non-zero entry (±1) corresponding to the column
e_j (j ∈ [p_i]) in block B_i. The corresponding column of the half arc in A is
denoted A_{·t}. Let r_j = −B_i^{-1}e_j. We have u_i ± r_j ∈ Z^{n_i}. Because j ∈ [p_i], we
have ‖r_j‖∞ ≤ 1/2. Now choose r for this cycle as r_t = 1, r_B = −B^{-1}A_{·t}, and 0 otherwise.
We have Ar = 0, ‖r_B‖∞ ≤ 1/2, and u_B + r_B is integer in block B_i.
For each pair of unbalanced cycles in S_2, assume that the cycles are in blocks
B_{i_1} and B_{i_2}, and that the edge pairing them has two non-zero entries corresponding to
e_{j_1} (j_1 ∈ [p_{i_1}]) in block B_{i_1} and e_{j_2} (j_2 ∈ [p_{i_2}]) in block B_{i_2}.
The column of the pairing edge in A is denoted A_{·t}. Then let r_{j_1} = −B_{i_1}^{-1}e_{j_1}
and r_{j_2} = −B_{i_2}^{-1}e_{j_2}. We have u_{i_1} ± r_{j_1} ∈ Z^{n_{i_1}} and u_{i_2} ± r_{j_2} ∈ Z^{n_{i_2}}. Also,
because j_1 ∈ [p_{i_1}] and j_2 ∈ [p_{i_2}], we have ‖r_{j_1}‖∞ ≤ 1/2 and ‖r_{j_2}‖∞ ≤ 1/2. Now choose
r for this pair as r_t = 1, r_B = −B^{-1}A_{·t}, and 0 otherwise. We have Ar = 0,
‖r_B‖∞ ≤ 1/2, and u_B + r_B is integer in blocks B_{i_1} and B_{i_2}.
In this way, we can construct r^1, . . . , r^l for all unbalanced cycles in
S_1 and all pairs in S_2, and for each block in which u_i is not integer, there is only
one r^i that is non-zero in this block. Let y = u + r^1 + · · · + r^l; then y ∈ Z^n,
0 ≤ (y − u)_N ≤ 1, and (y − u)_B ≥ −1/2. Because u ≥ 0, we have y_N ≥ u_N ≥ 0 and
y_B ≥ u_B − 1/2 ≥ −1/2, which implies y_B ≥ 0 because of the integrality of y.
Therefore y ∈ P ∩ Z^n and y satisfies ‖y − u‖∞ ≤ 1. □

References
1. Aliev, I., Henk, M., Oertel, T.: Distances to lattice points in knapsack polyhedra.
arXiv preprint arXiv:1805.04592 (2018)
2. Cook, W., Gerards, A.M., Schrijver, A., Tardos, É.: Sensitivity theorems in integer
linear programming. Math. Program. 34(3), 251–264 (1986)
3. Eisenbrand, F., Weismantel, R.: Proximity results and faster algorithms for integer
programming using the Steinitz lemma. In: SODA. pp. 808–816 (2018)
4. Granot, F., Skorin-Kapov, J.: Some proximity and sensitivity results in quadratic
integer programming. Math. Program. 47(1–3), 259–268 (1990)
5. Hochbaum, D.S., Shanthikumar, J.G.: Convex separable optimization is not much
harder than linear optimization. J. ACM 37(4), 843–862 (1990)
6. Lee, J.: Subspaces with well-scaled frames. Ph.D. dissertation, Cornell University
(1986)
7. Lee, J.: Subspaces with well-scaled frames. Linear Algebra Appl. 114, 21–56 (1989)
8. Lee, J.: The incidence structure of subspaces with well-scaled frames. J. Comb.
Theory Ser. B 50(2), 265–287 (1990)
9. Paat, J., Weismantel, R., Weltge, S.: Distances between optimal solutions of
mixed-integer programs. Math. Program. https://doi.org/10.1007/s10107-018-
1323-z (2018)
10. Rockafellar, R.T.: The elementary vectors of a subspace of Rn . In: Combinatorial
Mathematics and Its Applications, pp. 104–127. University of North Carolina Press
(1969)
11. Schrijver, A.: Theory of Linear and Integer Programming. Wiley (1998)
12. Veselov, S.I., Chirkov, A.J.: Integer program with bimodular matrix. Discret.
Optim. 6(2), 220–222 (2009)
13. Werman, M., Magagnosc, D.: The relationship between integer and real solutions
of constrained convex programming. Math. Prog. 51(1), 133–135 (1991)
14. Zaslavsky, T.: Signed graphs. Discrete Appl. Math. 4(1), 47–74 (1982)
On Solving Nonconvex MINLP Problems
with SHOT

Andreas Lundell¹(B) and Jan Kronqvist²

¹ Faculty of Science and Engineering, Mathematics and Statistics,
Åbo Akademi University, Turku, Finland
andreas.lundell@abo.fi
² Department of Computing, Imperial College London, London, UK
j.kronqvist@imperial.ac.uk

Abstract. The Supporting Hyperplane Optimization Toolkit (SHOT)


solver was originally developed for solving convex MINLP problems, for
which it has proven to be very efficient. In this paper, we describe some
techniques and strategies implemented in SHOT for improving its per-
formance on nonconvex problems. These include utilizing an objective
cut to force an update of the best known solution and strategies for
handling infeasibilities resulting from supporting hyperplanes and cut-
ting planes generated from nonconvex constraint functions. For convex
problems, SHOT is guaranteed to find the global optimum, but for
general nonconvex problems it will only be a heuristic. However, uti-
lizing some automated transformations it is actually possible in some
cases to reformulate all nonconvexities into linear form, ensuring that
the obtained solution is globally optimal. Finally, SHOT is compared to
other MINLP solvers on a few nontrivial test problems to illustrate its
performance.

Keywords: Nonconvex MINLP · Supporting Hyperplane


Optimization Toolkit (SHOT) · Reformulation techniques · Feasibility
relaxation

1 Introduction
Mixed-integer nonlinear programming (MINLP) constitutes a difficult class of
mathematical optimization problems. As MINLP combines the combinatoric
nature of mixed-integer linear programming (MILP) and nonlinearities of non-
linear programming (NLP), there is still today often a practical limit on the
size of the problems (with respect to number of constraints and/or variables)
that can be solved. While this limit is constantly pushed forward through the
means of computational and algorithmic improvement, there are still MINLP
problems with only a few variables that are difficult to solve. Most of these cases
are nonconvex problems, i.e., MINLP problems with either a nonconvex objec-
tive function or one or more nonconvex constraints, e.g., a nonlinear equality
constraint.
Globally solving convex MINLP problems can nowadays be regarded almost
as a technology, as seen in a recent benchmark [10]. However, global nonconvex
MINLP is still very challenging. Solvers for this problem class include Antigone
[18], BARON [21], Couenne [1] and SCIP [5]. These solvers mostly rely on spatial
branch and bound, where convex underestimators and concave overestimators are
refined in nodes in a branching tree. There are also reformulation techniques that
can transform special cases of nonconvex problems, e.g., signomial [14] or general
twice-differentiable [16] ones, into convex MINLP problems that can then be solved
with convex solvers.
The requirement to solve the MINLP problem to guaranteed optimality or
having tight bounds on the best possible solution may be a nice bonus, but for
many real-world cases, it is not always a necessity or even possible. Often end
users of optimization software are mostly interested in finding a good-enough
feasible solution to the optimization problem at hand within a reasonable time.
For these use-cases, local MINLP solvers may be worth considering. Here, a local
solver for MINLP is defined as a solver that, while it solves convex problems
to global optimality, cannot guarantee that a solution is found for a nonconvex
problem, let alone a locally or globally optimal one. However, local solvers are
often faster than global solvers, and in many cases they also manage to return
the global solution, or a very good approximation of it. Local MINLP solvers
include AlphaECP [13], Bonmin [2], DICOPT [6] and SBB [4]. SHOT is a new
local solver initially intended mainly for convex MINLP problems [15].
In this paper, the following general type of MINLP problem is considered:

    minimize    f(x),
    subject to  Ax ≤ a,  Bx = b,
                g_k(x) ≤ 0          ∀k ∈ K_I,
                h_k(x) = 0          ∀k ∈ K_E,                        (1)
                x̲_i ≤ x_i ≤ x̄_i     ∀i ∈ I = {1, 2, . . . , n},
                x_i ∈ R, x_j ∈ Z    ∀i, j ∈ I, i ≠ j.

Here, the nonlinear functions f , g and h are considered to be differentiable,


but we set no restriction on the convexity of the functions. In this paper, as
is often the case in MINLP, we refer to solutions that fulfill all the constraints
in problem (1) as primal solutions, and the objective function value of the best
known primal solution, the lowest value in case of a minimization problem, as
the primal bound. A dual solution will then be a solution to a relaxed version
of problem (1), e.g., where the integer or nonlinear constraints are ignored. The
dual bound will then correspond to the best possible value the objective function
can take and the goal in global optimization is to reduce the gap between the dual
and primal bound to zero. Most deterministic global MINLP solvers will return
both a primal and a dual bound, while a local solver normally only provides a
primal bound.
2 The SHOT Solver


SHOT is an open source solver,1 combining a primal and a dual strategy as
described in detail in [15]. The dual strategy is based on the extended support-
ing hyperplane (ESH) [11] and extended cutting plane (ECP) [20] algorithms.
These iteratively improve the polyhedral outer approximation, in the form of a
MILP problem of the nonlinear feasible set in the MINLP problem by adding
linear cuts in the form of supporting hyperplanes or cutting planes. The dual
strategy is tightly integrated with the underlying MILP solver (CPLEX, Gurobi
or Cbc). The primal strategy in SHOT includes several (deterministic) heuristics
such as solving NLP subproblems with fixed-integer values or utilizing alterna-
tive solutions in the MILP solver’s solution pool, to find solutions fulfilling all
constraints in the MINLP problem. Including other primal strategies such as the
center-cut algorithm [9] is also planned.

Example 1. In Fig. 1, it is illustrated how the original nonconvex strategy in


SHOT works on a simple MINLP problem of two variables (2 ≤ x1 ≤ 8 and x2 ∈
{0, 1, 2}). The objective is to minimize f (x1 , x2 ) = x1 − x2 . There are two linear
constraints l1 (x1 , x2 ) ≤ 0 and l2 (x1 , x2 ) ≤ 0. The problem also has a nonconvex
feasible set resulting from the intersection of two nonlinear constraints.

Figure 1 illustrates that SHOT is not well-equipped for solving nonconvex


problems, and to prepare SHOT for solving these, some additional strategies
and techniques explained in the following sections are required. These include:
a heuristic for repairing infeasibilities that may appear when generating cuts
for nonconvex constraints and a strategy for forcing solution updates in case of
convergence to a local optima. In addition, some reformulations implemented in
SHOT, which exactly linearize certain classes of functions, are detailed in Sect. 4.
These reformulations have the advantage that if all nonconvex nonlinearities can
be handled in this way, the solution returned by SHOT will actually be the
guaranteed global solution.

3 SHOT as a Local Solver for Nonconvex MINLP


Problems

As could be seen in Example 1, the main issue is that cuts generated exclude
viable solution candidates. In general, a simple solution to this shortcoming is to
generate fewer and less tight hyperplane cuts, e.g., by generating cutting planes
(ECP) instead of supporting hyperplanes (ESH), or by reducing the emphasis
given to generating cuts for nonconvex constraints. Also, since it is nontrivial,
or in many cases not even possible, to obtain the integer-relaxed interior point
required in the ESH algorithm, the ECP method might in general be a safer
choice in the dual strategy in SHOT; this is however not considered in this
paper.
¹ SHOT is available at https://www.github.com/coin-or/shot.
[Fig. 1: three panels of the (x1, x2)-plane showing the linear constraints l1, l2 and the supporting hyperplane c1; see the caption below.]

Fig. 1. In the figures the shaded area indicate the integer-relaxed feasible region of the
MINLP problem. In the figure to the left, the MILP problem in the first iteration, with
the feasible set defined by the variable bounds and the two original linear constraints
l1 and l2 , has been solved to obtain the solution point (2, 2). In the middle figure, a
root search is performed according to the ESH algorithm between this point and an
interior point (6.0, 0.4) of the integer-relaxed feasible region of the MINLP problem
to give a point on the boundary. A supporting hyperplane (c1 ) is then added to the
MILP problem. In the figure to the right, the integer-relaxed feasible region of the
updated MILP problem is shown. Since all integer-feasible solutions have been cut off,
the problem is infeasible. We cannot continue after this and no primal solution is found.

MINLP problems containing nonlinear equality constraints also pose a prob-


lem; for example, they have no interior points. Currently the strategy is to replace
constraints h(x) = 0 with (h(x))2 ≤ 0, which are better suited for the dual strat-
egy. Other reformulations are also utilized, such as partitioning terms in sepa-
rable objective or constraint functions. This is explained in [12] for the convex
case, but the principle is the same also for nonconvex functions. Only the dual
problem is reformulated, the original problem formulation is still used in the
primal strategy. The original MINLP problem is also used for checking feasibil-
ity of solution candidates, since utilizing reformulated constraints could reduce
accuracy.
A simple technique to improve the performance of SHOT for nonconvex prob-
lems is to increase the effort put on the primal strategy. Currently, this can be
accomplished by (i) solving more integer-fixed NLP problems and (ii) increas-
ing the maximum number of solutions saved in the MILP solver’s solution pool
and changing the strategy of the MILP solver to put more emphasis on finding
alternative solutions. In addition, after an integer combination has
been tested with the fixed NLP strategy, an integer cut is introduced, similarly
to what is done in DICOPT. Note, however, that in SHOT this is currently not
possible for problems with nonbinary integer variables.

3.1 Repairing Infeasibilities in the Dual Problems


For nonconvex problems, cuts added for nonconvex constraints normally tend
to make the dual problem infeasible as more and more cuts are added. At this
stage, if a primal solution has been found we can terminate with this solution.
One alternative strategy would also be to remove some of the cuts added until
the problem is feasible again, but this can eliminate the effect of cuts added
in previous iterations and result in cycling. SHOT uses a different approach,
however, where the MILP problem is relaxed to restore feasibility. The same
approach was successfully used in [8], where an implementation of the ECP
algorithm in Matlab was connected to the process simulation tool Aspen to
solve simulation based MINLP problems. A similar technique was also used in
[7] to determine the feasibility of problems. To find the relaxation needed to
restore feasibility, the following MILP problem is solved

    minimize    vᵀr
    subject to  Ax ≤ a,  Bx = b,
                C_k x + c_k ≤ r,  r ≥ 0,                             (2)
                x̲_i ≤ x_i ≤ x̄_i     ∀i ∈ I = {1, 2, . . . , n},
                x_i ∈ R, x_j ∈ Z    ∀i, j ∈ I, i ≠ j.

Here the matrix Ck and vector ck contains the cuts added until the current
iteration k. The vector r contains the relaxations for restoring feasibility and
the vector v is used to individually penalize the relaxations of the different cuts.
The main strategy is to penalize the relaxation of the last cuts stronger than
the relaxation of the cuts from the early iterations. This will favor a relaxation
of the early cuts and reduce the risk of cycling, i.e., first adding the cuts in
one iteration and removing them in the next. The penalty terms in SHOT are
currently determined as vᵀ = [1, 2, . . . , N], where N is the total number of cuts
added. After the relaxation problem (2) is solved, the MILP model is modified
as Ck x + ck ≤ τ r, where τ > 1 is a parameter to relax the model further.
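The toy sketch below (its data and the use of SciPy are assumptions for illustration only; SHOT itself relies on the MILP solver's built-in repair functionality, and the integrality of x is dropped here to keep the sketch to a plain LP) shows the effect of the weighted relaxation in (2): with v = [1, 2], restoring feasibility by relaxing the older cut is cheaper than relaxing the newer one.

```python
# Sketch with toy data (not SHOT's code): the feasibility-restoration LP (2).
# Two conflicting cuts C_k x + c_k <= 0 are relaxed by r >= 0, and the penalty
# vector v = [1, 2, ..., N] makes relaxing older cuts cheaper than newer ones.
import numpy as np
from scipy.optimize import linprog

# cuts:  x >= 6 (added first) and x <= 4 (added later), written as C x + c <= r
C = np.array([[-1.0], [1.0]])
c = np.array([6.0, -4.0])
v = np.array([1.0, 2.0])                  # penalties: older cut 1, newer cut 2

# LP variables z = (x, r1, r2): minimize v^T r  s.t.  C x + c - r <= 0, r >= 0
A_ub = np.hstack([C, -np.eye(2)])
b_ub = -c
res = linprog(c=np.concatenate([[0.0], v]), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (0.0, None), (0.0, None)])
x, r = res.x[0], res.x[1:]
print("x =", x, "relaxations r =", r)     # the older cut (x >= 6) gets relaxed
```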
Both CPLEX and Gurobi have functionality built in to find a feasibility
relaxation of an infeasible problem, and this functionality is directly utilized in
SHOT. As Cbc lacks this functionality, repairing infeasible cuts is currently
not supported for this choice of subsolver.
Whenever the MILP solver returns with a status of infeasible, the repair
functionality of SHOT tries to modify the bounds in the constraints causing
the infeasibility. We do not want to modify the linear constraints that originate
from the original MINLP problem nor the variable bounds. In the MILP solvers’
repair functionality, it is possible to either minimize the sum of the numerical
modifications to the constraint bounds or the number of modifications; here
we have used the former. If it was possible to repair the problem, SHOT will
continue to solve the problem normally, otherwise SHOT will terminate with the
currently best known primal solution (if found).
In Fig. 2, we apply the repair functionality to the problem in Example 1.

3.2 Forcing Primal Updates Using a Cutoff Constraint

As seen in Fig. 2, the dual strategy in SHOT can get stuck in suboptimal solu-
tions. To try to force the dual strategy to search for a better solution, a primal
[Fig. 2: three panels of the (x1, x2)-plane showing the cuts c1 and c2; see the caption below.]

Fig. 2. The repair functionality is now applied to the infeasible MILP problem illus-
trated on the left so that the constraint c1 is relaxed and replaced with c2 . Thus,
an integer feasible solution (7.8, 1) to the updated MILP problem can be obtained as
illustrated in the middle and right figures.

[Fig. 3: three panels of the (x1, x2)-plane showing the cuts c2, c3 and the primal cut p; see the caption below.]

Fig. 3. A primal cut p is now introduced to the MILP problem in the left figure, which
makes the problem infeasible as shown in the middle figure. The previously generated
cut c2 is therefore relaxed by utilizing the technique in Sect. 3.1 and replaced with c3,
allowing the updated MILP problem to have a solution (2.3, 1). This solution is better
than the previous primal solution (7.8, 1)! After this, we can continue to generate more
supporting hyperplanes to try to find an even better primal solution.

objective cut of the type f (x) ≤ γ is introduced (for a minimization problem).


The right-hand side γ must force a solution to the dual MIP problem that is better
than the current best known primal bound PB.
Whenever SHOT has found a solution that it believes is the global solution,
it modifies the objective cut, so that its right-hand-side is less than the current
primal bound. The dual problem is then resolved with the MILP solver. The
problem will then either be infeasible (in which case the repair functionality
discussed in Sect. 3.1 will try to repair the infeasibility), or a new solution with
better objective value will be found. Note however, that this solution does not
need to be a new primal solution to the MINLP problem, since it is not required
to fulfill the nonlinear constraints, only their linearizations through hyperplane
cuts that have been included in the MILP problem. This procedure is then
repeated a user-defined number of times. Also, an objective cut update is forced

if a new primal bound has not been found in a specified number of iterations in
SHOT. This procedure, applied to Example 1, is exemplified in Fig. 3.

4 Automated Reformulations for Linearizing Special Terms
Certain nonlinear terms in nonconvex MINLP problems can always be written in
an exact linear form by introducing new (binary) variables and linear constraints.
Depending on whether these are the only nonlinearities or nonconvexities in the
problem, the reformulated problem may either be a MILP problem or a convex
MINLP, both of which can then be solved to global optimality using SHOT.
The nonlinear terms that can currently be automatically reformulated in SHOT
are: bilinear terms with one or two integer variables and monomials with only
binary variables. Note that there are other, possibly more efficient, reformulations
available than those mentioned here; however, at this stage, this has not been fully
investigated.
Reformulating bilinear terms with at least one binary variable. A product
x_i x_j of a binary variable x_i and a continuous or discrete variable x_j, where
0 ≤ x̲_j ≤ x_j ≤ x̄_j, can be exactly represented by replacing the term with the
auxiliary variable w and introducing the following linear constraints:

    x̲_j x_i ≤ w ≤ x̄_j x_i,    w ≤ x_j + x̄_j(1 − x_i)    and    w ≥ x_j − x̄_j(1 − x_i).
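A brute-force check of this linearization on a grid of values (an illustration added here, with hypothetical bounds) confirms that the three constraints pin w to the product x_i x_j:

```python
# Quick check (illustrative) that the three linear constraints above force
# w = xi * xj when xi is binary and 0 <= xlo <= xj <= xup.
import numpy as np

xlo, xup = 1.0, 4.0
for xi in (0, 1):
    for xj in np.linspace(xlo, xup, 7):
        w_lower = max(xlo * xi, xj - xup * (1 - xi))   # lower limits on w
        w_upper = min(xup * xi, xj + xup * (1 - xi))   # upper limits on w
        assert np.isclose(w_lower, w_upper) and np.isclose(w_lower, xi * xj)
print("linearization is exact on the sampled points")
```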
Reformulating bilinear terms of two integers. A product c·x_i x_j, where c
is a real coefficient, of two integer variables with bounds 0 ≤ x̲_i ≤ x_i ≤ x̄_i and
0 ≤ x̲_j ≤ x_j ≤ x̄_j, can be exactly represented by replacing the term with an
auxiliary variable w. First, binary variables b_k are introduced for each discrete
value that x_i can assume. These variables are furthermore constrained by

    Σ_{k=x̲_i}^{x̄_i} b_k = 1    and    x_i = Σ_{k=x̲_i}^{x̄_i} k·b_k.

It is beneficial to introduce the set of binaries for the variable with the smaller
domain. Also, depending on whether c is negative or positive, one of the following
sets of constraints is required:

    ∀k = x̲_i, . . . , x̄_i:    w − k·x_j + x̄_i x̄_j·b_k ≤ x̄_i x̄_j    if c > 0,
                              w − k·x_j − x̄_i x̄_j·b_k ≤ −x̄_i x̄_j   if c < 0.

Reformulating monomials of binary variables. A monomial term x_1 · · · x_N,
where all x_i are binary, can be reformulated into linear form by replacing the
term with the auxiliary variable w that assumes the value one if all x_i are one
and zero otherwise. The relationship between w and the x_i's can be modeled as

    N·w ≤ Σ_{i=1}^{N} x_i ≤ w + N − 1.
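A quick enumeration over all binary points (added here for illustration) verifies that, together with w ∈ {0, 1}, these two inequalities force w to equal the monomial:

```python
# Quick check (illustrative): with binary w, the two inequalities force w to
# equal the product x1 * ... * xN of binary variables.
from itertools import product

N = 4
for xs in product((0, 1), repeat=N):
    target = int(all(xs))
    admissible = [w for w in (0, 1) if N * w <= sum(xs) <= w + N - 1]
    assert admissible == [target]
print("monomial linearization verified for all binary points")
```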
Table 1. The table shows the relative gaps (in %) as well as solution times (in seconds) for the global solvers Antigone and BARON and
the local solvers SHOT, Bonmin and DICOPT. For the global solvers, both the dual (DG) and primal (PG) gaps are shown, and for the
local solvers only the primal gap. Since SHOT is able to reformulate some of the instances into MILP form using the reformulations in
Sect. 4, the dual gaps are shown in these cases. A zero in the gap columns means that the gap is less than the termination criteria 0.1%,
and ‘∞’ that the respective bound was not found. The time limit used was 1800 s. The gaps were calculated with PAVER [3].

Instance                  Antigone        BARON           SHOT (nonconvex)  SHOT (convex)  Bonmin      DICOPT
                          DG  PG  Time    DG  PG  Time    DG  PG  Time      PG  Time       PG  Time    PG  Time
autocorr bern25-13 10 0 1800 0 0 1096 0 0 630 4.7 1800 1.2 0.1 2.1 0.1
blend721 0 0 119 0 0 15 14 30 ∞ 1800 0 298 ∞ 569
carton7 0 0 470 6.5 0 1800 62 31 ∞ 1800 ∞ 1800 ∞ 1800
cecil 13 0 0 49 0 0 44 23 9.1 11 1.4 0 1800 ∞ 1800
edgecross14-078 11 0 1800 0 0 1518 >100 8.3 1800 8.4 1800 4.7 1.0 0 0.1
gasnet >100 0.7 1800 >100 0 1800 0 120 ∞ 0.4 ∞ 22 0 0.1
graphpart 3pm-0444-0444 0 0 299 7.7 0 1800 0 0 7.6 0 7.5 8.3 2.3 12 0.1
multiplants mtg1a 12 0 1800 0 0 440 0.7 577 ∞ 0.1 ∞ 23 0.7 0.2
nous1 0 0 231 0 0 161 0 1.1 ∞ 0.2 46 0.5 ∞ 0.1
oil ∞ ∞ 125 24 0 1800 9.5 82 ∞ 7.1 9.5 247 9.5 0.8
radar-3000-10-a-8 lat 7 >100 >100 1800 >100 >100 1800 >100 397 >100 1800 >100 1800 ∞ 2.0
sfacloc2 3 80 4.0 0.2 1800 0 0 43 1.3 1650 >100 1.2 2.0 1800 ∞ 0.1
sonet18v6 0 0 1094 0 0 134 0 0 176 0 207 0 693 ∞ 0.1
sonetgr17 9.1 0 1800 0 0 543 0 0 0.7 0 2.7 14 1800 26 32
sporttournament20 2.6 0 1800 0 0 15 0 0 3 0 3.0 5.5 0.5 20 0.1
squfl030-150persp 69 13 1800 8.5 0.3 1800 0 1800 >100 1800 ∞ 7.5 >100 8.8
sssd12-05persp 4.4 0 1800 38 0.2 1800 36 1 36 1.0 0 1800 0.3 3.5
tln4 0 0 0.9 0 0 1.1 0 0 1.2 ∞ 0.1 3.6 1800 37 0.6
tln5 0 0 0.3 0 0 12 32 0 1800 ∞ 0.1 3.9 1800 ∞ 3.7
tln6 0 0 0.5 0 0 2.0 37 0 1800 0.1 1.3 1800 76
On Solving Nonconvex MINLP Problems with SHOT

∞ ∞
tln7 0 0 608 0 0 648 >100 0 1800 ∞ 0.1 ∞ 3.9 ∞ 0.1
wastepaper4 27 0 1800 0 0 116 0 1800 >100 0.7 12 310 >100 10
455

waterno2 03 19 0 1800 0 0 795 0 11 ∞ 5.0 0 0.1 >100 0.2



5 Some Numerical Tests


To demonstrate the enhancements to SHOT detailed in the previous sections, we
have applied its convex and new nonconvex strategy to some nontrivial instances
selected from MINLPLib [17]. More precisely, these are the nonconvex MINLP
problems in the benchmark [19]. Additional tln*-instances were also included.
SHOT was also compared to the local solvers Bonmin (with its BB-strategy)
and DICOPT, as well as the global solvers Antigone and BARON in GAMS
25.1.2. Due to its automatic reformulations, SHOT acts as a global solver for some
of the instances and as a local solver for the others. The solvers used default settings except
for DICOPT, where the maximum number of cycles was set to 1000 to prevent
early termination. The comparison with the global solvers is not entirely fair,
since they actually do more work than the local solvers when proving optimality
of the solutions. However, we have mainly included them to indicate the diffi-
culty of the test set. The results in Table 1 show that Antigone and BARON
are good at finding the global primal solution, but that proving optimality is
time-consuming. Bonmin fails to find primal solutions on many instances, and
DICOPT seems to terminate too quickly on many problems, which could per-
haps have been prevented by increasing the maximum number of cycles further.
As expected, the nonconvex strategy in SHOT performs much better than the
convex one. Also, the reformulations in Sect. 4 enabled SHOT to find the right
primal solution for many instances and SHOT was even faster than BARON and
Antigone in some of these.

6 Conclusions
In this paper, some functionality for improving the stability of SHOT for non-
convex MINLP was described. With these modifications, the performance on
nonconvex problems is very good compared to the other local MINLP solvers
considered. The steps illustrated in this paper are, however, only a starting
point for further development, and the goal is to significantly increase the types
of MINLP problems that can be solved to global optimality by SHOT. To this
end, we intend to include convexification techniques based on lifting reformula-
tions for signomial and general twice-differentiable functions based on [14]. For
problems with a low to moderate number of nonlinearities, this might prove to
be a viable alternative to spatial branch and bound solvers.

References
1. Belotti, P., Lee, J., Liberti, L., Margot, F., Wächter, A.: Branching and bounds
tightening techniques for non-convex MINLP. Optim. Methods Softw. 24, 597–634
(2009)
2. Bonami, P., Lee, J.: BONMIN user’s manual. Numer. Math. 4, 1–32 (2007)
3. Bussieck, M.R., Dirkse, S.P., Vigerske, S.: PAVER 2.0: an open source environment
for automated performance analysis of benchmarking data. J. Glob. Optim. 59(2),
259–275 (2014)
4. GAMS: Solver manuals (2018). https://www.gams.com/latest/docs/S MAIN.html


5. Gleixner, A., Bastubbe, M., Eifler, L., Gally, T., Gamrath, G., Gottwald, R.L.,
Hendel, G., Hojny, C., Koch, T., Lübbecke, M.E., Maher, S.J., Miltenberger, M.,
Müller, B., Pfetsch, M.E., Puchert, C., Rehfeldt, D., Schlösser, F., Schubert, C.,
Serrano, F., Shinano, Y., Viernickel, J.M., Walter, M., Wegscheider, F., Witt,
J.T., Witzig, J.: The SCIP Optimization Suite 6.0. Technical report, Optimiza-
tion Online (July 2018)
6. Grossmann, I.E., Viswanathan, J., Vecchietti, A., Raman, R., Kalvelagen, E., et al.:
GAMS/DICOPT: a discrete continuous optimization package. GAMS Corporation
Inc (2002)
7. Guieu, O., Chinneck, J.W.: Analyzing infeasible mixed-integer and integer linear
programs. INFORMS J. Comput. 11(1), 63–77 (1999)
8. Javaloyes-Antón, J., Kronqvist, J., Caballero, J.A.: Simulation-based optimization
of chemical processes using the extended cutting plane algorithm. In: Friedl, A.,
Klemeš, J.J., Radl, S., Varbanov, P.S., Wallek, T. (eds.) 28th European Symposium
on Computer Aided Process Engineering, Computer Aided Chemical Engineering,
vol. 43, pp. 463–469. Elsevier (2018)
9. Kronqvist, J., Bernal, D., Lundell, A., Westerlund, T.: A center-cut algorithm for
quickly obtaining feasible solutions and solving convex MINLP problems. Comput.
Chem. Eng. (2018)
10. Kronqvist, J., Bernal, D.E., Lundell, A., Grossmann, I.E.: A review and comparison
of solvers for convex MINLP. Optim. Eng. 1–59 (2018)
11. Kronqvist, J., Lundell, A., Westerlund, T.: The extended supporting hyperplane
algorithm for convex mixed-integer nonlinear programming. J. Glob. Optim. 64(2),
249–272 (2016)
12. Kronqvist, J., Lundell, A., Westerlund, T.: Reformulations for utilizing separability
when solving convex MINLP problems. J. Glob. Optim. 1–22 (2018)
13. Lastusilta, T.: GAMS MINLP solver comparisons and some improvements to the
AlphaECP algorithm. Ph.D. thesis, Åbo Akademi University (2011)
14. Lundell, A., Westerlund, T.: Solving global optimization problems using refor-
mulations and signomial transformations. Comput. Chem. Eng. (2017). (available
online)
15. Lundell, A., Kronqvist, J., Westerlund, T.: The supporting hyperplane optimiza-
tion toolkit-a polyhedral outer approximation based convex MINLP solver utilizing
a single branching tree approach. Preprint, Optimization Online (2018)
16. Lundell, A., Skjäl, A., Westerlund, T.: A reformulation framework for global opti-
mization. J. Glob. Optim. 57(1), 115–141 (2013)
17. MINLPLib: Mixed-integer nonlinear programming library (2018). http://www.
minlplib.org/. Accessed 27 May 2018
18. Misener, R., Floudas, C.A.: ANTIGONE: algorithms for continuous/integer global
optimization of nonlinear equations. J. Glob. Optim. 59(2–3), 503–526 (2014)
19. Mittelmann, H.: Benchmarks for optimization software (2018). http://plato.asu.
edu/bench.html. Accessed 28 Jan 2019
20. Westerlund, T., Petterson, F.: An extended cutting plane method for solving con-
vex MINLP problems. Comput. Chem. Eng. 19, 131–136 (1995)
21. Zhou, K., Kılınç, M.R., Chen, X., Sahinidis, N.V.: An efficient strategy for the
activation of MIP relaxations in a multicore global MINLP solver. J. Glob. Optim.
70(3), 497–516 (2018)
Reversed Search Maximum Clique Algorithm
Based on Recoloring

Deniss Kumlander and Aleksandr Porošin

TalTech, Ehitajate tee 5, 19086 Tallinn, Estonia


{deniss.kumlander,aleksandr.porosin}@ttu.ee

Abstract. This work concentrates on finding a maximum clique in undirected,
unweighted graphs. The maximum clique problem is one of the best-known
NP-complete problems, the hardest problems in the class NP. A lot of other
problems can be transformed into the clique problem, therefore solving it, or at least
finding a faster algorithm for finding a maximum clique, will automatically help to solve lots
of other tasks. The main contribution of this work is a new exact algorithm for
finding a maximum clique, which works faster than any currently existing algo-
rithm on a wide variety of graphs. The main idea is to combine a number of
efficient improvements from different algorithms into a new one. At first sight
these improvements cannot cooperate, but a new approach of skipping
vertices from further expanding, instead of pruning the whole branch, allows
using all the upgrades at once. Step-by-step examples with
explanations demonstrate how to use the proposed algorithm.

Keywords: Graph theory · Maximum clique

1 Introduction

A graph G is a representation of objects, given by a set of vertices V, and of a number of
relationships between these objects, called edges, i.e. a set of edges E. The order of G is
the number of vertices in G and the number of edges is called the size of G. Therefore,
the order is |V| and |E| is equal to the size of G. If two vertices u and v are connected to each
other, they are called adjacent, e = uv ∈ E(G), and u and v are both incident to e. If
uv ∉ E(G), then u and v are nonadjacent. The number of adjacent vertices, or
neighbors, of a vertex is called the vertex degree deg(v). A vertex is called even or odd if
its degree is even or odd. The maximum vertex degree of a graph G is denoted Δ(G).
The vertex support is the sum of the degrees of all neighbors of a given vertex. Graphs can be
divided into directed and undirected. A directed graph D, i.e. a digraph, has non-symmetric
arcs (a directed edge is called an arc), which means that vertex u can have a relation to
vertex v while there might not be a relation from v to u. An undirected
graph, on the other hand, always has a symmetric relation between two connected vertices. Moreover, graphs are
divided into weighted and unweighted. A weight is a number (generally a non-negative
integer) assigned to each edge that can represent an additional property like the length of a
route, cost, required power, etc., depending on the problem context. The
edges of an unweighted graph do not have weights, or all their weights are equal to
one. A loop is an edge that connects a vertex to itself. A simple graph is an undirected
graph that does not contain any loops and has no more than one edge connecting
two vertices. It should be noted that in this paper we study only unweighted
simple graphs. An undirected graph where all the vertices are adjacent to each other is
called complete. Conversely, a graph with no edges is called edgeless; in other words, no
two vertices are adjacent to each other. A clique is a complete subgraph of a graph
G and an independent set is an edgeless subgraph of G. The complement graph G′ of a
simple graph G is a graph that has the same vertex set, but whose edge set consists only
of the edges that are not present in G: G′ = (V, K \ E), where K is the edge set consisting
of all possible edges. A vertex cover of a graph G is a vertex set such that each edge of
G is incident to at least one vertex from this set. Graph coloring is the process of assigning
labels, i.e. colors, to vertices with the special property that no two adjacent vertices can
share the same color. A color class is a set containing vertices with the same
color. It is clearly seen from the coloring property that each color class is nothing more
than an independent set. A graph is called k-colorable if it can be colored with k colors.
The minimum number of colors required for coloring a graph G is called the chromatic
number χ(G), and in this case the graph is called k-chromatic. The maximum clique problem is
the problem of finding a maximum complete subgraph of a graph G. Solving
the maximum clique problem, or improving algorithms for finding a maximum clique, will not
only improve one specific, narrow problem but help to find better algorithms for all the
problems reducible to the maximum clique problem.

2 Maximum Clique Algorithms Review


1. Carraghan and Pardalos (Carraghan and Pardalos 1990) – the classical branch and
bound algorithm;
2. Östergård's algorithm (Östergård 2002) – based on backtracking;
3. VColor-BT-u (Kumlander 2005). The idea is to apply an initial vertex coloring,
building on Östergård's algorithm (Östergård 2002). It operates not with single
vertices but with independent sets to find the coloring that is later used to build an
efficient bounding function.
4. MCQ – the MCQ algorithm was first introduced in 2003 by Tomita and Seki (Tomita
and Seki 2003) and later Tomita and Kameda revised it with more computational
experiments in 2007 (Tomita and Kameda 2007). This algorithm is based on the
Carraghan and Pardalos idea (Carraghan and Pardalos 1990). Tomita and Seki
noted that the number of vertices of a maximum clique ω(G) in a graph G = (V, E) is
always less than or equal to the maximum degree Δ(G) plus 1 (ω(G) ≤ Δ(G) + 1). Using
this property they reworked the existing pruning formula.
MCR – the article "An efficient branch-and-bound algorithm for finding a maximum clique
with computational experiments" published by Tomita and Kameda in 2007
(Tomita and Kameda 2007) introduced a new MCR algorithm, a successor of the MCQ
algorithm. Compared to the older version, MCR mainly focused on initial sorting
and color numbering. Branch processing, i.e. the EXPAND function, was not changed,
so we will spotlight only the modified features and skip all the steps inherited from
MCQ.
5. MCS – three years after MCR was released, a new improvement of the same
algorithm appeared, called MCS (Tomita et al. 2010). This time the authors focused on
approximate coloring enhancements.
6. MCS improved – the article "Improvements to MCS algorithm for the maximum clique
problem" was released in 2013 by Mikhail Batsyn, Boris Goldengorin,
Evgeny Maslov and Panos M. Pardalos (Batsyn et al. 2013). MCSI shows very good
results on dense graphs using a high-quality initial solution obtained by the ILS heuristic
algorithm.

3 New Algorithm

In this part we introduce a new algorithm for solving the maximum clique
problem. It is called VRecolor-BT-u, as it is a successor of the VColor-BT-u
algorithm and it applies recoloring on each depth. Multiple algorithms
were described previously in this work. The idea of the new one is to gather and combine all
the gained knowledge to speed up maximum clique finding even further.

3.1 Description
The main idea of the new algorithm is to combine reversed search by color classes (from
VColor-BT-u) and in-depth coloring, i.e. recoloring (from MCQ and its successors).
Before we start, some useful properties of the previous algorithms should be noted:
1. Reversed search by color classes means searching for a clique in a constantly
growing subgraph, adding the color classes one by one and holding a cache b[] for each
color class, where the cache is the maximum clique found for the given color class. First of
all, we consider a subgraph S1 consisting only of the vertices of the first color class C1.
After that, a subgraph S2 is created with the two color classes C1 and C2. In general,
Si = C1 ∪ C2 ∪ ... ∪ Ci.
2. The pruning formula for reversed search by color classes, d − 1 + b[C(v_{d_i})] ≤ |CBC|,
can be used only if the vertices in each subgraph Si are ordered by the initial color classes
(these color classes are used to construct a new subgraph on each iteration).
3. If vertices are ordered by their color numbers and are expanded starting from the
largest color number, then all the vertices with a color number lower than the threshold
th = |CBC| − (d − 1) can be ignored, as they will not be expanded because of the
pruning formula d − 1 + Max{No[p] | p ∈ R} ≤ |CBC|.
4. The pruning formula d − 1 + Max{No[p] | p ∈ R} ≤ |CBC| can be used when coloring is
reapplied on each depth and the vertices are reordered according to the new colors.

From this it is seen that properties 2 and 4 conflict with each other, as the two
pruning formulas require different vertex orderings. As a result, if both bounding rules
are used, we are going to miss some cliques when a promising branch is pruned.
To avoid such situations, the formula d − 1 + Max{No[p] | p ∈ R} ≤ |CBC| is used not
to prune the branch but to skip the current vertex, as expanding it is not going to give us a
better solution. This means that if vertices are recolored on each depth but are not
reordered according to the new colors, we can skip a vertex without expanding it if and
only if its color number is lower than the current threshold and there is no neighbor of
this vertex with a color number larger than the threshold that stands after the bound
obtained from the first pruning formula d − 1 + b[C(v_{d_i})] ≤ |CBC|.
There is an example in Fig. 1 that shows how the conflict between the two different
colorings is solved. Green lines show the adjacency of two vertices (not all the adjacent
vertices are marked with green lines, but only the two that are interesting for us in this
specific example). Let us assume that the current depth is 2 and we have the following
prerequisites:
• d = 2 (depth is 2)
• |CBC| = 3 (current best clique is 3)
• th = 3 − (2 − 1) = 2 (threshold taken from the skipping formula; we need to expand
vertices having a color number bigger than the threshold)
• b[1] = 1, b[2] = 2, b[3] = 3, b[4] = 3 (cache values from previous iterations)
• bnd = 2 (index of the rightmost vertex at which the pruning formula
d − 1 + b[C(v_{d_i})] ≤ |CBC| prunes the current branch)
• Ca – array storing the initial color classes, Cb – array storing the in-depth color classes
Fig. 1. Different coloring conflict in depth example

Let us analyze the current example (Fig. 1). We start with the rightmost vertex h
with in-depth color number 1 (No[h] = 1). We skip this vertex since its color
number is lower than the threshold (th = 2). As you can see, vertex h might be contained in
a larger clique, as it is connected with a vertex r (No[r] = 3), but we skip it anyway
because vertex r will be expanded later. Now we proceed with the next vertex t. The color
number of t is 1 (No[t] = 1), the same as vertex h has, but in this case it is not possible to
skip vertex t, because it is adjacent to vertex k (No[k] = 3). Vertex k stands after the
pruning bound (bnd = 2), therefore it will not be expanded at all. If we skipped vertex t right
now, we might possibly miss a larger clique; this means that vertex t should be expanded.
The next vertex to analyze is vertex a; we skip it, as its in-depth color number is equal to
the threshold (th = No[a] = 2) and there is no adjacent vertex standing after the bound.
The last expanded vertex on the current depth is r (No[r] = 3), as its color number is
larger than the threshold. It should be noted that skipped vertices are not thrown away
from further consideration (when building the next depth); they should be stored in a
separate array and added to the next depth with their order preserved. There is another
pruning formula used right after the recoloring is done. As we already know, the number of
color classes obtained by coloring the subgraph Gd is an upper bound for the maximum clique
in the current subgraph. This property allows us to use the following pruning formula:
d − 1 + cn ≤ |CBC|, where cn is the number of colors obtained from the recoloring.

3.2 Coloring Choice Based on Density


There are two coloring algorithms used in VRecolor-BT-u. They are both greedy, but
the first one uses swaps when coloring and the other one does not. Each time coloring
is applied, we need to determine which algorithm to use. Moreover, there are two places
where coloring is needed: the initial coloring, performed once at the beginning of
the algorithm, and the in-depth coloring, applied each time a new depth is constructed.
The choice of coloring algorithm is made according to the graph density using special constants:
0.35 density for the initial coloring and 0.55 density for the in-depth coloring. The
constants 0.35 and 0.55 were found experimentally.
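As a reference point, a minimal Python sketch of plain greedy sequential coloring is given below (illustrative only); it corresponds to the variant without swaps, while the swap-based variant and the density-driven choice between the two are not reproduced here.

```python
def greedy_coloring(vertices, adj):
    """Greedy sequential coloring: assign each vertex the smallest color
    not used by its already-colored neighbours.  'adj' maps a vertex to a
    set of neighbours.  Returns (color-number dict, list of color classes)."""
    color_of, classes = {}, []
    for v in vertices:
        c = 0
        while c < len(classes) and any(u in adj[v] for u in classes[c]):
            c += 1
        if c == len(classes):
            classes.append([])
        classes[c].append(v)
        color_of[v] = c + 1          # color numbers start at 1
    return color_of, classes
```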

3.3 Algorithm
CBC – current best clique, the largest clique found so far; d – depth; c – index of the
currently processed color class; di – index of the currently processed vertex on depth d;
b – array saving the maximum clique values for each color class; Ca – initial color classes
array; Cb – color classes array recalculated on each depth; Gd – subgraph of graph G
induced by the vertices on depth d; cn – number of color classes recalculated on each
depth; CanBeSkipped(v_{d_i}, c) – function that returns true if a vertex can be skipped without
expanding it.
1. Graph density calculation. If the graph density is lower than 35%, go to step 2a, else
go to step 2b.
2. Heuristic greedy vertex coloring. Two arrays are created to store the
initial color classes defined only once (Ca) and the color classes recalculated on each
depth (Cb). During this step both arrays must be equal.
a. Before coloring, the vertices are unordered and are colored with swaps.
b. Before coloring, the vertices are sorted in decreasing order with respect to their degree
and are colored without swaps.
3. Searching. For each color class, starting from the first one (current color class index c):
a. Subgraph (branch) building. Build the first depth by selecting all the vertices
from the color classes whose number is equal to or smaller than the current index c. Vertices
from the first color class should stand first. Vertices at the end should belong to
color class c.
b. Process the subgraph.
(1) Initialize depth. d = 1.
(2) Initialize current vertex. Set the current vertex index di to be expanded (initially
the first expanded vertex is the rightmost one): di = nd.
(3) Bounding rule check. Check whether the current branch can possibly contain a larger clique
than the one found so far. If Ca(v_{d_i}) < c and d − 1 + b[Ca(v_{d_i})] ≤ |CBC|, then
prune. Go to step 3.2.7.
(4) Vertex skipping check. Check whether the current vertex can possibly yield a larger clique
than the one found so far. If d − 1 + Cb(v_{d_i}) ≤ |CBC| and CanBeSkipped(v_{d_i}, c),
skip this vertex. Decrease the index i = i − 1. Go to step 3.2.3.
(5) Expand current vertex. Form a new depth by selecting all the vertices adjacent
(neighbors) to the current vertex v_{d_i} (G_{d+1} = N(v_{d_i})). Set the next
expanding vertex on the current depth: di = di − 1.
(6) New depth analysis. Check whether the new depth contains vertices.
i. If G_{d+1} = ∅, then check whether the current clique is the largest one; if so, it must be
saved. Go to step 3.3.
ii. If G_{d+1} ≠ ∅, then check the graph density. If the graph density is lower than
55%, apply greedy coloring with swaps to G_{d+1}, else use greedy coloring
without swaps. Save the number of color classes (cn) acquired by this
coloring. If the number of color classes cannot possibly give us a larger
clique, then prune: if d − 1 + cn ≤ |CBC|, decrease the index i = i − 1 and
go to step 3.2.3, else increase the depth d = d + 1 and go to step 3.2.2.
(7) Step back. Decrease the depth d = d − 1. Delete the expanding vertex from the
current depth. If d = 0, go to step 3.3, else go to step 3.2.3.
(8) Complete iteration. Save the current best clique value for this color: b[c] = |CBC|.
4. Return maximum clique. Return CBC.
CanBeSkipped function
th – threshold below which a vertex may be skipped; CBC – current best clique, the largest
clique found so far; d – depth; c – index of the currently processed color class; di –
index of the currently processed vertex on depth d; bnd – bound from which vertices
cannot be skipped; b – array saving the maximum clique values for each color class; Ca –
initial color classes array; Cb – color classes array recalculated on each depth.

1. Define threshold. th = |CBC| − d.
2. Find skipping bound. For each vertex index d_j from d_i − 1 down to 0: if Ca(v_{d_j}) < c and
b[Ca(v_{d_j})] ≤ th, then bnd = j.
3. Decide whether the vertex can be skipped. For each vertex adjacent to the currently expanded
one with index d_j from bnd down to 0: if Cb(v_{d_j}) > th, then return false. If Cb(v_{d_j}) > th
never occurred, return true.
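To make the overall search scheme more concrete, the following compact sketch (reusing the greedy_coloring function from the previous sketch) implements a plain coloring-bounded branch and bound in the spirit of MCQ; the reversed search by color classes, the density-dependent coloring choice and the vertex-skipping rule that distinguish VRecolor-BT-u are deliberately left out.

```python
def max_clique(vertices, adj):
    """Compact coloring-bounded branch and bound (MCQ-style sketch).
    Only the bound depth - 1 + color number <= |CBC| is illustrated."""
    best = []

    def expand(candidates, clique):
        nonlocal best
        if not candidates:
            if len(clique) > len(best):
                best = clique[:]
            return
        color_of, _ = greedy_coloring(candidates, adj)   # in-depth recoloring
        order = sorted(candidates, key=lambda v: color_of[v])
        while order:
            v = order.pop()                      # vertex with the largest color
            if len(clique) + color_of[v] <= len(best):
                return                           # coloring bound: prune branch
            clique.append(v)
            expand([u for u in order if u in adj[v]], clique)
            clique.pop()

    expand(list(vertices), [])
    return best
```

Usage: with adj given as a dict mapping each vertex to the set of its neighbors, max_clique(list(adj), adj) returns one maximum clique.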

4 Results

In this part we compare the new algorithm to all the previously described
ones. The following algorithms take part in the testing: Carraghan and Pardalos, Östergård,
VColor-u (Kumlander 2005), VColor-BT-u, MCQ, MCR, MCS, MCS Improved and
VRecolor-BT-u. The first part of the tests is devoted to randomly generated graphs. These
random tests give a general overview of the algorithms' performance and therefore show whether
the new algorithm is worth using for clique finding. All test cases are divided by
graph density, and for each density different algorithms are tested. Algorithms
that perform much worse than the others are removed from the test results. The second
part contains an analysis of the algorithms' results on DIMACS instances. Each DIMACS graph
has a special structure corresponding to some specific real problem. Four algorithms
were tested with this benchmark: MCS, MCSI, VColor-BT-u and VRecolor-BT-u.

4.1 Generated Test Results/Random Graphs


The following Figs. 2, 3 and 4 demonstrate that VRecolor-BT-u consumes less
time than the fastest of the remaining algorithms on sparse graphs, where the density is lower than
40%. On graphs where the density is very low (about 10%), the basic algorithms (Carraghan
and Pardalos, Östergård) show really good results, as they do not perform any addi-
tional operations like coloring, searching for an initial solution, reordering and so on.

Fig. 2. Randomly generated graphs test. Density 30%. (Time in ms versus the number of vertices for Östergård, VColor-BT-u, MCQ, MCR and VRecolor-BT-u.)

Fig. 3. Randomly generated graphs test. Density 50%. (Time in ms versus the number of vertices for MCQ, MCR, MCS, MCSI and VRecolor-BT-u.)



Fig. 4. Randomly generated graphs test. Density 90%. (Time in ms versus the number of vertices for MCR, MCS, MCSI and VRecolor-BT-u.)

The basic pruning formulas are really effective at such small densities. Nevertheless,
VRecolor-BT-u outperforms them, proving that the skipping technique has an overall pos-
itive impact even though the algorithm needs to spend time on coloring and on
proving that a vertex can be skipped. At densities from 20% to 40%, the results closest to
VRecolor-BT-u are those of MCQ and MCR, but the new algorithm performs about
20–25% faster. Note that some algorithms are missing at certain densities: we filter out
those whose performance is exceptionally bad.
Based on the randomly generated graph results we can conclude with the following
statements:
• Graphs with densities lower than 50% are best solved using the VRecolor-BT-u
algorithm.
• When the graph density is about 50%, three algorithms, MCQ, MCR and
VRecolor-BT-u, are the fastest, but their time consumption fluctuates a bit relative
to each other.
• If the density of the graph lies between 55% and 75%, then the VRecolor-BT-u algorithm is the
best choice.
• For dense graphs with a density of more than 75%, MCS Improved is the fastest algorithm.

4.2 DIMACS Test Results


Here we test four algorithms on the DIMACS graph instances: MCS, MCSI, VColor-BT-u
and VRecolor-BT-u. MCS and MCSI were chosen for testing because they have demon-
strated the best results on DIMACS instances of all modern algorithms. VColor-BT-u
is the predecessor of VRecolor-BT-u and is the best candidate to be compared with the new
algorithm. In general, the DIMACS instance tests confirm the results obtained from the randomly
generated graph testing. The VRecolor-BT-u algorithm works better at densities lower
than 75% (see Table 1).
Table 1. DIMACS graphs results. Time consumption (ms).

| Graph | Size | Density | MCS | MCSI | VColor-BT-u | VRecolor-BT-u |
|---|---|---|---|---|---|---|
| c-fat500-1 | 500 | 0.04 | 44 | 229 | 6 | 2 |
| c-fat500-10 | 500 | 0.37 | 190 | 175 | 18 | 136 |
| c-fat500-2 | 500 | 0.07 | 27 | 112 | 4 | 1 |
| c-fat500-5 | 500 | 0.19 | 62 | 100 | 6 | 11 |
| gen200_p0.9_44 | 200 | 0.90 | 3867 | 2103 | 140082 | 21045 |
| gen200_p0.9_55 | 200 | 0.90 | 8988 | 98 | 3650 | 2276 |
| hamming10-2 | 1024 | 0.99 | 496 | 50636 | 1290 | 61271 |
| hamming6-2 | 64 | 0.90 | 0 | 1 | 0 | 1 |
| hamming6-4 | 64 | 0.35 | 0 | 1 | 0 | 0 |
| hamming8-2 | 256 | 0.97 | 10 | 38 | 22 | 245 |
| hamming8-4 | 256 | 0.64 | 633 | 654 | 3 | 14 |
| johnson16-2-4 | 120 | 0.76 | 702 | 800 | 244 | 581 |
| johnson8-2-4 | 28 | 0.56 | 0 | 0 | 1 | 0 |
| johnson8-4-4 | 70 | 0.77 | 1 | 3 | 0 | 0 |
| keller4 | 171 | 0.65 | 85 | 97 | 133 | 73 |
| MANN_a27 | 378 | 0.99 | 7201 | 291385 | 68105 | 10231 |
| MANN_a9 | 45 | 0.93 | 0 | 2 | 0 | 0 |
| p_hat1000-1 | 1000 | 0.24 | 2592 | 2788 | 3540 | 2046 |
| p_hat300-1 | 300 | 0.24 | 35 | 78 | 15 | 12 |
| p_hat300-2 | 300 | 0.49 | 108 | 109 | 464 | 238 |
| p_hat300-3 | 300 | 0.74 | 17796 | 9200 | 161323 | 16421 |
| p_hat500-1 | 500 | 0.25 | 110 | 198 | 139 | 91 |
| p_hat500-2 | 500 | 0.50 | 4816 | 2613 | 24391 | 8539 |
| p_hat700-1 | 700 | 0.25 | 386 | 453 | 239 | 236 |
| san1000 | 1000 | 0.50 | 7270 | 4989 | 410 | 945 |
| san200_0.7_1 | 200 | 0.70 | 18 | 39 | 889338 | 1819 |
| san200_0.7_2 | 200 | 0.70 | 16 | 45 | 3 | 5 |
| san200_0.9_1 | 200 | 0.90 | 2166 | 38 | 250 | 51 |
| san200_0.9_2 | 200 | 0.90 | 265 | 35 | 2828 | 1402 |
| san400_0.5_1 | 400 | 0.50 | 57 | 341 | 42 | 29 |

5 Summary

The main goal of this study was to develop a new, improved algorithm for maximum
clique finding on undirected, unweighted graphs. The new maximum clique algorithm,
called VRecolor-BT-u, was presented. This algorithm is constructed on the basis of
reversed search by color classes. The main idea is to apply coloring on each depth to
preserve the most up-to-date color classes and to combine the updated vertex colors with the
reversed search approach. At first sight the idea of in-depth recoloring might be
unclear, as the reversed search is built around the initial color classes, but the introduction of the new
skipping technique instead of pruning allows avoiding this conflict. Furthermore, two
different greedy coloring algorithms (with swaps and without swaps) are used for the
initial and the in-depth coloring. Experimentally obtained constants, which depend on the graph
density, determine which coloring is applied. The new algorithm shows the best results
on random graphs with low density and loses only on dense graphs to the MCS and
MCSI algorithms specially designed for high densities. VRecolor-BT-u produces fewer
branches than its predecessor for all the DIMACS instances, but there are some cases
where the new algorithm consumes more time. A decreasing branch number resulting in
performance degradation might be misleading at first glance, but it can be explained by the
simple fact that in some special cases the additional in-depth recoloring consumes a lot of
time while the skipping technique is practically not working. As a result we get a
slightly lower branch number but increased time consumption. Finally, it was noted
that each graph should be solved by a different algorithm according to the graph's
density. At low to mid densities it is advised to use the VRecolor-BT-u algorithm,
while the best option for dense graphs is the MCS Improved algorithm.

References
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of
NP-Completeness. Freeman, New York (2003)
Cook, S.A.: The complexity of theorem proving procedures. In: Proceedings of the 3rd
Annual ACM Symposium on Theory of Computing, pp. 151–158 (1971)
Carraghan, R., Pardalos, P.M.: An exact algorithm for the maximum clique problem. Oper. Res.
Lett. 9, 375–382 (1990)
Östergård, P.R.J.: A fast algorithm for the maximum clique problem. Discret. Appl. Math. 120,
197–207 (2002)
Kumlander, D.: Some Practical Algorithms to Solve the Maximum Clique Problem. Tallinn
University of Technology, Tallinn (2005)
Clarkson, K.: A modification to the greedy algorithm for vertex cover. Inf. Process. Lett. 16(1),
23–25 (1983)
Andrade, D.V., Resende, M.G.C., Werneck, R.F.: Fast local search for the maximum
independent set problem. J. Heuristics 18(4), 525–547 (2012)
Tomita, E., Seki, T.: An efficient branch-and-bound algorithm for finding a maximum clique. In:
Proceedings of the 4th International Conference on Discrete Mathematics and Theoretical
Computer Science, DMTCS’03. Springer-Verlag, Berlin, Heidelberg, pp. 278–289 (2003)
Tomita, E., Kameda, T.: An efficient branch-and-bound algorithm for finding a maximum clique
with computational experiments. J. Glob. Optim. 37(1), 95–111 (2007)
Tomita, E., Sutani, Y., Higashi, T., Takahashi, S., Wakatsuki, M.: A simple and faster branch-
and-bound algorithm for finding a maximum clique. In: Proceedings of the 4th International
Conference on Algorithms and Computation, WALCOM’10. Springer-Verlag, Berlin,
Heidelberg, pp. 191–203 (2010)
Batsyn, M., Goldengorin, B., Maslov, E., Pardalos, P.M.: Improvements to MCS Algorithm for
the Maximum Clique Problem. Springer Science+Business Media, New York (2013)
Sifting Edges to Accelerate the
Computation of Absolute 1-Center
in Graphs

Wei Ding1(B) and Ke Qiu2


1
Zhejiang University of Water Resources and Electric Power, Hangzhou 310018,
Zhejiang, China
dingweicumt@163.com
2
Department of Computer Science, Brock University, St. Catharines, ON, Canada
kqiu@brocku.ca

Abstract. Given an undirected connected graph G = (V, E, w), where


V is the set of n vertices, E is the set of m edges and each edge e ∈ E has
a positive weight w(e) > 0, a subset T ⊆ V of p terminals and a subset
E ⊆ E of candidate edges, the absolute 1-center problem (A1CP)
asks for a point on some edge in E to minimize the distance from it to T .
We prove that a vertex 1-center (V1C) is just an absolute 1-center (A1C)
if the all-pairs shortest paths distance matrix from the vertices covered
by the edges in E to T has a (global) saddle point. Furthermore, we
define the local saddle point of an edge and conclude that the candidate
edge having a local saddle point can be sifted. By combining the tool of
sifting edges with the framework of Kariv and Hakimi’s algorithm, we
design an O(m + pm* + np log p)-time algorithm for A1CP, where m* is
the number of the remaining candidate edges. Applying our algorithm
to the classic A1CP takes O(m + m*n + n^2 log n) time when the distance
matrix is known and O(mn + n^2 log n) time when the distance matrix is
unknown, which are smaller than the O(mn + n^2 log n) time and O(mn + n^3)
time of Kariv and Hakimi's algorithm, respectively.

Keywords: Absolute 1-center · Sifting edges · Saddle point

1 Introduction
1.1 Previous Results
Let G = (V, E, w) be an undirected connected graph, where V is the set of n
vertices, E is the set of m edges, and w : E → R+ is a positive weight function
on edges. The vertex 1-center problem (V1CP) aims to find a vertex of G,
called a vertex 1-center (V1C), to minimize the longest distance from it to all the
other vertices. The V1CP is tractable and the whole computation is dominated
by finding all-pairs shortest paths in G, which can be done by using Fredman
and Tarjan's O(mn + n^2 log n)-time algorithm [3], or the O(m*n + n^2 log n)-time
algorithm by Karger et al. [7], where m* is the number of edges used by shortest
paths, or Pettie's O(mn + n^2 log log n)-time algorithm [9].
The classic absolute 1-center problem (A1CP) asks for a point on some
edge of G, called an absolute 1-center (A1C), to minimize the longest distance
from it to all the vertices of G. The A1CP was proposed by Hakimi [4], who
showed that an A1C of a vertex-unweighted graph G must be either at one of
the n(n − 1)/2 break points or at one vertex of G. The A1CP admits polynomial-time
exact algorithms. For example, Hakimi et al. [5] presented an O(mn log n)-time
algorithm. In [8], Kariv and Hakimi first devised a linear-time subroutine to
compute a local center on every edge and then found an A1C by comparing all
the local centers. As a result, they developed an O(mn + n^2 log n)-time algorithm
when the all-pairs shortest paths distance matrix is known and an O(mn + n^3)-
time algorithm when the distance matrix is unknown. The A1CP in vertex-
unweighted trees admits a linear-time algorithm. Moreover, the A1CP in vertex-
weighted graphs admits an O(mn^2 log n)-time algorithm by Hakimi et al. [5]
and Kariv and Hakimi's O(mn log n)-time algorithm [8]. For A1CP in vertex-
weighted trees, Kariv and Hakimi designed an O(n log n)-time algorithm [8]. For
more results on A1CP, we refer readers to [2,10] and references listed therein. The
A1CP has applications in the design of minimum diameter spanning subgraphs
[1,6].

1.2 Our Results

In this paper, we consider the generalized version of the classic A1CP, formally
defined in Sect. 2, where we are asked to find an absolute 1-center from the given
subset of candidate edges with a goal of minimizing the longest distance from
the center to the given subset of terminals. Unless otherwise specified, A1CP
refers to the generalized version in the remainder of this paper.
First, we prove that a V1C is just an A1C if the all-pairs shortest paths
distance matrix from the vertices covered by candidate edges to terminals has a
(global) saddle point. Next, we introduce the definition of the local saddle point
of edge, and prove that the local center on one edge can be reduced to one of its
endpoints if the edge has no local saddle point. In other words, the candidate
edges that have a local saddle point can be sifted. Moreover, we combine the
tool of sifting edges with the framework of Kariv and Hakimi’s algorithm to
design an O(m + pm* + np log p)-time algorithm for A1CP, where m* is the
number of the remaining candidate edges. Applying our algorithm to the classic
A1CP takes O(m + m*n + n^2 log n) time when the distance matrix is known
as well as O(mn + n^2 log n) time when the matrix is unknown, which reduces the
O(mn + n^2 log n) time and O(mn + n^3) time of Kariv and Hakimi's algorithm
to some extent, respectively.
Organization. The rest of this paper is organized as follows. In Sect. 2, we define
the notations and A1CP formally. In Sect. 3, we show the fundamental properties
which form the basis of our algorithm. In Sect. 4, we present our algorithm and
apply it to the classic A1CP. In Sect. 5, we conclude this paper.
2 Definitions and Notations


Let G = (V, E, w) be an undirected connected graph with n vertices and m
edges. All the vertices are labelled 1, 2, . . . , n in sequence, and all the edges
in E are labelled 1, 2, . . . , m. For every 1 ≤ i ≤ n, we let v_i denote the vertex
with index i. For every 1 ≤ k ≤ m, we let e_k denote the edge with index k,
denote its weight by w_k, and denote by e_k^1 and e_k^2 its two endpoints. Let τ(v) be
the index of v, for any v ∈ V.
For every (i, j) ∈ {1, 2, . . . , n}^2, we denote by π*(v_i, v_j) the v_i-to-v_j shortest
path in G, and denote by d(v_i, v_j) the v_i-to-v_j shortest path distance (SPD),
i.e., d(v_i, v_j) = Σ_{e ∈ π*(v_i, v_j)} w(e). Let D = (d_{i,j})_{n×n} be the all-pairs shortest
paths distance matrix of G. Note that d_{i,j} = d(v_i, v_j) for all i, j, and so D is an n × n
symmetric matrix. Let I and J be two subsets of {1, 2, . . . , n}, and D(I, J)
be the |I| × |J| sub-matrix of D composed of the elements of D at all the
crossings of the rows with indices in I and the columns with indices in J. For
any I, J ⊆ {1, 2, . . . , n}, 2 ≤ |I|, |J| ≤ n, a (global) saddle point of D(I, J) is
referred to as an index pair (i′, j′) ∈ I × J such that

d_{i′,j} ≤ d_{i′,j′} ≤ d_{i,j′},   ∀ i ∈ I \ {j′}, j ∈ J \ {i′}.   (1)
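For illustration, a direct brute-force check of this definition can be written as follows; here d is assumed to be any doubly indexable distance structure and rows/cols play the roles of the index sets I and J (names are illustrative, and this sketch does not reflect the faster edge-wise test used later in the paper):

```python
def find_saddle_point(d, rows, cols):
    """Return an index pair (i', j') that is a saddle point of the submatrix
    D(rows, cols) in the sense of Eq. (1): d[i'][j'] is the largest entry in
    row i' (over j in cols, j != i') and the smallest entry in column j'
    (over i in rows, i != j').  Returns None if no such pair exists."""
    for ip in rows:
        for jp in cols:
            row_ok = all(d[ip][j] <= d[ip][jp] for j in cols if j != ip)
            col_ok = all(d[ip][jp] <= d[i][jp] for i in rows if i != jp)
            if row_ok and col_ok:
                return ip, jp
    return None
```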

Let T ⊆ {1, 2, . . . , n} be the index set of the given terminals and S ⊆
{1, 2, . . . , n} be the index set of the given candidate vertices. We also use T
and S to denote the set of terminals and candidate vertices, respectively. For
every i ∈ S, the distance from v_i to T, φ(v_i, T), is referred to as the maximum
of all the v_i-to-v_j SPDs, j ∈ T \ {i}, i.e.,

φ(v_i, T) = max_{j ∈ T \ {i}} d(v_i, v_j),   ∀ i ∈ S.   (2)

The vertex 1-center problem (V1CP) aims to find a vertex v_{i*}, called a
vertex 1-center (V1C), from S to minimize the value of φ(v_i, T). So,

φ(v_{i*}, T) = min_{i ∈ S} φ(v_i, T).   (3)
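As a small illustration of Eqs. (2) and (3), assuming the all-pairs distance structure d is already available, a vertex 1-center can be picked as follows (sketch only; names are illustrative):

```python
def vertex_one_center(d, S, T):
    """Return the index i* in S minimizing phi(v_i, T) = max_{j in T, j != i} d[i][j],
    as in Eqs. (2)-(3).  d is any doubly indexable distance structure."""
    def phi(i):
        return max(d[i][j] for j in T if j != i)
    return min(S, key=phi)
```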

Let P_k be the set of continuum points on edge e_k, for any 1 ≤ k ≤ m, and P(G)
be the set of continuum points on all the edges of G. For any point p ∈ P(G),
if p is a vertex then τ(p) also denotes the index of vertex p, and τ(p) is empty
otherwise. Let E ⊆ {1, 2, . . . , m} be the index set of the given candidate edges
and also let E denote the set of candidate edges. Let P_E be the set of continuum
points on all the edges in E. Clearly,

P_E = ∪_{k ∈ E} P_k.   (4)

For every point p ∈ P_E, the distance from p to T, φ(p, T), is referred to as the
maximum of all the p-to-v_j SPDs, j ∈ T \ {τ(p)}, i.e.,

φ(p, T) = max_{j ∈ T \ {τ(p)}} d(p, v_j),   ∀ p ∈ P_E.   (5)
The absolute 1-center problem (A1CP) asks for a point p*, called an absolute
1-center (A1C), from P_E that minimizes the value of φ(p, T). So,

φ(p*, T) = min_{p ∈ P_E} φ(p, T).   (6)

Let S_E be the index set of the vertices covered by the edges in E, i.e.,

S_E = {τ(e_k^1), τ(e_k^2) | k ∈ E}.   (7)

Also, S_E denotes the corresponding subset of vertices. Letting I = S_E and J = T,
we extract D(S_E, T) from D.
Let I represent an instance of A1CP as follows: given are an undirected connected
graph G = (V, E, w), a subset T ⊆ {1, 2, . . . , n} of terminals, and a subset E ⊆
{1, 2, . . . , m} of candidate edges. By substituting E = S_E into I, we get an
instance σ(I) of V1CP induced by I. Let opt(I) and opt(σ(I)) be an optimum
to I and σ(I), respectively. Specifically, we let I_k represent the special instance
of A1CP where E = {k}, 1 ≤ k ≤ m, and S_k denote the corresponding index set,
i.e., S_k = {τ(e_k^1), τ(e_k^2)}.
For every j ∈ T, the distance from P_E to v_j, ψ(v_j, P_E), is referred to as the
minimum of all the p-to-v_j SPDs over p ∈ P_E \ {v_j}. Let v_{j*} be the terminal
maximizing the value of ψ(v_j, P_E). We have

ψ(v_j, P_E) = min_{p ∈ P_E \ {v_j}} d(p, v_j),   ∀ j ∈ T,   (8)

and

ψ(v_{j*}, P_E) = max_{j ∈ T} ψ(v_j, P_E).   (9)

Similarly, ψ(v_j, P_k) = min_{p ∈ P_k \ {v_j}} d(p, v_j), ∀ k ∈ E. Let ψ(v_j, S_E) denote the
distance from S_E to v_j, and let v_{j**} be the terminal which maximizes the value of
ψ(v_j, S_E). We have

ψ(v_j, S_E) = min_{i ∈ S_E \ {j}} d(v_i, v_j),   ∀ j ∈ T,   (10)

and

ψ(v_{j**}, S_E) = max_{j ∈ T} ψ(v_j, S_E).   (11)

3 Fundamental Properties

In this section, we show several important properties which form the basis of
our algorithm.

Lemma 1. φ(v_{i*}, T) ≥ φ(p*, T).

Proof. It follows directly from Eqs. (3), (6) and S_E ⊂ P_E. □
Lemma 2. For every k ∈ E and j ∈ T, it always holds that

ψ(v_j, P_k) ≥ min{d(e_k^1, v_j), d(e_k^2, v_j) | e_k^1, e_k^2 ≠ v_j}.   (12)

Lemma 3. ψ(v_{j**}, S_E) = ψ(v_{j*}, P_E).

Lemma 4. φ(p*, T) ≥ ψ(v_{j*}, P_E).

Lemma 5. ψ(v_{j**}, S_E) = φ(v_{i*}, T) iff D(S_E, T) has a saddle point.

Theorem 1. If D(S_E, T) has a saddle point, then φ(p*, T) = φ(v_{i*}, T).

Proof. When D(S_E, T) has a saddle point, we conclude from Lemmas 3, 4 and 5 that

φ(v_{i*}, T) = ψ(v_{j**}, S_E) = ψ(v_{j*}, P_E) ≤ φ(p*, T).

Together with Lemma 1, we get φ(p*, T) = φ(v_{i*}, T). □

Immediately, we obtain the following corollary.

Corollary 1. Given an instance I of A1CP, an optimum to the induced instance
σ(I) of V1CP is also an optimum to I provided that D(S_E, T) has a (global)
saddle point.

4 A Faster Algorithm for A1CP


In this section, we first introduce a new definition, local saddle point of an edge,
and then employ the tool of sifting edges that have a local saddle point to speed
up Kariv and Hakimi’s algorithm [8] for the classic A1CP.

4.1 Sifting Edges


For every k ∈ E, the indices of the two endpoints of edge e_k are τ(e_k^1) and τ(e_k^2).
We extract the τ(e_k^1)-th and τ(e_k^2)-th rows of D(S_E, T) to obtain a 2 × |T| sub-
matrix D(S_k, T). A local saddle point of e_k is referred to as the endpoint of e_k
with index i_k such that there exists an index pair (i_k, j_k) ∈ S_k × T satisfying

d_{i_k,j} ≤ d_{i_k,j_k} ≤ d_{i,j_k},   ∀ i ∈ S_k \ {j_k}, j ∈ T \ {i_k}.   (13)

Let E_0 denote the index set of the candidate edges having no local saddle
point and E_1 denote the index set of those having a local saddle point. Clearly,
E = E_0 ∪ E_1.
By Eq. (13), we obtain the following lemma.

Lemma 6. For every k ∈ E, e_k has a local saddle point iff D(S_k, T) has a
saddle point.

Theorem 2. Given an instance I of A1CP, if e_k, k ∈ E, has a local saddle
point, then e_k can be replaced with its two endpoints, e_k^1 and e_k^2.
By Theorem 2, we only need to select e_k^1 and e_k^2 as the candidate vertices of the
A1C when e_k, k ∈ E, has a local saddle point. So, Corollary 2 follows.

Corollary 2. Given an instance I of A1CP and for every k ∈ E, an optimum


to the induced instance σ(Ik ) of V1CP is also an optimum to Ik provided that
ek has a local saddle point.

Furthermore, if ek , k ∈ E, has a local saddle point for all k ∈ E, then we only


need to select the endpoints of all the candidate edges as the candidate vertices
of A1C. So, Corollary 3 follows.

Corollary 3. Given an instance I of A1CP, an optimum to the induced instance


σ(I) of V1CP is also an optimum to I provided that every ek , k ∈ E, has a local
saddle point.

4.2 The Algorithm

It is well known that Kariv and Hakimi’s algorithm is the most popular one for
the classic A1CP. The fundamental framework of their algorithm is based on the
result that there is surely a classic A1C in the union of all the vertices of the
graph and the local centers of all the edges. By Corollaries 2 and 3, we claim for
A1CP that those candidate edges having a local saddle point can be sifted but
only their endpoints remain as the candidate vertices of A1C. In other words,
the edges in E1 can be omitted. In this subsection, we combine the framework of
Kariv and Hakimi’s algorithm with the tool of sifting candidate edges to design
a fast algorithm, named AlgA1CP, for A1CP, which consists of three stages.
Algorithm (AlgA1CP) for A1CP:
Input: an instance I of A1CP and the distance matrix D;
Output: an A1C, p*.
01: Use DFS to traverse G to get S_E, compute φ(v_i, T)
02: and record j_i* ← arg max_{j∈T} d(v_i, v_j) for each i ∈ S_E;
03: i* ← arg min_{i∈S_E} φ(v_i, T); // (the index of a V1C)
04: E_0 ← ∅; S_{E_0} ← ∅;
05: for every k ∈ E do
06: Determine whether or not e_k has a local saddle point;
07: if e_k has no local saddle point then
08: E_0 ← E_0 ∪ {k}; S_{E_0} ← S_{E_0} ∪ {τ(e_k^1), τ(e_k^2)};
09: endif
10: endfor
11: if E_0 = ∅ then
12: Return v_{i*}; // (i.e., p* = v_{i*})
13: else
14: for every i ∈ S_{E_0} do
15: Sort d(v_i, v_j), j ∈ T, in nonincreasing order;
16: endfor
17: for every k ∈ E_0 do
18: Use Kariv and Hakimi's subroutine [8] to find
19: a local center p_k* on e_k, and record φ(p_k*, T);
20: endfor
21: k* ← arg min_{k∈E_0} φ(p_k*, T);
22: if φ(v_{i*}, T) ≤ φ(p_{k*}*, T) then Return v_{i*};
23: else Return p_{k*}*; endif // (i.e., p* = p_{k*}*)
24: endif

First of all, we recall and introduce some useful definitions and notations. For
every 1 ≤ k ≤ m, a local center on e_k, p_k*, is referred to as a point on e_k such that
the value of φ(p, T) is minimized, i.e., φ(p_k*, T) = min_{p∈P_k} φ(p, T). An optimal
local center, p_{k*}*, is referred to as the one that minimizes the value of φ(p_k*, T),
i.e., φ(p_{k*}*, T) = min_{k∈E_0} φ(p_k*, T). Note that k* is the index of the candidate
edge in E_0 that has an optimal local center. Moreover, for every i ∈ S_{E_0}, we
let j_i* be the index of the terminal that maximizes the value of d(v_i, v_j), i.e.,
d(v_i, v_{j_i*}) = max_{j∈T} d(v_i, v_j).
In the first stage (lines 1–3 in AlgA1CP), it takes O(m) time to obtain S_E
by using depth-first search (DFS) to traverse G. When the distance matrix D is
known, it takes O(|T|) time to compute φ(v_i, T) (i.e., find the maximum element
in the i-th row of D) and record the column index j_i*, for every i ∈ S_E. Then,
it takes O(|S_E|) time to find a V1C, i.e., the vertex index i* such that the value
of φ(v_i, T) is minimized. So, the time cost of the first stage is O(m + |S_E||T|).
In the second stage (lines 4–10 in AlgA1CP), the major task is to record
E_0 and S_{E_0}. So, we need to determine whether or not e_k has a local saddle
point, for every k ∈ E. It is implied by Lemma 6 that this can be done by
determining whether or not D(S_k, T) has a saddle point. In detail, we
determine that D(S_k, T) has a saddle point if d(v_{τ(e_k^1)}, v_{j*_{τ(e_k^1)}}) ≤ d(v_{τ(e_k^2)}, v_{j*_{τ(e_k^1)}})
or d(v_{τ(e_k^2)}, v_{j*_{τ(e_k^2)}}) ≤ d(v_{τ(e_k^1)}, v_{j*_{τ(e_k^2)}}), and that it has no saddle point otherwise. Such
a decision takes O(1) time. Accordingly, the update of E_0 and S_{E_0} also takes
O(1) time. So, the time cost of the second stage is O(|E|).
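A sketch of this stage in Python is shown below; d is the distance structure, jstar[i] is the farthest-terminal index j_i* recorded in the first stage, and endpoints(k) returns the vertex indices τ(e_k^1), τ(e_k^2) of candidate edge k (all names are illustrative assumptions, not part of the paper's notation):

```python
def sift_candidate_edges(d, jstar, candidate_edges, endpoints):
    """Stage-2 sketch: keep only the candidate edges without a local saddle
    point (the set E_0) and collect their endpoint indices (S_{E_0}).
    An edge is sifted when d[i1][jstar[i1]] <= d[i2][jstar[i1]] or
    d[i2][jstar[i2]] <= d[i1][jstar[i2]], i.e. D(S_k, T) has a saddle point."""
    E0, S0 = [], set()
    for k in candidate_edges:
        i1, i2 = endpoints(k)
        has_saddle = (d[i1][jstar[i1]] <= d[i2][jstar[i1]]
                      or d[i2][jstar[i2]] <= d[i1][jstar[i2]])
        if not has_saddle:            # edge survives sifting
            E0.append(k)
            S0.update((i1, i2))
    return E0, S0
```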
In the third stage (lines 11–24 in AlgA1CP), we find a local center p_k* on e_k
for every k ∈ E_0 and then determine the optimal local center p_{k*}*. By comparing
φ(v_{i*}, T) and φ(p_{k*}*, T), we obtain an A1C, p*. The main body of this stage is
to compute p_{k*}*. First, it takes O(|S_{E_0}||T| log |T|) time to sort d(v_i, v_j), j ∈ T, for
all i ∈ S_{E_0}. Next, it takes O(|T|) time to apply Kariv and Hakimi's subroutine
to the candidate edge e_k to get p_k*, for every k ∈ E_0. So, it takes O(|E_0||T|) time
to compute all p_k*, k ∈ E_0, and then determine p_{k*}*. So, the time cost of the third
stage is O(|E_0||T| + |S_{E_0}||T| log |T|).
The above three stages form our algorithm AlgA1CP for A1CP. Therefore,
the total time cost of AlgA1CP is

O(m + (|S_E| + |E_0|)|T| + |S_{E_0}||T| log |T|).   (14)


Since |S_E| ≤ n and |S_{E_0}| ≤ min{2|E_0|, n} ≤ n, we conclude that the total time
cost of AlgA1CP is at most

O(m + |E_0||T| + n|T| log |T|).   (15)

Theorem 3. Given an instance I of A1CP with G having n vertices and m
edges and |T| = p, AlgA1CP can find an A1C in at most O(m + pm* + np log p)
time, where m* is the number of candidate edges having no local saddle point,
when the distance matrix is known.

Now, we consider the special case of A1CP where all the candidate edges
have a local saddle point, i.e., E_0 = ∅. The time cost of applying AlgA1CP to this
special case is obtained by substituting |S_{E_0}| = |E_0| = 0 and |S_E| ≤ n into Eq.
(14), see Corollary 4.

Corollary 4. Given an instance I of A1CP with G having n vertices and m
edges and |T| = p, AlgA1CP can find an A1C in at most O(m + pn) time,
provided that all the candidate edges have a local saddle point, when the distance
matrix is known.

4.3 Application

The classic A1CP is the special case of A1CP studied in this paper, where
T = {1, 2, . . . , n} and E = {1, 2, . . . , m}. Therefore, when the distance matrix is
known, we substitute p = n into Theorem 3 to obtain the time cost of apply-
ing AlgA1CP to the classic A1CP, see Theorem 4. Moreover, when the distance
matrix is unknown, we additionally use Pettie's O(mn + n^2 log log n)-time algo-
rithm [9] to get the matrix.

Theorem 4. Given an undirected connected graph G = (V, E, w) with n vertices
and m edges, AlgA1CP can find a classic A1C in at most O(m + m*n + n^2 log n)
time when the distance matrix is known, as well as in at most O(mn + n^2 log n)
time when the distance matrix is unknown, where m* is the number of edges
having no local saddle point.

Next, we consider the special case of the classic A1CP where all the edges
have a local saddle point. We substitute p = n into Corollary 4 to obtain the
time cost of applying AlgA1CP to this special case. Similarly, when the distance
matrix is unknown, we use Pettie's algorithm to get it.

Corollary 5. Given an undirected connected graph G = (V, E, w) with n ver-
tices and m edges, AlgA1CP can find a classic A1C in at most O(n^2) time when
the distance matrix is known, as well as in at most O(mn + n^2 log log n) time
when the distance matrix is unknown, provided that all the edges have a local
saddle point.
5 Conclusions
This paper studies the (generalized) A1CP in an undirected connected graph.
We examine an important property: if the distance matrix has a saddle point,
then all the candidate edges can be sifted and only their endpoints remain as
the candidate vertices of an A1C (i.e., a V1C is just an A1C), and we further conclude
that every candidate edge having a local saddle point can be sifted. Based on
this property, we combine the tool of sifting edges with Kariv and Hakimi's
subroutine for finding a local center of an edge to design a faster exact algorithm for
the classic A1CP, which reduces the O(mn + n^2 log n) time of Kariv and Hakimi's
algorithm to O(m + m*n + n^2 log n) time, where m* is the number of edges
that have no local saddle point, when the distance matrix is known, as well as its
O(mn + n^3) time to O(mn + n^2 log n) time when the matrix is unknown,
respectively.
In this paper, we determine separately whether or not every candidate edge
can be sifted (i.e., has a local saddle point). This is a straightforward way. It
remains as a future research topic how to find a faster method of sifting candidate
edges to a larger extent.

References
1. Ding, W., Qiu, K.: Algorithms for the minimum diameter terminal steiner tree
problem. J. Comb. Optim. 28(4), 837–853 (2014)
2. Eiselt, H.A., Marianov, V.: Foundations of Location Analysis. Springer, Heidelberg
(2011)
3. Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network
optimization algorithms. J. ACM 34(3), 596–615 (1987)
4. Hakimi, S.L.: Optimum locations of switching centers and the absolute centers and
medians of a graph. Oper. Res. 12(3), 450–459 (1964)
5. Hakimi, S.L., Schmeichel, E.F., Pierce, J.G.: On p-centers in networks. Transport.
Sci. 12(1), 1–15 (1978)
6. Hassin, R., Tamir, A.: On the minimum diameter spanning tree problem. Info.
Proc. Lett. 53(2), 109–111 (1995)
7. Karger, D.R., Koller, D., Phillips, S.J.: Finding the hidden path: time bounds for
all-pairs shortest paths. SIAM J. Comput. 22(6), 1199–1217 (1993)
8. Kariv, O., Hakimi, S.L.: An algorithmic approach to network location problems.
I: the p-centers. SIAM J. Appl. Math. 37(3), 513–538 (1979)
9. Pettie, S.: A new approach to all-pairs shortest paths on real-weighted graphs.
Theor. Comp. Sci. 312(1), 47–74 (2004)
10. Tansel, B.C., Francis, R.L., Lowe, T.J.: Location on networks: a survey. Part I: the
p-center and p-median problems. Manag. Sci. 29(4), 482–497 (1983)
Solving an MINLP with Chance
Constraint Using a Zhang’s Copula
Family

Adriano Delfino

UTFPR - Universidade Tecnológica Federal do Paraná, Pato Branco, Brazil


delfino@utfpr.edu.br

Abstract. In this work we describe an approach to solving chance-
constrained programs with mixed-integer variables. We replace the
hard chance constraint by a copula. We prove that Zhang's
copula family satisfies the properties required by the outer-approximation
algorithm, and we use this algorithm to solve the problem with promising results.
and we use this algorithm to solve this problem with promising results.

Keywords: Mixed-integer programming ·


Chance-constrained programming · Copula

1 Introduction
In recent years, the stochastic programming community has witnessed a
great development in optimization methods for dealing with stochastic programs
with mixed-integer variables [2]. However, there are only a few works on chance-
constrained programming with mixed-integer variables [1,3,12,13]. In this work,
the problems of interest are nonsmooth convex mixed-integer nonlinear
programs with chance constraints (CCMINLP). This class of problems can,
for instance, be solved by employing the outer-approximation (OA) technique. In
general, OA algorithms require solving fewer MILP subproblems than extended
cutting-plane algorithms [14], therefore the former class of methods is prefer-
able to the latter one. This justifies why we have chosen the former class of
methods to deal with problems of the type

min_{(x,y) ∈ X×Y} f_0(x, y)
s.t. f_i(x, y) ≤ 0, i = 1, . . . , m_f − 1                    (1)
     P[g(x, y) ≥ ξ] ≥ p,

where the functions f_i : R^{n_x} × R^{n_y} → R, i = 0, . . . , m_f − 1, are convex but
possibly nonsmooth, g : R^{n_x} × R^{n_y} → R^m is a concave function, X ⊂ R^{n_x} is
a polyhedron, Y ⊂ Z^{n_y} contains only integer variables, and both X and Y are
compact sets. Furthermore, ξ ∈ R^m is the random vector, P is the probability
measure associated with the random vector ξ, and p ∈ (0, 1) is a given parameter.
We assume that P is a 0-concave distribution (thus P is α-concave for all
α ≤ 0). Some examples of distribution functions that satisfy the 0-concavity
property are the well-known multidimensional Normal, Log-normal, Gamma and
Dirichlet distributions [9]. Under these assumptions, the following function is
convex:

f_{m_f}(x, y) = log(p) − log(P[g(x, y) ≥ ξ]).                    (2)

As a result, (1) is a convex (but possibly nonsmooth) MINLP problem fitting the
usual notation:

f_min := min_{(x,y) ∈ X×Y} f_0(x, y)   s.t.   f_i(x, y) ≤ 0, i ∈ I_c := {1, . . . , m_f}.   (3)

Due to the probability function P[g(x, y) ≥ ξ], evaluating the constraint (2)
and computing its subgradient is a difficult task: for instance, if P follows a
multivariate normal distribution, computing a subgradient of P[g(x, y) ≥ ξ]
requires numerically solving m integrals of dimension m − 1. If the dimension
m of ξ is too large, then creating a cut for the function log(p) − log(P[g(x, y) ≥ ξ])
is computationally challenging. In this situation, it makes sense to replace the
probability measure by a simpler function.
In this manner, this work proposes to approximate the hard chance constraint
P[g(x, y) ≥ ξ] ≥ p by a copula C:

C(F_{ξ_1}(g_1(x, y)), F_{ξ_2}(g_2(x, y)), . . . , F_{ξ_m}(g_m(x, y))) ≥ p.                    (4)

In addition to the difficulties present in MINLP models, we recall that the con-
straint function (4) can be nondifferentiable. Our main contribution in this paper
is to prove that Zhang's copula family is log-concave; with this result we obtain
an approximation of Problem (3) that can be solved in reasonable time.
This work is organized as follows: in Sect. 2 we briefly review some basics about
copulae and prove that Zhang's family satisfies the conditions required to use the
outer-approximation algorithm developed in [4]. In Sect. 3, a review of outer
approximation is presented. In Sect. 4, we describe a test problem coming from
power energy management, in Sect. 5 we present some preliminary numerical
results for this problem and, finally, in Sect. 6 we give a short conclusion.

2 Copulae: A Bird’s Eye View

When dealing with chance-constrained programs it is very often impossible to
obtain an explicit formula for the probability measure P because the joint distri-
bution of ξ is unknown. However, it is easier to estimate the marginal
distributions F_{ξ_1}, . . . , F_{ξ_m}. In what follows, the random variable ξ ∈ IR^m will be
supposed to have known marginal distributions F_{ξ_1}, . . . , F_{ξ_m}. In order to model
the dependence among these marginals, a copula function will be employed. The
concept of copula was introduced by Sklar [11] in 1959, when he was studying
the relationship between a multidimensional probability function and its lower
dimensional marginals.

Definition 1. An m−dimensional copula is a function C : [0, 1]m → [0, 1] that


satisfies the following properties: (i) C(1, . . . , 1, u, 1, . . . , 1) = u ∀u ∈ [0, 1],
(ii) C(u1 , . . . , ui−1 , 0, ui+1 , . . . , um ) = 0 and (iii) C is quasi monotone on [0, 1]m .
In other words, the above definition means that C is an m−dimensional distri-
bution function with all univariate marginals being uniform in the interval [0, 1].
The item (iii) means that the C−volume of any box in [0, 1]m is nonnegative (see
[8] for more details).
Given a random vector ξ with known marginals F_{ξ_i}, i = 1, . . . , m, an important
tool proved by Sklar [11] is a theorem that assures the existence of a copula that
represents the cumulative distribution F_ξ.

Theorem 1. Let F_ξ be an m-dimensional distribution function with marginals
F_{ξ_1}, F_{ξ_2}, . . . , F_{ξ_m}. Then there exists an m-dimensional copula C such that for all
z ∈ IR^m,

F_ξ(z_1, z_2, . . . , z_m) = C(F_{ξ_1}(z_1), F_{ξ_2}(z_2), . . . , F_{ξ_m}(z_m)).          (5)

If F_{ξ_i}, i = 1, . . . , m, are continuous, then C is unique. Otherwise, C is uniquely
determined on the image of F_ξ. Conversely, if C is a copula and F_{ξ_1}, . . . , F_{ξ_m} are
distribution functions, then the function F_ξ defined by (5) is an m-dimensional
distribution function with marginals F_{ξ_1}, . . . , F_{ξ_m}.
In the above theorem, the functions F_{ξ_i}, i = 1, . . . , m, can be different. This is a prop-
erty of particular interest in energy problems where power can be
produced by several renewable (and uncertain) sources of different nature: the
probability distribution governing the amount of water arriving into a reservoir
(of a hydro plant) can be very different from the probability distribution of the
wind speed near a wind farm. Observe that this theorem is not constructive;
it just ensures the existence of a copula associated to the distribution F_ξ(z). In
most cases, a copula providing the equality

C(Fξ1 (z1 ), . . . , Fξm (zm )) = Fξ (z)

must be estimated. The problem of choosing/estimating a suitable copula has
been receiving much attention from the statistical community in the last few
years [8]. As shown in the book [8], there are many copulae in the literature.
By applying the logarithm to inequality (4), the following function is obtained:

f_{m_f}(x, y) = log(p) − log C(F_{ξ_1}(g_1(x, y)), F_{ξ_2}(g_2(x, y)), . . . , F_{ξ_m}(g_m(x, y))),          (6)

where F_{ξ_i} is the marginal probability distribution of F_ξ(z) = P[z ≥ ξ], which
is assumed to be known. The function given by (6) is well defined by Sklar's
theorem [Theorem 1]. If C is 0−concave, then (3) can be approximated by the
convex MINLP (which can be nonsmooth)
min_{(x,y)∈X×Y} f_0(x, y)
s.t. f_i(x, y) ≤ 0, i = 1, . . . , m_f − 1,
     log(p) − log(C(F_{ξ_1}(g_1(x, y)), F_{ξ_2}(g_2(x, y)), . . . , F_{ξ_m}(g_m(x, y)))) ≤ 0.          (7)

There is a family of copulae with this property, introduced by Zhang [15]. The
family is given by

C(u_1, . . . , u_m) = ∏_{j=1}^{r} min_{1≤i≤m} (u_i^{a_{j,i}}),          (8)

where a_{j,i} ≥ 0 and ∑_{j=1}^{r} a_{j,i} = 1 for all i = 1, . . . , m. Different choices of the
parameters a_{j,i} give different copulae, all of them nonsmooth functions, but
with subgradients easily computed via the chain rule. We prove below that this
family of copulae is log-concave.
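To make (8) concrete, the following Python sketch evaluates Zhang's copula and the resulting constraint value in (6). It is only an illustration: the normal marginals, the identity map used for g and all numerical values are hypothetical choices of ours, not data from the paper.

import numpy as np
from scipy.stats import norm

def zhang_copula(u, a):
    """Zhang's copula (8): C(u_1,...,u_m) = prod_{j=1}^r min_i u_i^{a_{j,i}}.
    u: length-m array of marginal values in [0, 1];
    a: (r, m) array with a[j, i] >= 0 and each column summing to 1."""
    u = np.asarray(u, dtype=float)
    a = np.asarray(a, dtype=float)
    # For each row j, take the smallest u_i^{a_{j,i}} over i, then multiply over j.
    return np.prod(np.min(u[None, :] ** a, axis=1))

def copula_constraint(point, g, marginals, a, p):
    """Value of the approximated chance constraint (6):
    log(p) - log C(F_1(g_1), ..., F_m(g_m)); the point is feasible when this is <= 0."""
    u = np.array([F.cdf(gi) for F, gi in zip(marginals, g(point))])
    return np.log(p) - np.log(zhang_copula(u, a))

# Hypothetical example: m = 3 normal marginals, r = 2 rows of weights.
marginals = [norm(loc=10.0, scale=2.0) for _ in range(3)]
a = np.array([[0.4, 0.7, 0.2],
              [0.6, 0.3, 0.8]])   # each column sums to 1
g = lambda z: z                   # placeholder for g(x, y)
print(copula_constraint(np.array([12.0, 11.0, 13.0]), g, marginals, a, p=0.8))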
Theorem 2. Let ξ ∈ IR^m be a random vector with all marginals F_{ξ_i}, i =
1, . . . , m, being 0-concave functions. Suppose that g : IR^{n_x} × IR^{n_y} → IR^m is
a concave function. Consider a Zhang's copula C given in (8) for any choice of
parameters a_{j,i} satisfying a_{j,i} ≥ 0 and ∑_{j=1}^{r} a_{j,i} = 1 for all i = 1, . . . , m. Then

C(F_{ξ_1}(g_1(x, y)), F_{ξ_2}(g_2(x, y)), . . . , F_{ξ_m}(g_m(x, y)))

is α-concave for all α ≤ 0.

Proof. Given a pair (x, y) ∈ IRnx ×IRny we set z = (x, y) to simplify the notation.
Let z1 = (x1 , y1 ), z2 = (x2 , y2 ) and z = λz1 + (1 − λ)z2 with λ ∈ [0, 1]. As the
function g is concave, then for all i = 1, . . . , m

gi (λz1 + (1 − λ)z2 ) ≥ λgi (z1 ) + (1 − λ)gi (z2 ). (9)

As Fξi , i = 1, . . . , m, are increasing functions, by applying Fξi to inequality (9)


we get
Fξi (gi (λz1 + (1 − λ)z2 )) ≥ Fξi (λgi (z1 ) + (1 − λ)gi (z2 )). (10)
By applying log to the above inequality, we obtain

log(Fξi (gi (λz1 + (1 − λ)z2 ))) ≥ log(Fξi (λgi (z1 ) + (1 − λ)gi (z2 ))). (11)

Functions Fξi are 0−concave by hypothesis. Then

log(Fξi (λgi (z1 ) + (1 − λ)gi (z2 ))) ≥ λ log(Fξi (gi (z1 ))) + (1 − λ) log(Fξi (gi (z2 ))).
(12)
By gathering inequality (11) and (12) we have

log(Fξi (gi (z))) ≥ log(λFξi (gi (z1 )) + (1 − λ)Fξi (gi (z2 )))
(13)
≥ λ log(Fξi (gi (z1 ))) + (1 − λ) log(Fξi (gi (z2 ))).

Zhang's copula evaluated at the point z = λz_1 + (1 − λ)z_2 is

C(F_{ξ_1}(g_1(z)), . . . , F_{ξ_m}(g_m(z))) = ∏_{j=1}^{r} min_{1≤i≤m} [F_{ξ_i}(g_i(λz_1 + (1 − λ)z_2))]^{a_{j,i}},

where a_{j,i} ≥ 0. To simplify the notation, we write F_{ξ_1}(g_1(z)), . . . , F_{ξ_m}(g_m(z)) as F_ξ(g(z)). So,

log C(F_ξ(g(z))) = ∑_{j=1}^{r} log (min_{1≤i≤m} [F_{ξ_i}(g_i(z))]^{a_{j,i}}).

As the log function is increasing, log min u = min log u, and therefore

log C(F_ξ(g(z))) = ∑_{j=1}^{r} min_{1≤i≤m} [log (F_{ξ_i}(g_i(z)))^{a_{j,i}}].

As a_{j,i} ≥ 0, the equality becomes

log C(F_ξ(g(z))) = ∑_{j=1}^{r} min_{1≤i≤m} [a_{j,i} log (F_{ξ_i}(g_i(z)))].

By using (13) in the above equality we get

log C(F_ξ(g(z))) ≥ ∑_{j=1}^{r} min_{1≤i≤m} a_{j,i} [λ log(F_{ξ_i}(g_i(z_1))) + (1 − λ) log(F_{ξ_i}(g_i(z_2)))].

The right-most side of the above inequality is greater than or equal to

∑_{j=1}^{r} min_{1≤i≤m} a_{j,i} [λ log(F_{ξ_i}(g_i(z_1)))] + ∑_{j=1}^{r} min_{1≤i≤m} a_{j,i} [(1 − λ) log(F_{ξ_i}(g_i(z_2)))]
= λ ∑_{j=1}^{r} min_{1≤i≤m} [log(F_{ξ_i}(g_i(z_1)))^{a_{j,i}}] + (1 − λ) ∑_{j=1}^{r} min_{1≤i≤m} [log(F_{ξ_i}(g_i(z_2)))^{a_{j,i}}]
= λ [log ∏_{j=1}^{r} min_{1≤i≤m} (F_{ξ_i}(g_i(z_1)))^{a_{j,i}}] + (1 − λ) [log ∏_{j=1}^{r} min_{1≤i≤m} (F_{ξ_i}(g_i(z_2)))^{a_{j,i}}]
= λ log C(F_ξ(g(z_1))) + (1 − λ) log C(F_ξ(g(z_2))).

It was then demonstrated that

log C(Fξ (g(λz1 + (1 − λ)z2 ))) ≥ λ log C(Fξ (g(z1 ))) + (1 − λ) log C(Fξ (g(z2 ))),

i.e., log C(F_{ξ_1}(g_1(z)), . . . , F_{ξ_m}(g_m(z))) is a concave function. In other words,
the copula C is α-concave for all α ≤ 0.

3 Outer Approximation

An important method for solving (3) is the outer-approximation algorithm given
in [5] and further extended in [7]. The method solves a sequence of nonlinear
and mixed-integer linear subproblems, as described below. At iteration k, the method
fixes the integer variable y^k and tries to solve, in the continuous variable x, the
following subproblem:

min_{x∈X} f_0(x, y^k)   s.t.   f_i(x, y^k) ≤ 0, i ∈ I_c.          (14)

If this subproblem is feasible, a feasible point of problem (3) is found and,
therefore, an upper bound f_up^k for its optimal value. On the other hand, if (14)
is infeasible, the method solves the feasibility subproblem:

min_{x∈X, u≥0} u   s.t.   f_i(x, y^k) ≤ u, i ∈ I_c.          (15)

With a solution of (14) or (15), the linearization of f_0 and f_i can be used to
approximate problem (3) by the following MILP:

f_low^k :=  min_{r,x,y}  r
            s.t.  r < f_up^k
                  f_0(x^j, y^j) + ⟨s_0, (x − x^j, y − y^j)⟩ ≤ r,   j ∈ T^k
                  f_i(x^j, y^j) + ⟨s_i, (x − x^j, y − y^j)⟩ ≤ 0,   j ∈ T^k ∪ S^k, i ∈ I_c
                  r ∈ IR, x ∈ X, y ∈ Y,          (16)

where s_i, i = 0, 1, . . . , m_f, are subgradients at the point (x^j, y^j), and the index sets T^k
and S^k are defined as T^k := {j ≤ k : subproblem (14) was feasible at iteration j}
and S^k := {j ≤ k : subproblem (14) was infeasible at iteration j}. Under con-
vexity and differentiability (in this case, the subdifferential is a singleton which
contains the gradient vector) of the underlying functions, the optimal value f_low^k of
(16) is a lower bound on the optimal value of (3). Moreover, the y-part of the solution
of (16) is the next integer iterate y^{k+1}. The algorithm stops when the difference
between the upper and lower bounds provided respectively by (14) and (16) is within
a given tolerance ε > 0.
The outer-approximation algorithm was revisited in 1992 in [10], where the
authors proposed an LP/NLP-based branch-and-bound strategy in which
the explicit solution of a MILP master problem is avoided at each major iter-
ation k. In our case, the underlying functions might not be differentiable, but only
subdifferentiable. As pointed out in [6], replacing gradients by subgradients in
the classic OA algorithm entails a serious issue: the OA algorithm is not conver-
gent if the differentiability assumption is removed. In order to have a convergent
OA algorithm for nonsmooth convex MINLP, one needs to compute lineariza-
tions (cuts) in (16) by using subgradients that satisfy the KKT system of either
subproblem (14) or (15); see [4,6].
Computing solutions and subgradients satisfying the KKT conditions of the
nonsmooth subproblems is not a trivial task. For instance, the Kelley cutting-
plane method and subgradient methods for nonsmooth convex optimization prob-
lems are ensured to find an optimal solution, but are not ensured to provide a
subgradient satisfying the KKT system. Given an optimal solution x^k of (14),
there might be infinitely many subgradients of f at x^k if f is nonsmooth. How
can a specific subgradient be chosen in order to satisfy the underlying
KKT system? The answer to this question was given by the authors of [4],
who proposed a regularized outer-approximation method capable
of solving nonsmooth MINLP problems.
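As an illustration of the alternation between the subproblems (14)-(15) and the MILP master problem (16), the following Python skeleton sketches the OA loop. It is schematic only: the solve_* callbacks are placeholders for the NLP/MILP solvers and for the KKT-compatible subgradient computation of [4], not an implementation of the regularized method itself.

def outer_approximation(solve_nlp_fixed_y, solve_feasibility, solve_milp_master,
                        y0, tol=1e-4, max_iter=50):
    """Schematic OA loop for problem (3).  The callbacks must return objects with
    .feasible, .value, .x and .subgradients; the subgradients are assumed to satisfy
    the KKT system of (14)/(15), as required in [4,6]."""
    y, cuts = y0, []
    f_up, f_low, incumbent = float("inf"), float("-inf"), None
    for _ in range(max_iter):
        sol = solve_nlp_fixed_y(y)                  # NLP subproblem (14) with y fixed
        if sol.feasible:
            if sol.value < f_up:
                f_up, incumbent = sol.value, (sol.x, y)
        else:
            sol = solve_feasibility(y)              # feasibility subproblem (15)
        cuts.append((sol.x, y, sol.subgradients))   # linearization data for the cuts in (16)
        f_low, y = solve_milp_master(cuts, f_up)    # MILP master problem (16)
        if f_up - f_low <= tol:                     # upper and lower bounds close enough
            break
    return incumbent, f_low, f_up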

4 A Power System Management Problem


In this section we consider a power system management model similar to the one
given in [1]. The model consists of planning (over a horizon of a few hours) a small
system composed of a hydro plant and a wind farm. Our model is constructed
with realistic data from the Brazilian energy system.

4.1 Problem’s Description

Consider a power management model consisting of a hydro power plant and a
wind farm. The electricity generated by both units has two purposes: first, to
attend the local community power demand, and second, to sell the leftover on
the market. The energy generated by the wind farm is designated to
supply the local community demand only. If it is not enough, then the remain-
ing demand is covered by the hydro power plant. The residual energy portion
generated by the hydro power plant is then sold to the market with the aim of
maximizing the profit, which varies according to the given energy price. Since the
intention is to consider a short planning period (e.g. one day), the assump-
tion is that the only uncertainty in this energy planning problem comes from the
wind power production. As a result, the approach considers the inflow to the
hydro plant, market prices and energy demand as known parameters. The hydro
plant has a reservoir that can be used to store water and adapt the water
release strategy to better achieve profit according to the price and demand: the
price of electricity varies during the day, thus it is convenient to store water (if
possible) to generate electricity at the moments of the day deemed more profitable.
In order to exclude production strategies that may be optimal over a short
period of time but harm the near-future energy supply (e.g. the planner may
be willing to use all the water in the reservoir to maximize profit while energy
prices are high, so that in the next hour there is not enough water to produce
energy in case the wind farm fails to supply the local community, leading to
blackouts), a level constraint is imposed on the final water level in the hydro
reservoir, i.e. it cannot be lower than a certain level l*.
The decision variables of the problem are the leftover energy to supply the
local community and the residual energy to be sold to the market (both gen-
erated by the hydro power plant). Since the main purpose of the problem is to
maximize the profit for the power plant owner, the objective function is
profit maximization. Some of the constraints of this problem are simple bounds
on the water release, which are given by the operational limits of the turbine (usually
provided by the manufacturer), lower and upper bounds on the hydro reservoir fill-
ing level, and demand satisfaction. As in the paper [1], the demand satisfaction
constraint will be dealt with in a probabilistic manner: random constraints in
which a decision has to be taken prior to the observation of the random variable
are not well-defined in the context of an optimization problem. This motivates
the formulation of a corresponding probabilistic constraint in which a decision
is defined to be feasible if the underlying random constraint is satisfied under
this decision at least with a certain specified probability p.
A further characteristic of this model is the presence of binary decision variables.
These variables are needed because turbines cannot be operated at an arbi-
trary level: they are either off or on (working at a positive level). Such on/off
constraints are easily modeled by binary variables. By discretizing the time hori-
zon (one day) into T intervals (hours), the resulting optimization problem is
described below:

T
maxx,y,z t=1 πt zt
s.t. P [xt + ξt ≥ dt ∀t = 1, . . . , T ] ≥ p
yt v ≤ xt + zt ≤ yt v̄ ∀t = 1, . . . , T
xt , zt ≥ 0, yt ∈ {0, 1} ∀t = 1, . . . , T (17)
t ¯
l ≤ l0 + tω − χ τ =1 (xτ + zτ ) ≤ l ∀t = 1, . . . , T
1
t
l0 + T ω − χ1 τ =1 (xτ + zτ ) ≥ l∗ ,

where z_t is the residual energy produced by the hydro power plant in
time interval t and sold to the market, π_t is the energy price at time t, x_t is
the amount of energy generated by the hydro power plant to supply the remaining
demand of the local community at time t, d_t is the local community demand at
time t, which is assumed to be known (due to the short planning horizon of one
day), ξ_t is the random energy generated by the wind farm at time t, p ∈ (0, 1]
is the given parameter ensuring the confidence level for demand satisfaction, v̲
and v̄ are, respectively, the lower and upper operational limits of the hydro power
plant turbine, y_t is the binary variable modeling the turbine being on or off, l_0 is
the initial water level of the hydro power plant reservoir at the beginning of the
horizon, l̲ and l̄ are, respectively, the lower and upper water levels allowed in the hydro
power plant reservoir at any time, ω denotes the constant amount of water inflow
to the hydro power plant reservoir at any time t, χ represents a conversion factor
between the released water and the energy produced by the turbine (one unit of
water released corresponds to χ units of energy generated), l* is the minimum
level of water in the hydro power plant reservoir in the last period T of the
time horizon, and P is the probability measure associated to the random vector ξ. As
in [1], we assume that the wind power generation follows a multivariate normal
distribution with mean vector μ and a positive definite correlation matrix Σ.
This assumption assures that problem (3) is convex. Theorem 1 ensures that
we can replace the probability measure P by a Zhang's copula C and, finally,
Theorem 2 confirms that Problem (7) is also convex; consequently, we can
use outer approximation to solve it.
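For readers who want to experiment with the deterministic part of model (17), the following Pyomo sketch builds the objective, the turbine bounds and the reservoir balance. It is a simplified illustration only: the joint chance constraint is not modeled here (in the paper it is replaced by the copula approximation and handled by the OA algorithm), the reservoir balance uses the 1/χ conversion as reconstructed above, and the per-period demand placeholder demand_res is a hypothetical stand-in, not part of the original model. The arguments price and demand_res are assumed to be dicts indexed by t = 1, ..., T.

import pyomo.environ as pyo

def build_hydro_model(T, price, demand_res, v_lo, v_up, l0, l_lo, l_up, l_star, omega, chi):
    m = pyo.ConcreteModel()
    m.T = pyo.RangeSet(1, T)
    m.x = pyo.Var(m.T, within=pyo.NonNegativeReals)   # hydro energy for local demand
    m.z = pyo.Var(m.T, within=pyo.NonNegativeReals)   # hydro energy sold to the market
    m.y = pyo.Var(m.T, within=pyo.Binary)             # turbine on/off

    m.profit = pyo.Objective(expr=sum(price[t] * m.z[t] for t in m.T), sense=pyo.maximize)

    # Turbine operational limits, active only when the turbine is on.
    m.turbine_lo = pyo.Constraint(m.T, rule=lambda m, t: v_lo * m.y[t] <= m.x[t] + m.z[t])
    m.turbine_up = pyo.Constraint(m.T, rule=lambda m, t: m.x[t] + m.z[t] <= v_up * m.y[t])

    # Reservoir level: initial level + inflow - water used for generation (energy / chi).
    def level(mod, t):
        return l0 + t * omega - (1.0 / chi) * sum(mod.x[tau] + mod.z[tau] for tau in range(1, t + 1))
    m.level_lo = pyo.Constraint(m.T, rule=lambda m, t: level(m, t) >= l_lo)
    m.level_up = pyo.Constraint(m.T, rule=lambda m, t: level(m, t) <= l_up)
    m.final_level = pyo.Constraint(expr=level(m, T) >= l_star)

    # Placeholder for the demand constraint that the paper handles probabilistically.
    m.demand = pyo.Constraint(m.T, rule=lambda m, t: m.x[t] >= demand_res[t])
    return m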

4.2 Problem’s Data


The demand considered in this problem was extracted from the ONS website
(www.ons.org.br). In our numerical tests, the considered daily demands corre-
sponds to eighty percent of averaged demand Southern region of Brazil, divided
by the number of cities in such a region. Note that we have only one city (or
region) and two sources of energy: hydro power plant and wind farm. The price
πt of energy is directly proportional to the demand and varies between 166.35
and 266.85 by MW\h.
The configuration of hydro power plant reservoir is mirrored from [1] and in
this problem is set as l = 5000 hm3 (cubic hectometre), ¯l = 10000 hm3 and l0 =
l∗ = 7500 hm3 . The amount of water inflow is a constant ω = 2500 hm3 each hour
and the conversion factor is χ = 0.0035 MWm3 . When the turbines are turned on,

the minimum power generation is 5 MW/h and the maximum generation is 20
MW/h. The marginal distributions F_{ξ_i} follow a normal distribution with mean
μ_i and variance σ_i².

5 Numerical Experiments
Numerical experiments were performed on a computer with an Intel(R) Core(TM)
i7-5500U CPU @ 2.40 GHz and 8 GB of RAM, under Windows 10 (64 bits); our algorithm
from [4] was coded in (or called from) MATLAB version 2017a.
We solved the power system management problem for T = 24 (one day) using
Zhang's copula to approximate the probability function, and then checked
the probability constraint of (17). For dimension T = 24 it was not
possible to solve Problem (17) within one hour of CPU time, given the considered
computer and software.
One of the difficulties in using copulae is to find coefficients that model the
probability constraint accurately. The number of parameters of Zhang's copula depends
on the size of the problem: if the random vector ξ has dimension T, then the
number of parameters is 1 + rT, namely r and a_{j,i} ≥ 0 with ∑_{j=1}^{r} a_{j,i} = 1 for all i = 1, . . . , T.
In this work we do not yet focus on the best choice of the copula parameters.
For now, we simply set r = 8, and the coefficients a_{j,i} were generated following
a uniform probability distribution with low sparseness. As shown below, this
simple choice gives satisfactory results.
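A minimal way to generate such coefficients is sketched below; the handling of "low sparseness" as a small fraction of entries zeroed out is our reading of the text, not a specification from the paper.

import numpy as np

def sample_zhang_parameters(r, T, sparseness=0.3, seed=None):
    """Draw nonnegative coefficients a[j, i] with each column summing to 1,
    starting from uniform entries and zeroing out a small fraction of them."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(size=(r, T))
    a[rng.uniform(size=(r, T)) < sparseness] = 0.0   # introduce a few zeros
    a[0, a.sum(axis=0) == 0.0] = 1.0                 # avoid all-zero columns
    return a / a.sum(axis=0, keepdims=True)          # normalize each column to sum 1

A = sample_zhang_parameters(r=8, T=24, seed=0)
assert np.allclose(A.sum(axis=0), 1.0)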
These nonsmooth convex mixed-integer programs were solved with the follow-
ing solvers, coded in (or called from) MATLAB version 2017a:

– OA: an implementation of the outer-approximation algorithm given in [4]
  (the classic algorithm);
– OA1: as solver OA, with integer iterates defined by solving the regularized MILP
  subproblem (16) with the ℓ1 norm. The stability center was set as the current
  iterate and the prox parameter as μ_k = 10 for all k;
– OA∞: as solver OA1, with the ℓ1 norm replaced by ℓ∞;
– OA2: as solver OA1, with the ℓ1 norm replaced by ℓ2. In this case, the subprob-
  lem defining the next iterate is no longer a MILP, but a MIQP.
We solved 21 problems, each of them based on a day of the week and using the
values 0.8, 0.9 and 0.95 for p.
Table 1 shows the performance of these algorithms on all problems. Although
some problems were not solved in the required time (one hour) by OA, the
regularized variants solved all problems with satisfactory results. It is important to
note that if we use the probability function instead of the copula, then no problem is
solved by any of the algorithms within the time limit of one hour.

Table 1. Number of MILPs and CPU time for p ∈ {0.8, 0.9, 0.95}

         Day        OA           OA1          OA∞          OA2
                    k    CPU     k    CPU     k    CPU     k    CPU
p = 0.8 Tuesday 1175 3395.79 39 141.08 539 1485.71 37 113.72
Wednesday 502 1327.55 44 140.20 147 349.76 47 102.97
Thursday 967 2958.82 57 174.08 452 1235.68 52 151.84
Friday 908 3600.89∗ 41 173.59 781 2971.03 48 152.14
Saturday 31 36.27 9 9.08 53 45.53 11 12.41
Sunday 197 407.31 13 21.33 79 79.91 13 19.61
Monday 653 3605.96∗ 23 96.57 1073 3347.86 28 87.82
p = 0.9 Tuesday 1204 3600.92∗ 32 179.63 390 1225.37 34 107.71
Wednesday 533 1510.74 41 152.42 157 465.13 52 120.90
Thursday 1031 3425.71 73 213.68 472 1329.76 58 176.13
Friday 820 3603.04∗ 42 146.48 793 3203.98 43 128.13
Saturday 29 37.92 9 10.44 33 27.35 11 15.92
Sunday 778 3582.22 15 46.50 461 1239.08 14 35.07
Monday 646 3609.38∗ 23 96.59 373 1288.81 28 86.89
p = 0.95 Tuesday 1165 3601.44∗ 32 204.71 394 1310.39 23 112.16
Wednesday 717 2129.51 56 181.47 398 1081.85 51 127.46
Thursday 1032 3600.68∗ 49 219.76 374 1280.56 41 99.64
Friday 771 3605.22∗ 42 123.34 789 3045.20 48 110.58
Saturday 38 94.58 9 19.13 14 22.51 10 20.93
Sunday 761 3401.13 15 48.91 123 356.63 14 45.96
Monday 622 3600.82∗ 22 98.68 986 3580.10 28 84.72
Sum 14580 15.20 h 686 0.69 h 8881 8.05 h 691 0.53 h
∗ the time limit was reached

6 Conclusion
In this work we showed that using copulae is an excellent alternative for solving
problems involving probabilistic constraints. In future work we will move further in this
direction and improve our numerical results.

References
1. Arnold, T., Henrion, R., Möller, A., Vigerske, S.: A mixed-integer stochastic non-
linear optimization problem with joint probabilistic constraints. Pac. J. Optim. 10,
5–25 (2014)
2. Birge, J., Louveaux, F.: Introduction to Stochastic Programming. Springer, New
York (2011)
3. de Oliveira, W.: Regularized optimization methods for convex MINLP problems.
TOP 24, 665–692 (2016)
Solving an MINLP with Chance Constraint Using a Zhang’s Copula Family 487

4. Delfino, A., de Oliveira, W.: Outer-approximation algorithms for nonsmooth con-


vex MINLP problem. Optimization 67, 797–819 (2018)
5. Duran, M.A., Grossmann, I.E.: An outer-approximation algorithm for a class of
mixed-integer nonlinear programs. Math. Program. 36, 307–339 (1986)
6. Eronen, V.P., Mäkelä, M.M., Westerlund, T.: On the generalization of ECP and
OA methods to nonsmooth convex MINLP problems. Optimization 63, 1057–1073
(2014)
7. Fletcher, R., Leyffer, S.: Solving mixed integer nonlinear programs by outer approx-
imation. Math. Program. 66, 327–349 (1994)
8. Nelsen, R.B.: An Introduction to Copulas. Springer, New York (2006)
9. Prékopa, A.: Stochastic Programming. Kluwer Academic Publishers, Dordrecht (1995)
10. Quesada, I., Grossmann, I.E.: An LP/NLP based branch and bound algorithm for
convex MINLP optimization problems. Comput. Chem. Eng. 16, 937–947 (1992)
11. Sklar, A.: Fonctions de répartition à n dimensions et leurs marges. Publications
de l'Institut de Statistique de l'Université de Paris 8, 229–231 (1959)
12. Song, Y., Luedtke, J., Küçükyavuz, S.: Chance-constrained binary packing prob-
lems. INFORMS J. Comput. 26, 735–747 (2014)
13. Vielma, J.P., Ahmed, S., Nemhauser, G.L.: Mixed integer linear programming
formulations for probabilistic constraints. Oper. Res. Lett. 40, 153–158 (2012)
14. Westerlund, T., Pettersson, F.: An extended cutting plane method for solving
convex MINLP problems. Comput. Chem. Eng. 19, 131–136 (1995)
15. Zhang, Z.: On approximating max-stable processes and constructing extremal copula
functions. Statistical Inference for Stochastic Processes 12, 89–114 (2009)
Stochastic Greedy Algorithm Is Still
Good: Maximizing Submodular +
Supermodular Functions

Sai Ji1 , Dachuan Xu1 , Min Li2 , Yishui Wang3(B) , and Dongmei Zhang4
1 College of Applied Sciences, Beijing University of Technology, Beijing 100124,
People's Republic of China
jisai@emails.bjut.edu.cn, xudc@bjut.edu.cn
2 School of Mathematics and Statistics, Shandong Normal University,
Jinan, People's Republic of China
liminEmily@sdnu.edu.cn
3 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences,
Shenzhen 518055, People's Republic of China
ys.wang1@siat.ac.cn
4 School of Computer Science and Technology, Shandong Jianzhu University,
Jinan 250101, People's Republic of China
zhangdongmei@sdjzu.edu.cn

Abstract. In this paper, we consider the problem of maximizing the
sum of a submodular and a supermodular (BP) function (both non-
negative) under a cardinality constraint and a p-system constraint, respec-
tively, which arises in many real-world applications such as data science,
machine learning and artificial intelligence. The greedy algorithm is widely
used to design approximation algorithms. However, in many appli-
cations, evaluating the value of the objective function is expensive. In
order to avoid a waste of time and money, we propose a Stochastic-
Greedy (SG) algorithm, a Stochastic-Standard-Greedy (SSG) algorithm
as well as a Random-Greedy (RG) algorithm for the monotone BP maximization
problems under cardinality constraint and p-system constraint, as well as the
non-monotone BP maximization problem under cardinality constraint,
respectively. The SSG algorithm also works well on the monotone BP
maximization problem under cardinality constraint. Numerical experi-
ments for monotone BP maximization under cardinality constraint
are made to compare the SG algorithm with the SSG algorithm from
previous works. The results show that the guarantee of the SG algorithm
is worse than that of the SSG algorithm, but the SG algorithm is faster than the
SSG algorithm, especially for large-scale instances.

Keywords: BP maximization · Stochastic greedy · Approximation


algorithm

Supported by Higher Educational Science and Technology Program of Shandong


Province (No. J17KA171) and Natural Science Foundation of China (Grant Nos.
11531014, 11871081, 61433012, U1435215).

1 Introduction
Maximizing a submodular function subject to independence constraint is related
to many machine learning and data science applications such as information
gathering [13], document summarization [14], image segmentation [12], and PAC
learning [16]. There are many studies about variants of submodular maximization
[2,4,8,11].
However, many subset selection problems in data science are not sub-
modular [22]. In this paper, we study the constrained maximization of an objec-
tive that may be decomposed into the sum of a submodular and a supermodular
function. That is, we consider the following problem:

arg max_{X∈C} h(X) = f(X) + g(X),

where f is a normalized submodular function and g is a normalized supermodu-
lar function. We call this problem the submodular-supermodular (BP) maximization
problem, and f + g a BP function [1]. We say a function h admits a BP decom-
position if there exist f, g such that h = f + g, where f is a submodular function and g is
a supermodular function. Not all monotone functions admit a BP decomposition,
and there are instances of the BP problem that cannot be approximated
to any positive factor in polynomial time [1]. Thus, in this paper, we only study
BP maximization problems which can be decomposed and approximated.
When g is modular, there is an extensive literature on the submodular maximiza-
tion problem [6,10,17,21]. If h is monotone, the greedy algorithm is guaranteed
to obtain a (1 − e^{−1})-approximation subject to the cardinality constraint [17],
and this result is known to be tight. It also achieves a 1/(p+1)-approximation for
p matroids [9]. Based on the definition of curvature, the greedy algorithm has
a (1/K_h)(1 − e^{−K_h}) guarantee for the cardinality constraint and a 1/(K_h + p) guaran-
tee for the matroid constraint [7]. The Stochastic-Greedy algorithm achieves a
(1 − e^{−1} − ε) guarantee for monotone submodular maximization with car-
dinality constraint [15]. If h is non-monotone, the Random-Greedy algorithm
achieves an e^{−1}-approximation subject to the cardinality constraint [5].
When g is not modular, there is also some good work on the non-submodular
maximization problem [3,18,20]. In particular, when g is supermodular, Bai et
al. [1] provide a (1/K_f)[1 − e^{−(1−K^g)K_f}]-approximation algorithm for the mono-
tone BP maximization problem under cardinality constraint and a
(1 − K^g)/((1 − K^g)K_f + p)-approximation algorithm for the monotone BP maximization problem under p
matroid independence constraints.
In this paper, we consider the monotone BP maximization problem subject
to a cardinality constraint, the monotone BP maximization problem subject to
a p-system constraint, and the non-monotone BP maximization problem subject
to a cardinality constraint, where both the submodular and the supermodular
functions are non-negative. For each problem, we provide a stochastic greedy
algorithm and give the corresponding theoretical analysis.

2 Preliminaries
Given a set V = {v_1, v_2, . . . , v_n}, denote f_v(X) = f(X ∪ {v}) − f(X) as the
marginal gain of adding the item v to the set X ⊂ V. A function f is monotone
if for any X ⊆ Y, f(X) ≤ f(Y). Without loss of generality, we assume that
monotone functions are normalized, i.e., f(∅) = 0.

Definition 1. (Submodular curvature [7]) The curvature K_f of a submodular
function f is defined as

K_f = 1 − min_{v∈V} f_v(V\{v}) / f(v).

Definition 2. (Supermodular curvature [7]) The curvature K^g of a supermodu-
lar function g is defined as

K^g = 1 − min_{v∈V} g(v) / g_v(V\{v}).

From Definitions 1 and 2, we have that 0 ≤ K_f, K^g ≤ 1. In this paper, we
study the case that 0 < K_f < 1 and 0 < K^g < 1.
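For small ground sets, both curvatures can be computed by brute force directly from Definitions 1 and 2; the Python sketch below does this for set functions given as callables. The modular/quadratic example functions at the end are illustrative choices of ours, not instances from the paper.

def curvatures(f, g, V):
    """Curvatures K_f (Definition 1) and K^g (Definition 2), assuming f({v}) > 0
    and g_v(V\{v}) > 0 for every item v; f and g map frozensets to floats."""
    V = frozenset(V)
    def marginal(h, v, X):
        return h(X | {v}) - h(X)
    K_f = 1 - min(marginal(f, v, V - {v}) / f(frozenset({v})) for v in V)
    K_g = 1 - min(g(frozenset({v})) / marginal(g, v, V - {v}) for v in V)
    return K_f, K_g

# Illustrative example: a modular f (curvature 0) and a supermodular g = (sum of weights)^2.
w = {1: 1.0, 2: 2.0, 3: 3.0}
f = lambda S: sum(w[v] for v in S)
g = lambda S: sum(w[v] for v in S) ** 2
print(curvatures(f, g, {1, 2, 3}))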

3 Algorithms
In this section, we provide the SG algorithm, the SSG algorithm and the RG
algorithm for the monotone BP maximization problem subject to a cardinality
constraint, the monotone BP maximization problem subject to a p-system constraint,
and the non-monotone BP maximization problem subject to a cardinality constraint,
respectively. These three algorithms are iterative processes. In each iteration,
the SG algorithm samples a candidate set of size ⌈(n/k) ln(1/ε)⌉ uniformly and ran-
domly from the set V \ (current solution), and then chooses the item from this
candidate set with the maximum marginal gain. The SSG algorithm selects one item
whose marginal gain is at least ξ ∈ (0, 1] times the largest marginal gain value,
where ξ comes from a distribution D. The RG algorithm chooses
an item uniformly and randomly from a set of size k with the largest summation of
individual marginal gain values.
The detailed algorithms are shown below.

3.1 SG Algorithm
Similar to Lemma 2 of [15], we have the following lemma, which estimates the
lower bound of the expected gain in the (i+1)-th step and reveals the relationship
between the current solution S and the optimal solution S*.

Lemma 1. Let t = ⌈(n/k) ln(1/ε)⌉. The expected gain of Algorithm 1 at the (i + 1)-th
step is at least

(1 − ε)/|S*\S_i| · ∑_{s∈S*\S_i} h_s(S_i),

where S ∗ is the optimal solution and Si is the subset obtained by Algorithm 1


after i steps.

Algorithm 1 SG algorithm for monotone BP maximization with cardinality constraint
Input: a monotone submodular function f , a monotone supermodular function g, a
ground set V , and a budget k
Output: a subset S of V with k items
1: S ← ∅, i ← 1
2: while i ≤ k do
3: R ← a random subset obtained by sampling t random items from V \S
4: si ← arg maxs∈R hs (S)
5: S ← S ∪ {si }, i ← i + 1
6: end while
7: return S
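A compact Python sketch of Algorithm 1, with the value oracle h passed as a callable, may look as follows; the names and the default ε are our choices, not from the paper.

import math
import random

def stochastic_greedy(h, V, k, eps=0.1, rng=random):
    """SG sketch: in each of k steps, sample about (n/k)*ln(1/eps) items from V\S
    and add the one with the largest marginal gain; h maps a set to a float."""
    V = list(V)
    t = int(math.ceil(len(V) / k * math.log(1.0 / eps)))
    S = set()
    for _ in range(k):
        candidates = [v for v in V if v not in S]
        R = rng.sample(candidates, min(t, len(candidates)))   # random candidate set
        best = max(R, key=lambda v: h(S | {v}) - h(S))        # best marginal gain in R
        S.add(best)
    return S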

Lemma 2. Let S_i (0 ≤ i ≤ k) be the subsets obtained by Algorithm 1 after the i-th
step. Then the following holds for all 0 ≤ i ≤ k − 1:

h(S*) ≤ K_f ∑_{j: s_j ∈ S_i\S*} a_j + ∑_{j: s_j ∈ S_i ∩ S*} a_j + (k − |S* ∩ S_i|)/((1 − K^g)(1 − ε)) · a_{i+1},

where S* is the optimal solution, {s_i} = S_i\S_{i−1}, a_i = E[h_{s_i}(S_{i−1})], K_f is the
curvature of the submodular function f, and K^g is the curvature of the supermodular
function g.

Combining Lemma C.1 and Lemma D.2 of [1] with Lemmas 1–2, we obtain the
following theorem, which estimates the approximation ratio of the SG algorithm.

Theorem 1. Let t = ⌈(n/k) ln(1/ε)⌉. For the monotone BP maximization problem with
cardinality constraint, Algorithm 1 finds a subset S ⊆ V with |S| = k and

E[h(S)] ≥ (1/K_f) (1 − e^{−(1−K^g)K_f} − ε) h(S*),

where K_f and K^g are the curvature of the submodular function f and the curvature
of the supermodular function g, respectively.

3.2 SSG Algorithm


Lemma 3. [9] For δ_i, ρ_i ≥ 0 with 0 ≤ i ≤ k − 1, if ∑_{i=0}^{t−1} δ_i ≤ t
for 1 ≤ t ≤ k and ρ_{i−1} ≥ ρ_i for 1 ≤ i ≤ k − 1, then ∑_{i=0}^{k−1} δ_i ρ_i ≤ ∑_{i=0}^{k−1} ρ_i.
492 S. Ji et al.

Algorithm 2 SSG algorithm for monotone BP maximization with p-system constraint
Input: a monotone submodular function f , a monotone supermodular function g, an
independence system (V, I) and a distribution D
Output: a base of V
1: S ← ∅, U ← V
2: repeat
3:   U ← {v ∈ U | S ∪ {v} ∈ I}
4:   if U ≠ ∅ then
5:     ξ ← randomly sampled from D
6:     s* ← an arbitrary item from U s.t. h_{s*}(S) ≥ ξ · max_{s∈U} h_s(S)
7:     S ← S ∪ {s*}
8:   end if
9: until U = ∅
10: return S
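A possible Python rendering of Algorithm 2 is sketched below; the independence system and the distribution D are supplied as callables (is_independent and sample_xi are hypothetical names of ours).

import random

def stochastic_standard_greedy(h, V, is_independent, sample_xi, rng=random):
    """SSG sketch: while an item can still be added without violating the independence
    system, draw xi ~ D and add any item whose marginal gain reaches xi times the
    best marginal gain; h maps a set to a float."""
    S = set()
    while True:
        U = [v for v in V if v not in S and is_independent(S | {v})]
        if not U:
            break
        xi = sample_xi()                                   # xi in (0, 1]
        gains = {v: h(S | {v}) - h(S) for v in U}
        threshold = xi * max(gains.values())
        eligible = [v for v in U if gains[v] >= threshold]
        S.add(rng.choice(eligible))
    return S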

Similar to Theorem 3 of [19], we obtain the following theorem, based on Lemma 3,
to estimate the approximation ratio of the SSG algorithm.

Theorem 2. For the monotone BP maximization problem with p-system constraint,
Algorithm 2 finds a basis S of V with

E[h(S)] ≥ (1 − K^g)² μ / (p + (1 − K^g)² μ) · h(S*),

where S*, K_f, K^g and μ are the optimal solution, the curvature of the submodular
function f, the curvature of the supermodular function g and the expectation of
ξ ∼ D, respectively.

3.3 RG Algorithm
Lemma 4. Let h = f + g, where f is a submodular function and g is a supermodular
function. Denote by A(p) a random subset of A in which each element appears with
probability at most p (not necessarily independently). Then we have

E[h(A(p))] ≥ [1 − (1 − K^g)p] h(∅),

where K^g is the curvature of the supermodular function g.

From Lemma 4, we get the following lemma, which is crucial to the analysis
of the RG algorithm.

Lemma 5. Let h = f + g, where f is a submodular function and g is a super-
modular function. For any 1 ≤ i ≤ k, we have

E[h(S* ∪ {s_i})] / h(S*) ≥ 1 − (1 − K^g)[1 − (1 − 1/k)^i],

where K^g is the curvature of the supermodular function g and s_i is the item obtained
by Algorithm 3 at the i-th step.
Stochastic Greedy Algorithm Is Still Good: Maximizing 493

Algorithm 3 RG algorithm for non-monotone BP maximization with cardinality constraint
Input: a submodular function f , a supermodular function g, where the function f + g is non-
monotone, a ground set V , and a budget k
Output: a subset S of V with k items
1: S ← ∅, i ← 1
2: while i ≤ k do
3:   M_i ← a subset of V \S of size k maximizing ∑_{s∈M_i} h_s(S)
4:   s_i ← uniformly and randomly chosen from M_i
5:   S ← S ∪ {s_i}, i ← i + 1
6: end while
7: return S
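Similarly, Algorithm 3 can be sketched in Python as follows; note that selecting the size-k set M_i with the largest summed marginal gain amounts to taking the k items with the largest individual marginal gains.

import random

def random_greedy(h, V, k, rng=random):
    """RG sketch: repeat k times -- take the k items of largest marginal gain and
    add one of them uniformly at random; h maps a set to a float."""
    S = set()
    for _ in range(k):
        candidates = [v for v in V if v not in S]
        M = sorted(candidates, key=lambda v: h(S | {v}) - h(S), reverse=True)[:k]
        S.add(rng.choice(M))
    return S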

Theorem 3. For the non-monotone BP maximization problem with cardinality con-
straint, Algorithm 3 finds a subset S ⊆ V with |S| = k and

E[h(S)] ≥ (1 − K^g) e^{−1} h(S*),

where S* is the optimal solution and K^g is the curvature of the supermodular func-
tion g.

Proof. Denote by S_i the subset obtained by Algorithm 3 after i steps. For each
1 ≤ i ≤ k, consider a set containing the elements of S*\S_{i−1} plus enough
dummy elements to make its size exactly k. Recalling Algorithm 3 and
Lemma C.1 of [1], and noting that dummy elements contribute nothing, we have

E[h_{s_i}(S_{i−1})] = k^{−1} ∑_{s∈M_i} h_s(S_{i−1}) ≥ k^{−1} ∑_{s∈S*\S_{i−1}} h_s(S_{i−1})
                 ≥ (1 − K^g) [h(S* ∪ S_{i−1}) − h(S_{i−1})] / k.

Combining this with Lemma 5, we have

E[h(S_i)] = E[h(S_{i−1})] + E[h_{s_i}(S_{i−1})]
          ≥ E[h(S_{i−1})] + (1 − K^g)[h(S* ∪ S_{i−1}) − h(S_{i−1})] / k          (1)
          ≥ (1 − (1 − K^g)/k) E[h(S_{i−1})] + ((1 − K^g)/k)(1 − 1/k)^{i−1} h(S*).
By (1), we have

E[h(S)] = E[h(S_k)] ≥ (1 − (1 − K^g)/k)^k h(∅)
          + [ (1 − (1 − K^g)/k)^{k−1} + (1 − (1 − K^g)/k)^{k−2}(1 − 1/k)
            + (1 − (1 − K^g)/k)^{k−3}(1 − 1/k)² + . . . + (1 − 1/k)^{k−1} ] ((1 − K^g)/k) h(S*)
          ≥ k (1 − 1/k)^{k−1} ((1 − K^g)/k) h(S*)
          ≥ (1 − K^g) e^{−1} h(S*).
The theorem is proved.

4 Numerical Experiments
In this section, we perform some numerical experiments to compare the Stochastic-
Greedy algorithm (Algorithm 1) and the Stochastic-Standard-Greedy algorithm
(see reference [1]) for the BP maximization problem subject to a cardinality
constraint. We use the same model as in Bai et al. [1]. In this model, the ground
set V is partitioned into V_1 = {v_1, . . . , v_k} and V_2 = V \V_1. The submodular
function is defined as

f(S) := λ · [ (k − α|S ∩ V_2|)/k · ∑_{i: v_i ∈ S} w_i + |S ∩ V_2|/k ],

and the supermodular function is defined as

g(S) := (1 − λ) · [ |S| − β min(|S ∩ V_1| + 1, |S|, k) + ε max(|S|, |S| + β/(1 − β) (|S ∩ V_2| − k + 1)) ],

where α ∈ (0, 1], β ∈ [0, 1), λ ∈ [0, 1], and ε = 10^{−5}. It is easy to see that the
curvatures of f and g are α and β, respectively.
In our experiments, we set n = 300, k = 150, α = 0.05, 0.1, 0.15, . . . , 1,
β = 0, 0.05, 0.1, . . . , 0.95, and λ = 0.1, 0.3, . . . , 0.9. For the Stochastic-Greedy
algorithm, we repeat 10 runs and compute the average value for each setting.
Results of guarantee are shown in Fig. 1, and running times are shown in Table 1.
It is not surprising that the Stochastic-Standard-Greedy algorithm performs bet-
ter than the Stochastic-Greedy algorithm, since the Stochastic-Standard-Greedy
algorithm uses a larger candidate set in each iteration. However, the Stochastic-
Greedy algorithm is faster than the Stochastic-Standard-Greedy algorithm, espe-
cially for large sizes, where the gap in running time between these two algorithms
is very large. Moreover, it is interesting that, for fixed λ and K^g, a lower K_f
yields a smaller gap in value between the two algorithms, and a larger λ (meaning the
function h is closer to being submodular) yields a smaller gap in value between the two
algorithms for large K_f.

Fig. 1. Guarantees of the Stochastic-Standard-Greedy and Stochastic-Greedy algorithms
(objective value versus the curvature parameters, for λ = 0.1, 0.3, 0.5, 0.7, 0.9)

Table 1. Running times of the Stochastic-Standard-Greedy and Stochastic-Greedy algorithms

Size      λ    Minimal time (s)      Maximal time (s)      Average time (s)
               Standard  Stochastic  Standard  Stochastic  Standard  Stochastic
               greedy    greedy      greedy    greedy      greedy    greedy
n = 300   0.1  8.2265    0.0923      8.7303    0.0969      8.4784    0.0943
k = 150   0.3  8.2101    0.0921      8.7409    0.0988      8.4771    0.0943
          0.5  8.2262    0.0923      8.6891    0.0981      8.4800    0.0943
          0.7  8.2150    0.0921      8.6225    0.0969      8.4808    0.0943
          0.9  8.2512    0.0922      8.7408    0.1136      8.4816    0.0944

5 Conclusion
In this paper, we considered the monotone BP maximization problem subject to
a cardinality constraint and a p-system constraint, respectively, as well as the
non-monotone BP maximization problem subject to a cardinality constraint.
For each problem, we gave a stochastic algorithm. The theoretical analysis indi-
cates that the stochastic algorithms work well on BP maximization prob-
lems. Numerical experiments show that the algorithms are effective. There are
two possible future research directions. One is to design better stochastic algo-
rithms for the BP maximization problem subject to a cardinality constraint or
a p-system constraint. Another direction is to study other variants of the constrained
submodular maximization problem.

References
1. Bai, W., Bilmes, J.A.: Greed is still good: maximizing monotone submodular+
supermodular functions (2018). arXiv preprint arXiv:1801.07413
2. Bian, A., Levy, K., Krause, A., Buhmann, J.M.: Continuous dr-submodular maxi-
mization: structure and algorithms. In: Advances in Neural Information Processing
Systems, pp. 486–496 (2017)
3. Bian, A.A., Buhmann, J.M., Krause, A., Tschiatschek, S.: Guarantees for
greedy maximization of non-submodular functions with applications (2017). arXiv
preprint arXiv:1703.02100
4. Bogunovic, I., Zhao, J., Cevher, V.: Robust maximization of non-submodular
objectives (2018). arXiv preprint arXiv:1802.07073
5. Buchbinder, N., Feldman, M., Naor, J.S., Schwartz, R.: Submodular maximization
with cardinality constraints. In: Proceedings of the Twenty-Fifth Annual ACM-
SIAM Symposium on Discrete Algorithms, pp. 1433–1452 (2014)
6. Chekuri, C., Vondrák, J., Zenklusen, R.: Submodular function maximization via
the multilinear relaxation and contention resolution schemes. SIAM J. Comput.
43(6), 1831–1879 (2014)
7. Conforti, M., Cornuéjols, G.: Submodular set functions, matroids and the greedy
algorithm: tight worst-case bounds and some generalizations of the rado-edmonds
theorem. Discret. Appl. Math. 7(3), 251–274 (1984)
8. Epasto, A., Lattanzi, S., Vassilvitskii, S., Zadimoghaddam, M.: Submodular opti-
mization over sliding windows. In: Proceedings of the 26th International Conference
on World Wide Web, pp. 421–430 (2017)
9. Fisher, M.L., Nemhauser, G.L., Wolsey, L.A.: An analysis of approximations for
maximizing submodular set functions - II. Polyhedral Combinatorics, pp. 73–87
(1978)
10. Iwata, S., Orlin, J.B.: A simple combinatorial algorithm for submodular function
minimization. In: Proceedings of the Twentieth Annual ACM-SIAM Symposium
on Discrete Algorithms, pp. 1230–1237 (2009)
11. Kawase, Y., Sumita, H., Fukunaga, T.: Submodular maximization with uncertain
knapsack capacity. In: Latin American Symposium on Theoretical Informatics, pp.
653–668 (2018)
12. Kohli, P., Kumar, M.P., Torr, P.H.: P3 & beyond: move making algorithms for
solving higher order functions. IEEE Trans. Pattern Anal. Mach. Intell. 31(9),
1645–1656 (2009)
13. Krause, A., Guestrin, C., Gupta, A., Kleinberg, J.: Near-optimal sensor placements:
maximizing information while minimizing communication cost. In: Proceedings of
the 5th International Conference on Information Processing in Sensor Networks,
pp. 2–10 (2006)
Stochastic Greedy Algorithm Is Still Good: Maximizing 497

14. Lin, H., Bilmes, J.: A class of submodular functions for document summarization.
In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pp. 510–520 (2011)
15. Mirzasoleiman, B., Badanidiyuru, A., Karbasi, A., Vondrák, J., Krause, A.: Lazier
than lazy greedy. In: AAAI, pp. 1812–1818 (2015)
16. Narasimhan, M., Bilmes, J.: PAC-learning bounded tree-width graphical models.
In: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence,
pp. 410–417 (2004)
17. Nemhauser, G.L., Wolsey, L.A., Fisher, M.L.: An analysis of approximations for
maximizing submodular set functions - I. Math. Prog. 14(1), 265–294 (1978)
18. Niazadeh, R., Roughgarden, T., Wang, J.R.: Optimal algorithms for continu-
ous non-monotone submodular and dr-submodular maximization (2018). arXiv
preprint arXiv:1805.09480
19. Qian, C., Yu, Y., Tang, K.: Approximation guarantees of stochastic greedy algo-
rithms for subset selection. In: Proceedings of the Twenty-Seventh International
Joint Conference on Artificial Intelligence IJCAI, pp. 1478–1484 (2018)
20. Schoenebeck, G., Tao, B.: Beyond worst-case (in) approximability of nonsubmod-
ular influence maximization. In: International Conference on Web and Internet
Economics, pp. 368–382 (2017)
21. Sviridenko, M.: A note on maximizing a submodular set function subject to a
knapsack constraint. Oper. Res. Lett. 32(1), 41–43 (2004)
22. Wei, K., Iyer, R., Bilmes, J.: Submodularity in data subset selection and active
learning. In: International Conference on Machine Learning, pp. 1954–1963 (2015)
Towards Multi-tree Methods
for Large-Scale Global Optimization

Pavlo Muts(B) and Ivo Nowak

Hamburg University of Applied Sciences, Hamburg, Germany


{pavlo.muts,ivo.nowak}@haw-hamburg.de

Abstract. In this paper, we present a new multi-tree approach for solv-


ing large scale Global Optimization Problems (GOP), called DECOA
(Decomposition-based Outer Approximation). DECOA is based on
decomposing a GOP into sub-problems, which are coupled by linear con-
straints. It computes a solution by alternately solving sub- and master-
problems using Branch-and-Bound (BB). Since DECOA does not use a
single (global) BB-tree, it is called a multi-tree algorithm. After formu-
lating a GOP as a block-separable MINLP, we describe how piecewise
linear Outer Approximations (OA) can be computed by reformulating
nonconvex functions as a Difference of Convex functions. This is followed
by a description of the main- and sub-algorithms of DECOA, including
a decomposition-based heuristic for finding solution candidates. Finally,
we present preliminary results with MINLPs and conclusions.

Keywords: Global optimization · Decomposition method ·


Mixed-integer nonlinear programming

1 Introduction
We consider block-separable (or quasi-separable) MINLP problems of the form

min cT x s. t. x ∈ P, xk ∈ Xk , k ∈ K (1)

with

P := {x ∈ Rn : aTj x ≤ bj , j ∈ J}
Xk := Gk ∩ Lk ∩ Yk , (2)

where

Gk := {y ∈ [xk , xk ] ⊂ Rnk : gkj (y) ≤ 0, j ∈ [mk ]},


Lk := {y ∈ [xk , xk ] ⊂ Rnk : aTkj y ≤ bkj , j ∈ Jk },
Yk := {y ∈ Rnk : yi ∈ Z, i ∈ Ik }. (3)


The vector of variables x ∈ R^n is partitioned into |K| blocks such that n =
∑_{k∈K} n_k, where n_k is the dimension of the k-th block, and x_k ∈ R^{n_k} denotes
the variables of the k-th block. The vectors x̲, x̄ ∈ R^n denote lower and upper
bounds on the variables.
The linear constraints defining the polytope P are called global. The con-
straints defining the sub-sets X_k are called local. The set X_k is defined by nonlinear
local constraints, denoted by G_k, by linear local constraints, denoted by L_k,
and by integrality constraints, denoted by Y_k. In this paper, all the nonlinear
local constraint functions g_kj : R^{n_k} → R are assumed to be bounded and con-
tinuously differentiable within the set [x̲_k, x̄_k]. The linear global constraints P are
defined by a_j ∈ R^n, b_j ∈ R, j ∈ J, and the linear local constraints L_k are defined by
a_kj ∈ R^{n_k}, b_kj ∈ R, j ∈ J_k. The set Y_k defines the integrality of the variables
x_ki, i ∈ I_k, where I_k is an index set. The linear objective function is defined by
c^T x := ∑_{k∈K} c_k^T x_k, c_k ∈ R^{n_k}, and the matrix A_k ∈ R^{m×n_k}, with m = |J| + |J_k|, is
defined by the columns with the indices of the k-th block. Furthermore, we define the sets

G := ∏_{k∈K} G_k,   Y := ∏_{k∈K} Y_k,   X := ∏_{k∈K} X_k.          (4)

Note that it is possible to reformulate a general sparse MINLP defined by fac-


torable functions gkj as a block-separable optimization problem with a given
maximum block-size nk by adding new variables and copy-constraints [5,7,8].
Multi-tree Decomposition Algorithms. Decomposition is a very general approach
that can be applied to convex optimization, as well as non-convex optimization
and discrete optimization. These methods are based on dividing a model into
smaller sub-problems, which can be solved in parallel. The solutions of the sub-
problems are used for updating a global master problem. If the master problem
is a MIP, this type of strategy is called multi-tree, because an individual branch-
and-bound tree is built for each MIP instance. Using one global master problem which
is updated during the solution process (i.e. new constraints are added in order to
improve it) is called a single-tree
strategy. More discussion on single-tree and multi-tree approaches can be found
in [3].

2 Polyhedral Outer-Approximation

Fundamental for an OA solver is a method for computing a polyhedral OA of the
feasible set G of problem (1) with arbitrary precision. An example of an OA
master problem is given by

min c^T x   s.t.   x ∈ P,  x_k ∈ X̂_k,  k ∈ K,          (5)

where

X̂_k := Ŷ_k ∩ L_k ∩ Ĝ_k,   with   Ŷ_k := Y_k or Ŷ_k := R^{n_k}.          (6)

The sets Ĝ_k ⊇ G_k and X̂_k ⊇ X_k denote a polyhedral OA of G_k and X_k, respec-
tively. Note that X̂ := ∏_{k∈K} X̂_k and Ĝ := ∏_{k∈K} Ĝ_k.

2.1 Piecewise DC Outer Approximation


Consider a DC formulation (Difference of Convex functions)

gkj (x) = hkj (x) − qkj (x),

defined by the convexified nonlinear and quadratic functions



h_kj(x) := g_kj(x) + q_kj(x)   and   q_kj(x) := σ_kj ∑_{i∈I_kj} φ_ki(x_i),          (7)

where φ_ki(x_i) := x_i² and I_kj = {i : ∂g_kj/∂x_i ≠ const} denotes the index set of nonlinear
variables of the constraint function g_kj.
The convexification parameters are computed by σ_kj = max{0, −v_kj}, where
v_kj is a lower bound of the optimal value of the following nonlinear eigenvalue
problem

min y^T H_kj(x) y   s.t.   x ∈ [x̲_k, x̄_k],  y ∈ R^{n_k},  ‖y‖_2 = 1,          (8)

with H_kj = ∇²g_kj.


A convex polyhedral underestimator of hkj is defined by

ȟkj (x) = max h̄kj,ŷ (x), (9)


ŷ∈Tk

where
h̄kj (x, ŷ) := hkj (ŷ) + ∇hkj (ŷ)T (x − ŷ)
denotes the linearization of hkj at the sample point ŷ ∈ Tk ⊂ Rnk .
A piecewise linear overestimator q̌_kj(x) of q_kj is defined by replacing φ_ki by

φ̌_ki(x_i) := φ_ki(p_ki,t) (p_ki,t+1 − x_i)/(p_ki,t+1 − p_ki,t) + φ_ki(p_ki,t+1) (x_i − p_ki,t)/(p_ki,t+1 − p_ki,t),          (10)

where x_i ∈ [p_ki,t, p_ki,t+1], t ∈ {1, . . . , |B_ki| − 1}, regarding the breakpoints B_ki :=
{p_ki,1, . . . , p_ki,|B_ki|}. Then a DC polyhedral underestimator ǧ_kj of g_kj is given

{pki,1 , . . . , pki,|Bki | }. Then a DC polyhedral underestimator ǧkj of gkj is given
by

ǧkj (x) := ȟkj (x) − q̌kj (x). (11)

A DC-based OA of G is denoted by

Ĝ_k = {x_k ∈ L_k : (x_k, r_k) ∈ Č_k ∩ Q̌_k},          (12)

where

Č_k := {y ∈ [x̲_k, x̄_k], r_k ∈ R^{n_k} : ȟ_kj(y) − σ_kj ∑_{i∈I_kj} r_ki ≤ 0},
Q̌_k := {y ∈ [x̲_k, x̄_k], r_k ∈ R^{n_k} : r_k − φ̌_k(y) ≤ 0}.

The polytope Č_k is defined by linearization cuts as in (9) and the set Q̌_k is
defined by breakpoints B_k as in (10).
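The following Python sketch illustrates the two ingredients of this DC-based OA for a single constraint: the piecewise-linear overestimator (10) of φ(x) = x² on a breakpoint grid and the DC underestimator (11) built from linearization cuts of the convexified function. The g.value/g.gradient interface and all names are assumptions for illustration, not the Decogo implementation.

import numpy as np

def phi_check(xi, breakpoints):
    """Piecewise-linear overestimator (10) of phi(x) = x^2 on a sorted breakpoint grid."""
    b = np.asarray(breakpoints, dtype=float)
    t = int(np.clip(np.searchsorted(b, xi, side="right") - 1, 0, len(b) - 2))
    lo, hi = b[t], b[t + 1]
    lam = (hi - xi) / (hi - lo)
    return lam * lo**2 + (1.0 - lam) * hi**2        # interpolation of phi at the breakpoints

def g_check(x, g, sigma, nonlinear_idx, cut_points, breakpoints):
    """DC underestimator (11): max of linearizations of h = g + sigma*sum(x_i^2),
    minus the piecewise-linear overestimator of sigma*sum(x_i^2)."""
    x = np.asarray(x, dtype=float)
    def h(y):
        return g.value(y) + sigma * sum(y[i] ** 2 for i in nonlinear_idx)
    def grad_h(y):
        gr = np.array(g.gradient(y), dtype=float)
        for i in nonlinear_idx:
            gr[i] += 2.0 * sigma * y[i]
        return gr
    h_under = max(h(yh) + grad_h(yh) @ (x - yh) for yh in cut_points)   # cuts as in (9)
    q_over = sigma * sum(phi_check(x[i], breakpoints[i]) for i in nonlinear_idx)
    return h_under - q_over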

3 DECOA
In this section, we describe the DECOA (Decomposition-based Outer Approxi-
mation) algorithm for solving (1), depicted in Algorithm 1.

Algorithm 1. Main algorithm of DECOA

1: function OaSolve
2:   (x̂, Ĉ, L, B) ← initOa
3:   v ← ∞
4:   repeat
5:     (x̃, Ĉ, B) ← oaLocalSearch(x̂, B)
6:     (Ĉ, B) ← fixAndRefine(x̃)
7:     if x̃ ∈ X and c^T x̃ < v then
8:       x* ← x̃, v ← c^T x*
9:     if v − c^T x̂ < ε then return (x̂, x*)
10:    (x̲, x̄, Ĉ) ← tightenBounds(x*)
11:    x̂ ← solveOa(Ĉ, B)
12:    for k ∈ K do ŷ_k ← project(x̂_k)
13:    (Ĉ, B) ← addCutsAndPoints(x̂, ŷ, B)
14:  until v − c^T x̂ < ε
15:  return (x̂, x*)

The algorithm starts by computing an initial OA, defined by Ĉ, breakpoints B and L, and a
solution estimate x̂, and sets the upper objective bound v. Then it computes a
solution candidate x̃ by calling the procedure oaLocalSearch. Using x̃, the outer
approximation Ĉ is improved by calling the procedure fixAndRefine. If the solution
point x̃ of problem (21) improves the best solution candidate, i.e. x̃ ∈ X and
c^T x̃ < v, the point x̃ becomes the new solution candidate of problem (1), denoted by x*.
Moreover, we update v to c^T x*. The OA master problem (5) is solved by call-
ing the procedure solveOa. Furthermore, in order to refine the OA, the following
projection sub-problem is solved for k ∈ K:

ŷ_k = argmin ‖x_k − x̂_k‖²   s.t.   x_k ∈ G_k,  x_ki = x̂_ki, i ∈ I_k,          (13)

by calling the procedure project(x̂_k). The points x̂_k and ŷ_k are used by the method
addCutsAndPoints for cut and breakpoint generation. The algorithm itera-
tively performs these steps until a stopping criterion is fulfilled.
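As an illustration of the projection sub-problem (13) for one block, the sketch below minimizes the squared distance to x̂_k subject to the nonlinear constraints, with the integer components fixed at their current values. The use of SciPy's SLSQP solver and all names are our choices for the sketch, not part of DECOA or Decogo.

import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def project_onto_block(x_hat, g_funcs, bounds, integer_idx):
    """Sketch of (13): nearest point of G_k = {y : g_kj(y) <= 0} to x_hat,
    keeping the integer components fixed; g_funcs is a list of callables."""
    x_hat = np.asarray(x_hat, dtype=float)
    cons = [NonlinearConstraint(gj, -np.inf, 0.0) for gj in g_funcs]
    # Fix integer variables by collapsing their bounds to the current value.
    bnds = [(x_hat[i], x_hat[i]) if i in integer_idx else bounds[i] for i in range(x_hat.size)]
    res = minimize(lambda y: np.sum((y - x_hat) ** 2), x_hat,
                   method="SLSQP", bounds=bnds, constraints=cons)
    return res.x

# Hypothetical usage: one convex constraint y_0^2 + y_1^2 - 4 <= 0, with y_1 integer fixed at 1.
y = project_onto_block([3.0, 1.0], [lambda y: y[0] ** 2 + y[1] ** 2 - 4.0],
                       bounds=[(0.0, 5.0), (0.0, 5.0)], integer_idx={1})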

3.1 Cut and Breakpoint Generation


The procedure addCutsAndPoints(x̂, ŷ, B) adds linearization cuts to Ĉ and
breakpoints to B. It uses the procedure addActiveLinCut(ŷ_k) for adding lin-
earization cuts (9) at ŷ_k for all constraints g_kj which are active at ŷ_k and are
violated at x̂_k.
Let [p_ki,t, p_ki,t+1] be the breakpoint interval containing x̂_ki and define

[x'_ki, x''_ki] := [ŷ_ki, p_ki,t+1], if ŷ_ki < x̂_ki,   and   [x'_ki, x''_ki] := [p_ki,t, ŷ_ki], if ŷ_ki > x̂_ki.          (14)

Then the procedure addNonconvexLinCutsAndPoints(x̂_k, ŷ_k) is called,
which for some nonconvex constraints g_kj adds breakpoints, e.g. as in the AMP
algorithm [4], and cuts at the new breakpoints.

Algorithm 2. Cut and breakpoint generation

1: function addCutsAndPoints(x̂, ŷ, B)
2:   for k ∈ K do
3:     Ĉ_k ← addActiveLinCut(ŷ_k)
4:     (Ĉ_k, B_k) ← addNonconvexLinCutsAndPoints(x̂_k, ŷ_k)
5:   return (Ĉ, B)

3.2 OA-Start Heuristic


Algorithm 3 describes the procedure InitOa for computing an initial OA.

Algorithm 3. OA-Initialization
1: function initOa
2:   for k ∈ K do
3:     for d_k ∈ {c_k, 1, −1} do
4:       (x̂_k, Ĉ_k, S_k) ← oaSubSolve(d_k)
5:       L_k ← {x_k ∈ L_k : d_k^T x̂_k ≤ d_k^T x_k}
6:     [x'_k, x''_k] ← box(S_k)
7:     B_k ← {x̲_k, x'_k, x''_k, x̄_k}
8:   (x̂, Ĉ) ← addRnlpCuts(x', x'')
9:   return (x̂, Ĉ, L, B)

It uses the procedure oaSubSolve for (approximately) solving the sub-problems

x̂_k = argmin d_k^T x_k   s.t.   x_k ∈ X_k,          (15)

where d_k ∈ R^{n_k} is a search direction.



Note that 1 denotes a vector of ones and box(S) denotes the smallest interval
[x', x''] containing a set S. The procedure addRnlpCuts(x', x'') performs cutting-
plane iterations for solving the RNLP-OA

(ỹ, s) = argmin c^T x + γ‖s‖_1
         s.t.   Ax ≤ b + s,  x ∈ [x', x''],  s ≥ 0,          (16)
                h_kj(x_k) − q̌_kj(x_k) ≤ 0,  j ∈ J_k, k ∈ K,

where

q̌_kj(x_k) = σ_kj ∑_{i∈[n_k]} [ φ_ki(x'_ki) (x''_ki − x_ki)/(x''_ki − x'_ki) + φ_ki(x''_ki) (x_ki − x'_ki)/(x''_ki − x'_ki) ].          (17)

Furthermore, adjacentPoints(x̂, B) returns the smallest breakpoint interval


containing x̂.

3.3 Solving OA Sub-problems

Algorithm 4 describes the procedure oaSubSolve(d_k) for solving sub-problem
(15). It uses the procedure solveSubOa for solving the local OA master-problem

min d_k^T x_k   s.t.   x_k ∈ X̂_k.          (18)

Furthermore, it uses solveFixedNlp(x̂_k) for solving the following local NLP
problem with fixed integer variables and starting point x̂_k:

min c_k^T x_k   s.t.   x_k ∈ L_k ∩ G_k,  x_ki = x̂_ki, i ∈ I_k.

Note that Algorithm 4 uses temporary breakpoints B'_k, which are initialized
using initLocalBreakPoints.

Algorithm 4. OA sub-solver
1: function oaSubSolve(d_k)
2:   x̂_k ← solveSubOa(Ĉ_k)
3:   B'_k ← initLocalBreakPoints
4:   repeat
5:     ŷ_k ← project(x̂_k)
6:     (Ĉ_k, B'_k) ← addCutsAndPoints(x̂_k, ŷ_k, B'_k)
7:     x̂_k ← solveSubOa(Ĉ_k)
8:   until stopping criterion
9:   x*_k ← solveFixedNlp(x̂_k)
10:  S_k ← S_k ∪ {x*_k}
11:  return (x̂_k, Ĉ_k, S_k)

3.4 Fix-and-Refine
The procedure fixAndRefine, described in Algorithm 5, generates cuts and
breakpoints per block by solving a partly-fixed sub-problem, similarly as in
Algorithm 4. It uses the procedure solveFixOA for solving a MIP-OA problem
where the variables are fixed for all blocks except one:

min c_k^T x_k + γ‖s‖_1
s.t.  A_k x_k ≤ s + b − ∑_{m∈K\{k}} A_m x̃_m,          (19)
      x_k ∈ X̂_k,  s ≥ 0,
where x̃ is a solution candidate of (1).

Algorithm 5. Fixation-based cut and breakpoint generation

1: function FixAndRefine(x̃, B)
2:   for k ∈ K do
3:     x̂_k ← solveFixOA(x̃_k, Ĉ_k, B_k)
4:     (Ĉ_k, B_k) ← addCutsAndPoints(x̂_k, x̃_k, B_k)
5:     repeat
6:       x̂_k ← solveFixOA(x̃_k, Ĉ_k, B_k)
7:       ŷ_k ← project(x̂_k)
8:       (Ĉ_k, B_k) ← addCutsAndPoints(x̂_k, ŷ_k, B_k)
9:     until stopping criterion
10:  return (Ĉ, B)

3.5 OA-Based Local Search


Algorithm 6 describes the decomposition-based procedure oaLocalSearch for
computing a solution candidate x̃ ∈ X ∩ P of problem (1). It iteratively solves
the following restricted MIP master problem

x̂ = argmin c^T x   s.t.   x ∈ P ∩ Ĝ ∩ [x', x''],          (20)

regarding the target bounds x', x'', projects the point x̂ ∈ P onto X, and adds cuts
and breakpoints.
Finally, in order to compute a solution candidate x̃ ∈ X ∩ P, Algorithm 6 calls
the procedure solveFixedNlp(ŷ) for solving the following NLP master problem
with fixed integer variables:

min c^T x   s.t.   x ∈ P ∩ X,  x_ki = x̂_ki, i ∈ I_k, k ∈ K.          (21)

Note that the algorithm uses temporary breakpoints B' without changing the break-
points B.
Towards Multi-tree Methods for Large-Scale Global Optimization 505

Algorithm 6. OA-based local search

1: function oaLocalSearch(x̂, B)
2:   (x', x'') ← adjacentPoints(x̂, B)
3:   (Ĉ, x̂) ← addRnlpCuts(x', x'')
4:   repeat
5:     for k ∈ K do
6:       ŷ_k ← project(x̂_k)
7:       B'_k ← {x'_k, x''_k}
8:     (Ĉ, B') ← addCutsAndPoints(x̂, ŷ, B')
9:     x̂ ← solveOa(Ĉ, B')
10:  until stopping criterion
11:  x̃ ← solveFixedNlp(x̂)
12:  B ← B ∪ B'
13:  return (x̃, Ĉ, B)

3.6 Bound Tightening

The method tightenBounds(x*) performs a similar Optimization-Based
Bound Tightening (OBBT) strategy as proposed in [4]. It is based on minimizing
or maximizing some of the variables x_ki over the set {x ∈ X̂ : c^T x ≤ c^T x*} using
a similar approach as in Algorithm 4.

4 Numerical Experiments Using Decogo

Algorithm 1 is currently implemented as part of the parallel MINLP-solver


Decogo (DECOmposition-based Global Optimizer) [6]. It uses Pyomo [2], an
algebraic modelling language in Python, and two subsolvers: SCIP 5.0 [1] for
solving MIP problems and IPOPT 3.12.8 [10] for solving LP and NLP problems.

Fig. 1. Number of MIP solutions per problem size for convex MINLPs

A preliminary version of Algorithm 1 has been tested on 70 convex MINLP


instances from MINLPLib [9] with 11 to 2720 variables. The results show that the
number of MIP solutions (procedure solveOa of Algorithm 1) is independent
of problem size. Figure 1 presents this property of the algorithm. The average
number of MIP solutions is 2.37.

5 Conclusions

We introduced DECOA, a multi-tree decomposition-based method for solving
large-scale MINLP models (1), based on a DC-approach for computing a polyhedral OA.
Preliminary numerical experiments with Decogo show that the presented OA-method
solves convex MINLPs with a small number of MIP solutions. Many ideas of the
presented methods are new, and there is much room for improvement.
References
1. Gleixner, A., Eifler, L., Gally, T., Gamrath, G., Gemander, P., Gottwald, R.L.,
Hendel, G., Hojny, C., Koch, T., Miltenberger, M., Müller, B., Pfetsch, M.E.,
Puchert, C., Rehfeldt, D., Schlösser, F., Serrano, F., Shinano, Y., Viernickel,
J.M., Vigerske, S., Weninger, D., Witt, J.T., Witzig, J.: The SCIP Optimiza-
tion Suite 5.0. Technical report, www.optimization-online.org/DB HTML/2017/
12/6385.html (2017)
2. Hart, W.E., Laird, C.D., Watson, J.P., Woodruff, D.L., Hackebeil, G.A., Nicholson,
B.L., Siirola., J.D.: Pyomo–optimization modeling in python, vol. 67, second edn.
Springer Science & Business Media, Heidelberg (2017)
3. Lundell, A., Kronqvist, J., Westerlund, T.: The supporting hyperplane optimiza-
tion toolkit. www.optimization-online.org/DB HTML/2018/06/6680.html (2018)
4. Nagarajan, H., Lu, M., Wang, S., Bent, R., Sundar, K.: An adaptive, multivariate
partitioning algorithm for global optimization of nonconvex programs. J. Global
Optim. (2019)
5. Nowak, I.: Relaxation and Decomposition Methods for Mixed Integer Nonlinear
Programming. Birkhäuser (2005)
6. Nowak, I., Breitfeld, N., Hendrix, E.M.T., Njacheun-Njanzoua, G.: Decomposition-
based inner- and outer-refinement algorithms for global optimization. J. Global
Optim. 72(2), 305–321 (2018)
7. Tawarmalani, M., Sahinidis, N.: A polyhedral branch-and-cut approach to global
optimization. Math. Program. 225–249 (2005)
8. Vigerske, S.: Decomposition in multistage stochastic programming and a constraint
integer programming approach to mixed-integer nonlinear programming. Ph.D.
thesis, Humboldt-Universität zu Berlin (2012)
9. Vigerske, S.: MINLPLib. http://minlplib.org/index.html (2018)
10. Wächter, A., Biegler, L.T.: On the implementation of an interior-point filter line-
search algorithm for large-scale nonlinear programming. Math. Program. 106(1),
25–57 (2006)
Optimization under Uncertainty
Fuzzy Pareto Solutions in Fully Fuzzy
Multiobjective Linear Programming

Manuel Arana-Jiménez(B)

Department of Statistics and Operational Research, University of Cádiz,
Cádiz, Spain
manuel.arana@uca.es
Abstract. In this work, a new method is proposed for obtaining Pareto
solutions of a fully fuzzy multiobjective linear programming problem
with fuzzy partial orders and triangular fuzzy numbers, without ranking
functions, by means of solving a crisp multiobjective linear problem.
An algorithm to generate Pareto solutions is provided.

Keywords: Multiobjective optimization · Fully fuzzy linear programming · Fuzzy numbers

1 Introduction
Fuzzy linear programming is a field where many researchers model decision making
in a fuzzy environment [3,8,11,15,17,31,32]. It is usual that not all variables
and parameters in a fuzzy linear problem are assumed to be fuzzy numbers,
although it is interesting to provide a general model for linear problems where
all elements are fuzzy, called the fully fuzzy linear programming problem ((FFLP)
problem, for short). In this regard, Lotfi et al. [30] proposed a method to find the
fuzzy optimal solution of (FFLP) problems with equality constraints and symmetric fuzzy
numbers. Kumar et al. [26] proposed a new method for finding the fuzzy optimal
solutions of (FFLP) problems with equality constraints, using a ranking function
(see [3] and the bibliography therein). Najafi and Edalatpanah [35] made a correction
to the previous method. Khan et al. [24] studied (FFLP) problems with
inequalities, and they also use ranking functions to compare the objective function
values (see also [10,25]). Ezzati et al. [16] revisited the methods provided
by Lotfi et al. [30] and Kumar et al. [26] to propose a new method based on
a multiobjective programming problem with equality constraints. Liu and Gao
[29] have remarked on some limitations of the existing methods for solving (FFLP)
problems. As an application, Chakraborty et al. [12] locate fuzzy optimal solutions
in fuzzy transportation problems. Recently, Arana-Jiménez [5] has provided a
novel method to find fuzzy optimal (nondominated) solutions of (FFLP) problems
with inequality constraints, with triangular fuzzy numbers that are not necessarily
symmetric, via solving a crisp multiobjective linear programming problem.
This method does not require ranking functions.
Supported by the research project MTM2017-89577-P (MINECO, Spain) and UCA.
On the other hand, some models require the decision maker to face several objectives
at the same time. This type of problem includes multiobjective programming
problems, where two or more objectives have to be optimized (minimized
or maximized) and conflicts among the objectives must be dealt with. Pareto
optimality in multiobjective programming associates the concept of a solution
with a property that seems intuitively natural, and it is an important concept
in mathematical models, economics, engineering, decision theory, and optimal control,
among other fields (see [2]). So, when extending the idea of fuzzy linear programming
to fuzzy multiobjective linear programming, the objectives appearing in it
are again conflicting in nature. Therefore, a concept of Pareto solution is necessary too.
In such fuzzy multiobjective problems, Bharati et al. [9] comment that choosing the
best alternatives among those available requires ranking the fuzzy numbers used in the
model. They compare different methods using ranking functions, and propose
the concept of Pareto-optimal solution suggested by Jimenez and Bilbao [22] by
means of a ranking function. Some applications, for instance to DEA, can be found in [27].
In the present work, and as an extension of [5], we face the
challenge of studying a problem with fuzzy variables and parameters, that is, a
fully fuzzy multiobjective linear programming problem ((FFMLP) problem, for
short). In this regard, a new method is proposed to get fuzzy Pareto solutions,
and no ranking functions are used.
The structure is as follows. In the next section, we present notation, arithmetic
and partial orders on fuzzy numbers. Later, in Sect. 3, we formulate the fully
fuzzy multiobjective linear programming problem and provide an algorithm to
generate fuzzy Pareto solutions of (FFMLP) by means of solving an auxiliary
crisp multiobjective programming problem. Finally, we conclude the paper and
present future work. Due to length requirements for the congress,
proofs and examples are omitted and will be presented in an extended
version of this paper.

2 Preliminaries on Arithmetic and Partial Order on Fuzzy Numbers

A fuzzy set on Rn is a mapping u : Rn → [0, 1]. Each fuzzy set u has associated
a family of α-level sets, which are described as [u]α = {x ∈ Rn | u(x) ≥ α}
for any α ∈ (0, 1], and its support as supp(u) = {x ∈ Rn | u(x) > 0}. The
0-level of u is defined as the closure of supp(u), that is, [u]0 = cl(supp(u)). A
very useful type of fuzzy set to model parameters and variables is the fuzzy
number. Following Dubois and Prade [13,14], a fuzzy set u on R is said to be
a fuzzy number if (i) u is normal, that is, there exists x0 ∈ R such that u(x0) = 1,
(ii) u is upper semicontinuous, (iii) u is convex, and (iv) [u]0 is compact. FC denotes the
family of all fuzzy numbers. The α-levels of a fuzzy number can be represented
by means of real intervals, that is, [u]α = [u̲α, ūα] ∈ KC, with u̲α, ūα ∈ R, where KC
is the set of real compact intervals. There exist many families of fuzzy numbers
that have been applied to model uncertainty in different situations. Some of the
most popular are the L-R, triangular, trapezoidal, polygonal, gaussian, quasi-
quadric, exponential, and singleton fuzzy numbers. The reader is referred to
[7,21,36] for a complete description of these families and their representation
properties. Among them, we point out triangular fuzzy numbers, because of
their easy modeling and interpretation (see, for instance, [13,23,24,30,36]), and
whose definition is as follows.

Definition 1. A fuzzy number ã = (a−, â, a+) is said to be a triangular fuzzy
number (TFN for short) if its membership function is given by

           ⎧ (x − a−)/(â − a−),  if a− ≤ x ≤ â,
  ã(x) =   ⎨ (a+ − x)/(a+ − â),  if â < x ≤ a+,
           ⎩ 0,                  otherwise.

At the same time, given a triangular fuzzy number ã = (a− , â, a+ ), its α-
levels are formulated as

[ã]α = [a− + (â − a− )α, a+ − (a+ − â)α],

for all α ∈ [0, 1]. This means that triangular fuzzy numbers are well determined
by three real numbers a− ≤ â ≤ a+. A unique triangular fuzzy number is characterized
by means of the previous formulation of α-levels, as Goestschel
and Voxman [19] established. The set of all TFNs is denoted as TF.
The nonnegativity condition on some parameters and variables in many optimization
problems makes useful the following special consideration of TFNs. Let
ã be a fuzzy number. We say that ã is a nonnegative fuzzy number (nonpositive,
respectively) if ã0 ≥ 0 (ã0 ≤ 0, respectively). So, in the case that ã is a TFN,
then ã is nonnegative (nonpositive, respectively) if and only if a− ≥ 0 (a+ ≤ 0,
respectively).
Classical arithmetic operations on intervals are well known and can be found in
Moore [33,34] and Alefeld and Herzberger [1]. A natural extension of
these arithmetic operations to fuzzy numbers u, v ∈ FC is described
in [18,28], where the membership function of the operation u ∗ v, with ∗ ∈ {+, ·},
is defined by

  (u ∗ v)(z) = sup_{z = x ∗ y} min{u(x), v(y)}.                      (1)

Furthermore, the previous arithmetic operations can be provided by means
of their α-levels as follows (see [18, Theorem 2.6]). For any α ∈ [0, 1]:

  [u + v]α = [u̲α + v̲α, ūα + v̄α],                                    (2)
  [λu]α = [min{λu̲α, λūα}, max{λu̲α, λūα}],                            (3)
  [uv]α = [u]α × [v]α = [min{u̲αv̲α, u̲αv̄α, ūαv̲α, ūαv̄α}, max{u̲αv̲α, u̲αv̄α, ūαv̲α, ūαv̄α}].   (4)
TF is closed under addition and multiplication by a scalar. The above operations
(2) and (3) particularize straightforwardly to triangular fuzzy numbers
as follows. Given ã = (a−, â, a+), b̃ = (b−, b̂, b+) ∈ TF and λ ∈ R, then

  ã + b̃ = (a− + b−, â + b̂, a+ + b+),                                (5)

  λã = (λa−, λâ, λa+) if λ ≥ 0,     λã = (λa+, λâ, λa−) if λ < 0.     (6)

However, TF is not closed under the multiplication operation (4) (see, for
instance, the examples in [39]). To avoid this situation, it is usual to apply a
different multiplication operation between TFNs, such as those referenced in
[5,23,24,26], which can be considered as an approximation to the multiplication
given in (1). We provide the following definition for the multiplication:

  ãb̃ = ((ãb̃)−, (ãb̃)ˆ, (ãb̃)+)
      = (min{a−b−, a−b+, a+b−, a+b+}, âb̂, max{a−b−, a−b+, a+b−, a+b+}).   (7)

In the case that ã or b̃ is a nonnegative TFN, then the previous multiplication
is reduced (see, for instance, [23,26]). For instance, if b̃ is nonnegative, then

          ⎧ (a−b−, âb̂, a+b+),  if a− ≥ 0,
  ãb̃ =   ⎨ (a−b+, âb̂, a+b+),  if a− < 0, a+ ≥ 0,                     (8)
          ⎩ (a−b+, âb̂, a+b−),  if a+ < 0.

And if ã and b̃ are nonnegative, then

  ãb̃ = (a−b−, âb̂, a+b+).                                            (9)
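The triangular arithmetic (5)-(7) is straightforward to implement; the following minimal Python sketch (not part of the paper) stores a TFN as the triple (a−, â, a+) and uses the approximate product (7).

from dataclasses import dataclass

@dataclass(frozen=True)
class TFN:
    lo: float   # a^-
    mid: float  # a hat
    hi: float   # a^+

    def __add__(self, other):            # rule (5)
        return TFN(self.lo + other.lo, self.mid + other.mid, self.hi + other.hi)

    def scale(self, lam):                # rule (6)
        if lam >= 0:
            return TFN(lam * self.lo, lam * self.mid, lam * self.hi)
        return TFN(lam * self.hi, lam * self.mid, lam * self.lo)

    def __mul__(self, other):            # approximate product (7)
        prods = [self.lo * other.lo, self.lo * other.hi,
                 self.hi * other.lo, self.hi * other.hi]
        return TFN(min(prods), self.mid * other.mid, max(prods))

a = TFN(1.0, 2.0, 3.0)
b = TFN(-1.0, 0.5, 2.0)
print(a + b, a.scale(-2.0), a * b)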


To compare two fuzzy numbers, there exist several definitions based on interval
binary relations (see e.g., [20]) which provide partial orders on fuzzy sets (see,
e.g., [37,38]).

Definition 2. Given u, v ∈ FC, it is said that:

(i) u ≺ v if and only if u̲α < v̲α and ūα < v̄α, for all α ∈ [0, 1],
(ii) u ⪯ v if and only if u̲α ≤ v̲α and ūα ≤ v̄α, for all α ∈ [0, 1].

In a similar way, we define ≻, ⪰. In the case of TFNs, the previous definition can be
greatly simplified, as Arana-Jiménez and Blanco [6] have recently proved:

Theorem 1. Given ã = (a−, â, a+), b̃ = (b−, b̂, b+) ∈ TF, then:

(i) ã ≺ b̃ if and only if a− < b−, â < b̂ and a+ < b+.
(ii) ã ⪯ b̃ if and only if a− ≤ b−, â ≤ b̂ and a+ ≤ b+.

The relations ≻, ⪰ are obtained in a similar manner. Note that to say that ã is
nonnegative is equivalent to writing ã ⪰ 0̃ = (0, 0, 0).
3 Fully Fuzzy Multiobjective Linear Problem
Consider a fuzzy vector z̃ = (z̃1, . . . , z̃p) ∈ TF × · · · × TF = (TF)p, with p ∈
N. For the sake of simplicity, we write z̃ = (z̃i)_{i=1}^p. In the same manner, x =
(x1−, x̂1, x1+, . . . , xn−, x̂n, xn+) ∈ R3n can be written as x = (xj−, x̂j, xj+)_{j=1}^n, and
so on. Let us define the following formulation of a Fully Fuzzy Multiobjective
Linear Problem:

  (FFMLP)  Minimize    z̃ = (z̃i)_{i=1}^p = ( ∑_{j=1}^n c̃ij x̃j )_{i=1}^p
           subject to  ∑_{j=1}^n ãrj x̃j ⪯ b̃r,   r = 1, . . . , m,
                       x̃j ⪰ 0̃,   j = 1, . . . , n,

where z̃ is the fuzzy vector objective function, each c̃i = (c̃i1, . . . , c̃in) ∈ (TF)n is
the fuzzy vector with the coefficients of the ith component of the fuzzy vector
function, x̃ = (x̃1, . . . , x̃n) is the fuzzy vector of fuzzy decision variables,
and ãrj and b̃r are the fuzzy technical coefficients. Since we deal with (FFMLP)
without any kind of ranking function, it is necessary to define a fuzzy nondominated
solution concept, as follows.

Definition 3. Let x̄̃ be a feasible solution for (FFMLP). x̄̃ is said to be a fuzzy
Pareto solution of (FFMLP) if there does not exist a feasible solution x̃ for
(FFMLP) such that ∑_{j=1}^n c̃ij x̃j ⪯ ∑_{j=1}^n c̃ij x̄̃j for all i = 1, . . . , p, with
∑_{j=1}^n c̃i0j x̃j ≠ ∑_{j=1}^n c̃i0j x̄̃j for some i0 ∈ {1, . . . , p}.

Following the notation of TFNs, we have:

  z̃i = (zi−, ẑi, zi+),      i = 1, . . . , p,
  x̃j = (xj−, x̂j, xj+),      j = 1, . . . , n,
  c̃ij = (cij−, ĉij, cij+),   i = 1, . . . , p, j = 1, . . . , n,
  ãrj = (arj−, ârj, arj+),   r = 1, . . . , m, j = 1, . . . , n,
  b̃r = (br−, b̂r, br+),      r = 1, . . . , m.

Let us remark that x̃j is a nonnegative TFN, and so the multiplication rule is
given by (8). This means that c̃ij x̃j is computed by one of the three expressions
in (8), which only depends on c̃ij. Since the fuzzy coefficients c̃ij are known, then
the expressions of c̃ij x̃j = ((c̃ij x̃j)−, (c̃ij x̃j)ˆ, (c̃ij x̃j)+) are also known. The same
occurs for ãrj x̃j.

Problem (FFMLP) has the following associated crisp multiobjective problem:
  (CMLP)  Minimize    f(x) = ( ∑_{j=1}^n (c̃ij x̃j)−, ∑_{j=1}^n (c̃ij x̃j)ˆ, ∑_{j=1}^n (c̃ij x̃j)+ )_{i=1}^p
          subject to  ∑_{j=1}^n (ãrj x̃j)− ≤ br−,   r = 1, . . . , m,
                      ∑_{j=1}^n (ãrj x̃j)ˆ ≤ b̂r,    r = 1, . . . , m,
                      ∑_{j=1}^n (ãrj x̃j)+ ≤ br+,   r = 1, . . . , m,
                      xj− − x̂j ≤ 0,   j = 1, . . . , n,
                      x̂j − xj+ ≤ 0,   j = 1, . . . , n,
                      xj− ≥ 0, x̂j ≥ 0, xj+ ≥ 0,   j = 1, . . . , n.
Here f : R3n → R3p is a vector function of the variable x = (xj−, x̂j, xj+)_{j=1}^n ∈ R3n,
whose components fh are linear functions, h = 1, . . . , 3p. Since all constraints are represented
as linear inequalities on the variable x, (CMLP) is a multiobjective linear
programming problem. Recall that a feasible point x̄ ∈ R3n of (CMLP) is said
to be a Pareto solution if there does not exist another feasible point x such
that fh(x) ≤ fh(x̄), for all h = 1, . . . , 3p, and fh0(x) < fh0(x̄), for some h0 ∈
{1, . . . , 3p}. The relationship between the fuzzy Pareto solutions of (FFMLP)
and the Pareto solutions of (CMLP) is as follows.
Theorem 2. x̃ = (x̃1, . . . , x̃n) with x̃j = (xj−, x̂j, xj+) ∈ TF, j =
1, . . . , n, is a fuzzy Pareto solution of (FFMLP) if and only if x =
(x1−, x̂1, x1+, . . . , xn−, x̂n, xn+) ∈ R3n is a Pareto solution of (CMLP).
In the literature, we can find several methods to generate Pareto solutions
of a multiobjective linear problem (see [2] and the bibliography therein). Most
popular methods are based on scalarization. One of them is by means of related
weighting problems, whose formulation is as follows. Given (CMLP) and
w = (w1, . . . , w3p) ∈ R3p, with wi > 0 and ∑_{i=1}^{3p} wi = 1, we define the related weighting
problem as

  (CMLP)w  Minimize    ∑_{i=1}^{3p} wi fi(x)
           subject to  ∑_{j=1}^n (ãrj x̃j)− ≤ br−,   r = 1, . . . , m,
                       ∑_{j=1}^n (ãrj x̃j)ˆ ≤ b̂r,    r = 1, . . . , m,
                       ∑_{j=1}^n (ãrj x̃j)+ ≤ br+,   r = 1, . . . , m,
                       xj− − x̂j ≤ 0,   j = 1, . . . , n,
                       x̂j − xj+ ≤ 0,   j = 1, . . . , n,
                       xj− ≥ 0, x̂j ≥ 0, xj+ ≥ 0,   j = 1, . . . , n.
Theorem 3. Given w = (w1, . . . , w3p) ∈ R3p, wi > 0, ∑_{i=1}^{3p} wi = 1, if
x = (xj−, x̂j, xj+)_{j=1}^n ∈ R3n is an optimal solution of the weighting optimization
problem (CMLP)w, then x̃ = (x̃1, . . . , x̃n) with x̃j = (xj−, x̂j, xj+) ∈ TF,
j = 1, . . . , n, is a fuzzy Pareto solution of (FFMLP).

The previous result allows us to outline a method to get fuzzy Pareto solutions
for the (FFMLP) problem, which can be written via the following algorithm.

Algorithm

Step 1  Define k ∈ N and a set of weights
            SW = {ws = (ws1, . . . , ws3p) ∈ R3p : s = 1, . . . , k},
        with wsi > 0 for all i, and ∑_{i=1}^{3p} wsi = 1, for all s = 1, . . . , k.
        D ← ∅
        s ← 1
Step 2  Solve (CMLP)ws → xs = (xs,1−, x̂s,1, xs,1+, . . . , xs,n−, x̂s,n, xs,n+) ∈ R3n.
        If no solution, then go to Step 4.
Step 3  x̃s,j ← (xs,j−, x̂s,j, xs,j+), j = 1, . . . , n
        x̃s ← (x̃s,1, x̃s,2, . . . , x̃s,n)
        D ← D ∪ {x̃s}
Step 4  s ← s + 1
        If s ≤ k, then go to Step 2.
Step 5  End

Given k ∈ N and a set of weights SW, the application of the previous algorithm
yields a set D of fuzzy Pareto solutions for the (FFMLP) problem.
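As an illustration of the algorithm, the following sketch (hypothetical toy data, not from the paper) builds the crisp rows of (CMLP)w from triangular fuzzy data via the product rule (8), assuming nonnegative fuzzy decision variables, and solves one weighting problem with scipy's linprog; looping over a set of weights SW yields the set D.

import numpy as np
from scipy.optimize import linprog

def product_rows(tfn):
    """Which crisp variable (index into (x^-, x^, x^+)) and which coefficient
    each of the three components of (a~ x~) uses, for nonnegative x~, per (8)."""
    lo, mid, hi = tfn
    if lo >= 0:                          # a^- >= 0
        return (0, lo), (1, mid), (2, hi)
    if hi >= 0:                          # a^- < 0 <= a^+
        return (2, lo), (1, mid), (2, hi)
    return (2, lo), (1, mid), (0, hi)    # a^+ < 0

def crisp_rows(fuzzy_row, n):
    """Three crisp LP rows (the -, ^ and + components) of one fuzzy row."""
    rows = np.zeros((3, 3 * n))
    for j, tfn in enumerate(fuzzy_row):
        for comp, (idx, coef) in enumerate(product_rows(tfn)):
            rows[comp, 3 * j + idx] = coef
    return rows

# toy instance: minimize c~1 x~1 + c~2 x~2 subject to a~1 x~1 + a~2 x~2 <= b~
c_f = [(1, 2, 3), (2, 3, 4)]
a_f = [(-2, -1.5, -1), (-3, -2, -1)]
b_f = (-8, -6, -4)
n = 2
obj_rows = crisp_rows(c_f, n)            # the components f_1, f_2, f_3 of (CMLP)
A_ub = list(crisp_rows(a_f, n))          # the three crisp constraint rows
b_ub = list(b_f)
for j in range(n):                       # ordering constraints x_j^- <= x_j^ <= x_j^+
    r1 = np.zeros(3 * n); r1[3 * j] = 1; r1[3 * j + 1] = -1
    r2 = np.zeros(3 * n); r2[3 * j + 1] = 1; r2[3 * j + 2] = -1
    A_ub += [r1, r2]; b_ub += [0.0, 0.0]

w = np.array([1 / 3, 1 / 3, 1 / 3])      # one weight vector of the set SW
res = linprog(w @ obj_rows, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0, None)] * (3 * n), method="highs")
print(res.x.reshape(n, 3))               # row j is the fuzzy Pareto solution x~_j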

4 Conclusions

An equivalence between a (FFMLP) problem and a crisp multiobjective linear
programming problem is established, without loss of information and without
ranking functions. As a result, an algorithm to obtain fuzzy Pareto solutions for
a (FFMLP) problem has been provided.
As future work, the techniques presented will be extended to get fuzzy
Pareto solutions in interval and fuzzy fractional programming with applications
to economics, among others.

References
1. Alefeld, G., Herzberger, J.: Introduction to Interval Computations. Academic
Press, New York (1983)
2. Arana-Jiménez, M. (ed.): Optimiality Conditions in Vector Optimization. Bentham
Science Publishers Ltd, Bussum (2010)
3. Arana-Jiménez, M., Rufián-Lizana, A., Chalco-Cano, Y., Román-Flores, H.: Generalized
convexity in fuzzy vector optimization through a linear ordering. Inf. Sci.
312, 13–24 (2015)
4. Arana-Jiménez, M., Antczak, T.: The minimal criterion for the equivalence between
local and global optimal solutions in nondifferentiable optimization problem. Math.
Meth. Appl. Sci. 40, 6556–6564 (2017)
5. Arana-Jiménez, M.: Nondominated solutions in a fully fuzzy linear programming
problem. Math. Meth. Appl. Sci. 41, 7421–7430 (2018)
6. Arana-Jiménez, M., Blanco, V.: On a fully fuzzy framework for minimax mixed
integer linear programming. Comput. Ind. Eng. 128, 170–179 (2019)
7. Báez-Sánchez, A.D., Moretti, A.C., Rojas-Medar, M.A.: On polygonal fuzzy sets
and numbers. Fuzzy Sets Syst. 209, 54–65 (2012)
8. Bellman, R.E., Zadeh, L.A.: Decision making in a fuzzy environment. Manag. Sci.
17, 141–164 (1970)
9. Bharati, S.K., Abhishek, S.R., Singh: A computational algorithm for the solution
of fully fuzzy multi-objective linear programming problem. Int. J. Dynam. Control.
https://doi.org/10.1007/s40435-017-0355-1
10. Bhardwaj, B., Kumar, A.: A note on the paper a simplified novel technique for
solving fully fuzzy linear programming problems. J. Optim. Theory Appl. 163,
685–696 (2014)
11. Campos, L., Verdegay, J.L.: Linear programming problems and ranking of fuzzy
numbers. Fuzzy Set. Syst. 32, 1–11 (1989)
12. Chakraborty, D., Jana, D.K., Roy, T.K.: A new approach to solve fully fuzzy
transportation problem using triangular fuzzy number. Int. J. Oper. Res. 26, 153–
179 (2016)
13. Dubois, D., Prade, H.: Operations on fuzzy numbers. Ins. J. Syst. Sci. 9, 613–626
(1978)
14. Dubois, D., Prade, H.: Fuzzy Sets and Systems: Theory and Applications. Aca-
demic Press, New York (1980)
15. Ebrahimnejad, A., Nasseri, S.H., Lotfi, F.H., Soltanifar, M.: A primal-dual method
for linear programming problems with fuzzy variables. Eur. J. Ind. Eng. 4, 189–209
(2010)
16. Ezzati, R., Khorram, E., Enayati, R.: A new algorithm to solve fully fuzzy linear
programming problems using the MOLP problem. Appl. Math. Model. 39, 3183–
3193 (2015)
17. Ganesan, K., Veeramani, P.: Fuzzy linear programs with trapezoidal fuzzy num-
bers. Ann. Oper. Res. 143, 305–315 (2006)
18. Ghaznavi, M., Soleimani, F., Hoseinpoor, N.: Parametric analysis in fuzzy number
linear programming problems. Int. J. Fuzzy Syst. 18(3), 463–477 (2016)
19. Goestschel, R., Voxman, W.: Elementary fuzzy calculus. Fuzzy Sets Syst. 18, 31–43
(1986)
20. Guerra, M.L., Stefanini, L.: A comparison index for interval based on generalized
Hukuhara difference. Soft. Comput. 16, 1931–1943 (2012)
21. Hanss, M.: Applied Fuzzy Arithmetic. Springer, Stuttgart (2005)
22. Jimenez, M., Bilbao, A.: Pareto-optimal solutions in fuzzy multiobjective linear
programming. Fuzzy Sets Syst. 160, 2714–2721 (2009)
23. Kaufmann, A., Gupta, M.M.: Introduction to Fuzzy Arithmetic Theory and Appli-
cations. Van Nostrand Reinhold, New York (1985)
24. Khan, I.U., Ahmad, T., Maan, N.: A simplified novel technique for solving fully
fuzzy linear programming problems. J. Optim. Theory Appl. 159, 536–546 (2013)
25. Khan, I.U., Ahmad, T., Maan, N.: A reply to a note on the paper “A simplified
novel technique for solving fully fuzzy linear programming problems”. J. Optim.
Theory Appl. 173, 353–356 (2017)
26. Kumar, A., Kaur, J., Singh, P.: A new method for solving fully fuzzy linear pro-
gramming problems. Appl. Math. Model. 35, 817–823 (2011)
27. Mehlawat, M.K., Kumar, A., Yadav, S., Chen, W.: Data envelopment analysis
based fuzzy multi-objective portfolio selection model involving higher moments.
Inf. Sci. 460, 128–150 (2018)
28. Liu, B.: Uncertainty Theory. Springer-Verlag, Heidelberg (2015)
29. Liu, Q., Gao, X.: Fully fuzzy linear programming problem with triangular fuzzy
numbers. J. Comput. Theor. Nanosci. 13, 4036–4041 (2016)
30. Lotfi, F.H., Allahviranloo, T., Jondabeha, M.A., Alizadeh, L.: Solving a fully fuzzy
linear programming using lexicography method and fuzzy approximate solution.
Appl. Math. Modell. 3, 3151–3156 (2009)
31. Maleki, H.R., Tata, M., Mashinchi, M.: Linear programming with fuzzy variables.
Fuzzy Set. Syst. 109, 21–33 (2000)
32. Maleki, H.R.: Ranking functions and their applications to fuzzy linear program-
ming. Far East J. Math. Sci. 4, 283–301 (2002)
33. Moore, R.E.: Interval Analysis. Prentice-Hall, Englewood Cliffs (1966)
34. Moore, R.E.: Method and Applications of Interval Analysis. SIAM, Philadelphia
(1979)
35. Najafi, H.S., Edalatpanah, S.A.: A note on ”A new method for solving fully fuzzy
linear programming problems”. Appl. Math. Model. 37, 7865–7867 (2013)
36. Stefanini, L., Sorini, L., Guerra, M.L.: Parametric representation of fuzzy numbers
and application to fuzzy calculus. Fuzzy Sets Syst. 157(18), 2423–2455 (2006)
37. Stefanini, L., Arana-Jiménez, M.: Karush-Kuhn–Tucker conditions for interval and
fuzzy optimization in several variables under total and directional generalized dif-
ferentiability. Fuzzy Sets Syst. 262, 1–34 (2019)
38. Wu, H.C.: The optimality conditions for optimization problems with convex con-
straints and multiple fuzzy-valued objective functions. Fuzzy Optim. Decis. Making
8, 295–321 (2009)
39. Yasin Ali Md., Sultana, A., Khodadad Kha, A.F.M.: Comparison of fuzzy multipli-
cation operation on triangular fuzzy number. IOSR J. Math. 12(4), 35–41 (2016)
Minimax Inequalities and Variational
Equations

Maria Isabel Berenguer(B), Domingo Gámez, A. I. Garralda-Guillem,
and M. Ruiz Galán

Department of Applied Mathematics, University of Granada,
E.T.S. Ingeniería de Edificación, Granada, Spain
{maribel,domingo,agarral,mruizg}@ugr.es
Abstract. In this paper we study some weak conditions guaranteeing
the validity of several minimax inequalities and illustrate the possibilities
of such a tool for characterizing the existence of solutions of certain
variational equations.

Keywords: Minimax inequalities · Variational equations

1 Introduction
Minimax inequalities are normally associated with game theory. This was the
original motivation of von Neumann's work in 1928, but in the mathematical
literature, generalizations of von Neumann's results, called minimax theorems,
became objects of study in their own right. These generalizations focus on
various directions. Some of them pay attention to topological conditions, others
to the study of weak convexity conditions (see [23]). At the same time, minimax
inequalities have turned out to be a powerful tool in other fields: see, for
instance, [4,5,12,13,15,16,18,21,22].
In this work, Sect. 2, we illustrate the applicability of a minimax inequality
to analyse the existence of a solution for a quite general variational inequalities
system. After that, in Sect. 3, we explore some new generalizations of minimax
theorems with weak convexity conditions.
For the first aim we analyse a class of systems which arises in many situations.
To evoke one of them, let us recall that the study of variational equations with
constraints emerges naturally, among others, from the context of elliptic
boundary value problems, when their essential boundary conditions are treated
as constraints in their standard variational formulation. This leads one to its
variational formulation, which coincides with the system of variational equations:

  find x0 ∈ X such that   { z ∈ Z ⇒ f(z) = a(x0, z),
                            y ∈ Y ⇒ g(y) = b(x0, y) },
Partially supported by project MTM2016-80676-P (AEI/FEDER, UE) and by Junta
de Andalucı́a Grant FQM359.

for some Banach spaces X and Y , a closed vector subspace Z of X, some con-
tinuous bilinear forms a : X × X −→ R and b : X × Y −→ R, and f ∈ X ∗ and
g ∈ Y ∗ (“∗ ” stands for “topological dual space”): see the details, for instance, in
[10, Sect. 4.6.1]. In a more general way, we deal with the following problem: let X
be a real reflexive Banach space, N ≥ 1, and suppose that for each j = 1, . . . , N ,
Yj is a real Banach space, yj∗ ∈ Yj∗ , Cj is a convex subset of Yj with 0 ∈ Cj , and
aj : X × Yj −→ R is a bilinear form satisfying yj ∈ Cj ⇒ aj (·, yj ) ∈ X ∗ ; then

  find x0 ∈ X such that   ⎧ y1 ∈ C1 ⇒ y1∗(y1) ≤ a1(x0, y1)
                          ⎨ · · ·                                     (1)
                          ⎩ yN ∈ CN ⇒ yN∗(yN) ≤ aN(x0, yN)
This kind of variational system is so general that it includes certain mixed varia-
tional formulations associated with some elliptic problems, those in the so-called
Babuška–Brezzi theory (see, for instance [3,9] and some of its generalizations [7]).

2 Variational Equations for Reflexive Spaces
Now, we focus on deriving an extension of the Lax–Milgram theorem as well as
a characterization of the solvability of a system of variational equations, using
as a tool a minimax inequality.
To this aim we evoke the minimax inequality of von Neumann–Fan, a par-
ticular case of [6, Theorem 2]:
Theorem 1. Let us assume that X and Y are nonempty and convex subsets
of two real vector spaces. If in addition X is a topological compact space and
f : X × Y −→ R is concave and upper-semicontinuous on X and convex on Y ,
then
  max_{x∈X} inf_{y∈Y} f(x, y) = inf_{y∈Y} max_{x∈X} f(x, y).
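In the special case f(x, y) = xᵀAy with X and Y probability simplices (a compact, concave–convex setting covered by Theorem 1), both sides of the equality reduce to linear programs. The following small numerical check, with an arbitrary payoff matrix and scipy, is merely illustrative and is not part of the paper.

import numpy as np
from scipy.optimize import linprog

A = np.array([[3.0, -1.0, 2.0],
              [1.0,  2.0, -2.0]])
m, n = A.shape

# max_x min_y x^T A y  =  max v  s.t.  A^T x >= v*1, sum(x) = 1, x >= 0
c = np.r_[np.zeros(m), -1.0]                       # minimize -v
A_ub = np.c_[-A.T, np.ones(n)]                     # v*1 - A^T x <= 0
res_max = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=[np.r_[np.ones(m), 0.0]], b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)], method="highs")

# min_y max_x x^T A y  =  min w  s.t.  A y <= w*1, sum(y) = 1, y >= 0
c = np.r_[np.zeros(n), 1.0]
A_ub = np.c_[A, -np.ones(m)]
res_min = linprog(c, A_ub=A_ub, b_ub=np.zeros(m),
                  A_eq=[np.r_[np.ones(n), 0.0]], b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)], method="highs")

print(-res_max.fun, res_min.fun)   # both values coincide up to solver tolerance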

As a first application of minimax inequalities to variational equations we
show this version of the Lax–Milgram lemma that first appeared in [20]. As usual,
(·)+ denotes the positive part.
Theorem 2. Let E be a real reflexive Banach space, F be a real normed space,
y0∗ ∈ F∗, a : E × F −→ R be bilinear and let C be a nonempty convex subset of
F such that for all y ∈ C, a(·, y) ∈ E∗. Then

  there exists x0 ∈ E : y ∈ C ⇒ y0∗(y) ≤ a(x0, y)                     (2)

if, and only if,

  there exists α > 0 : y ∈ C ⇒ y0∗(y) ≤ α‖a(·, y)‖.                   (3)

Moreover, if one of these equivalent statements is satisfied and, for some y ∈ C,
we have that a(·, y) ≠ 0, then

  min{‖x0‖ : x0 ∈ E such that y ∈ C ⇒ y0∗(y) ≤ a(x0, y)} = sup_{y∈C, a(·,y)≠0} ( y0∗(y) / ‖a(·, y)‖ )_+.   (4)
Proof. The fact that (2) ⇒ (3) is straightforward. On the other hand, let α > 0
be such that (3) holds. Then we apply the minimax theorem, Theorem 1, to the
convex sets X := αBE, Y := C and the bifunction

  f(x, y) := a(x, y) − y0∗(y),   ((x, y) ∈ X × Y),

where BE stands for the closed unit ball of E. We arrive at

  max_{x∈αBE} inf_{y∈C} (a(x, y) − y0∗(y)) = inf_{y∈C} max_{x∈αBE} (a(x, y) − y0∗(y)).

But the right-hand side term of this equality is nonnegative, since

  inf_{y∈C} max_{x∈αBE} (a(x, y) − y0∗(y)) = inf_{y∈C} (α‖a(·, y)‖ − y0∗(y))

and according to (3). Therefore, the left-hand side term is also nonnegative, i.e.,
there exists x0 ∈ E – in fact, x0 ∈ αBE – such that

  y ∈ C ⇒ y0∗(y) ≤ a(x0, y).

To conclude, the fact that x0 can be chosen in αBE implies the stability
condition (4).

Now we show a more sophisticated application of the minimax theorem to
state a characterization of the solvability of the system of variational inequalities
(1). Let us first note that if such a system admits a solution x0 ∈ X, then, for
all (y1, . . . , yN) ∈ ∏_{j=1}^N Cj we have that (add the N equations and take γ := ‖x0‖)

  ∑_{j=1}^N yj∗(yj) ≤ γ ‖ ∑_{j=1}^N aj(·, yj) ‖.
The next result establishes that this necessary condition is also sufficient (see [8,
Theorem 2.2, Corollary 2.3]).

Theorem 3. Let E be a real reflexive Banach space, N ≥ 1, F1, . . . , FN be real
Banach spaces, and for each j = 1, . . . , N, let Cj be a convex subset of Fj with
0 ∈ Cj and aj : E × Fj −→ R be a bilinear form such that

  yj ∈ Cj ⇒ aj(·, yj) ∈ E∗.

Then, the following assertions are equivalent:

(i) For all y1∗ ∈ F1∗, . . . , yN∗ ∈ FN∗ there exists x0 ∈ E such that

  ⎧ y1 ∈ C1 ⇒ y1∗(y1) ≤ a1(x0, y1)
  ⎨ · · ·                                                             (5)
  ⎩ yN ∈ CN ⇒ yN∗(yN) ≤ aN(x0, yN)
(ii) There exists ρ > 0 such that

  (y1, . . . , yN) ∈ ∏_{j=1}^N Cj ⇒ ρ ∑_{j=1}^N ‖yj‖ ≤ ‖ ∑_{j=1}^N aj(·, yj) ‖.

Moreover, if one of these equivalent statements holds, then

  ρ ‖x0‖ ≤ max_{j=1,...,N} ‖yj∗‖.

The next example illustrates the applicability of Theorem 3:

Example 1. Given μ ∈ R and f ∈ Lp (0, 1), (1 < p < ∞), let us consider the
boundary value problem:

  −z″ + μz = f on (0, 1),   z(0) = 0, z(1) = 0.                       (6)

It is not difficult to prove that its mixed variational formulation is given as
follows: find (x0, z0) ∈ X × Z such that

  y ∈ Y ⇒ a(x0, y) + b(y, z0) = y0∗(y),
  w ∈ W ⇒ c(x0, w) + d(z0, w) = w0∗(w),

where

  X := W^{1,p}(0, 1),  Y := W^{1,q}(0, 1),  Z := Lp(0, 1),  W := Lq(0, 1),

the continuous bilinear forms a : X × Y −→ R, b : Y × Z −→ R, c : X × W −→ R
and d : Z × W −→ R are defined for each x ∈ X, y ∈ Y, z ∈ Z and w ∈ W as

  a(x, y) := ∫₀¹ x y,     b(y, z) := ∫₀¹ y′ z,
  c(x, w) := ∫₀¹ x′ w,    d(z, w) := −μ ∫₀¹ z w,

and the continuous linear forms y0∗ ∈ Y∗ and w0∗ ∈ W∗ are given by

  y0∗(y) := 0, (y ∈ Y)   and   w0∗(w) := −∫₀¹ f w, (w ∈ W).
Now Theorem 3 applies, since this system adopts the form of (5) with N = 2,
the reflexive space E := (X × Z)∗ , the Banach spaces F1 := Y , F2 := W , the
convex sets C1 := F1 , C2 := F2 , the continuous bilinear forms a1 : E ∗ ×F1 −→ R
and a2 : E ∗ × F2 −→ R defined at each (x, z) ∈ E ∗ , y ∈ F1 and w ∈ F2 as

a1 ((x, z), y) := a(x, y) + b(y, z)

and
a2 ((x, z), w) := c(x, w) + d(z, w),
and the continuous linear forms y1∗ := y0∗ and y2∗ := w0∗ . This mixed variational
formulation admits a unique solution (x, z) ∈ E = (X × Z)∗ as soon as |μ| < 0.5:
see [8, Example 2.4] for the details.

Let us mention that the boundary problem in the preceding example does
not fall into the scope of the Babuška–Brezzi theory, or even the more general
one of [7], where the analysis of Theorem 3 is done by means of independent
conditions of the involved bilinear forms.
The abstract uniformity in Theorem 3 allows us to state a Galerkin scheme
for the system of inequalities under study, when the convex sets Cj coincide with
the spaces Fj and the bilinear forms are continuous (see [8]).
To conclude, let us also emphasize that the numerical treatment of some
inverse problems related to the systems of variational equalities under consideration
has been developed in [14].

3 Minimax Inequality Under Weak Conditions

In the previous section we have shown some applications of the von Neumann–
Fan minimax inequality to variational analysis. Now we focus on the study of
minimax inequalities. With the aim of deriving more general applications, a wide
variety of this kind of results has appeared in the last decades. Most of them
involves a certain concept of convexity and some topological conditions.
Let us first recall that a minimax inequality is a result guaranteeing that,
under suitable hypotheses, a function f : X × Y −→ R, with X and Y nonempty
sets, satisfies the inequality

  inf_{y∈Y} sup_{x∈X} f(x, y) ≤ sup_{x∈X} inf_{y∈Y} f(x, y),           (7)

and therefore, the equality also holds, since the opposite inequality is always
true.
Note that when X is a compact topological space and f is upper semicon-
tinuous on X, the inequality (7) can be written as in Theorem 1.
Our starting point is the generalization of upper semicontinuity introduced
in [1, Definition 8]: if X is a nonempty topological space, Y is a nonempty
set, x0 ∈ X and inf_{y∈Y} sup_{x∈X} f(x, y) ∈ R, let us recall that f is infsup–
transfer upper semicontinuous in x0 if, for (x0, y0) ∈ X × Y, f(x0, y0) <
inf_{y∈Y} sup_{x∈X} f(x, y) implies that there exist y1 ∈ Y and a neighborhood U ⊂ X
of x0 such that f(x, y1) < inf_{y∈Y} sup_{x∈X} f(x, y) for all x ∈ U. In addition, f is
said to be infsup–transfer upper semicontinuous on X when it is so at each x0 ∈ X.
We also assume the following concept of convexity, introduced in [11] without a name.
Given nonempty sets X and Y, a function f : X × Y −→ R is said to be
infsup–convex on Y provided that

  m ≥ 1, t ∈ Δm, y1, . . . , ym ∈ Y  ⇒  inf_{y∈Y} sup_{x∈X} f(x, y) ≤ sup_{x∈X} ∑_{j=1}^m tj f(x, yj),

and supinf–concave on X when

  n ≥ 1, s ∈ Δn, x1, . . . , xn ∈ X  ⇒  inf_{y∈Y} ∑_{i=1}^n si f(xi, y) ≤ sup_{x∈X} inf_{y∈Y} f(x, y).

Assuming some topological hypotheses such that X is compact and f is infsup–transfer
upper semicontinuous on X, we characterize when f satisfies the minimax
inequality [2]. This characterization is given under hypotheses of supinf-concavity
of f over some finite subsets of Y and in terms of the infsup-convexity
of f on Y. In addition, such a characterization extends [19, Theorem 2.15] and
even the two-function minimax theorem [18, Corollary 3.11] when the two functions
coincide.
Secondly, let us mention another kind of inequality, which was the origin of
the study of equilibrium problems: for a function f : X × X −→ R with X
a nonempty set, a Fan minimax inequality is a result stating, under adequate
hypotheses, the validity of the inequality

  inf_{x∈X} f(x, x) ≤ sup_{x∈X} inf_{y∈X} f(x, y).

Let us notice that if X is a nonempty topological space, inf_{x∈X} f(x, x) ∈ R
and x0 ∈ X, then f is inf–diagonally transfer upper semicontinuous on the first
variable at x0 if, for (x0, y0) ∈ X × X, f(x0, y0) < inf_{x∈X} f(x, x) implies that
there exist y1 ∈ X and a neighborhood U ⊂ X of x0 such that f(x, y1) <
inf_{x∈X} f(x, x) for all x ∈ U. In addition, f is said to be inf–diagonally transfer
upper semicontinuous on the first variable when it is so at each x0 ∈ X.
Under the same type of conditions as above on finite subsets of X, we characterize
the Fan minimax inequality in terms of the inf–diagonal convexity of f.
Recall that f : X × X −→ R is said to be inf–diagonally convex on its second
variable ([17, Definition 2.1]) provided that

  m ≥ 1, t ∈ Δm, y1, . . . , ym ∈ X  ⇒  inf_{x∈X} f(x, x) ≤ sup_{x∈X} ∑_{j=1}^m tj f(x, yj).
4 Conclusions
In this work, we characterize the existence of a solution for a system of variational
equations in some reflexive Banach spaces in terms of the existence of a certain
scalar. Our main tool is a classical minimax theorem. The family of considered
systems includes those appearing in the Babuška–Brezzi theory. We illustrate our
results with a non-Hilbert example. In addition we mention sufficient conditions
of a convex and topological nature, which guarantee the validity of some minimax
inequalities.

References
1. Baye, M., Tian, G., Zhou, J.: Characterizations of the existence of equilibria in
games with discontinuous and nonquasiconcave payoffs. Rev. Econ. Stud. 60, 935–
948 (1993)
2. Berenguer, M. I., Gámez, D., Garralda-Guillem, A. I., Ruiz Galán, M.: A discrete
characterization of the solvability of equilibrium problems. Submitted for publica-
tion
3. Boffi, D., et al.: Mixed Finite Elements, Compatibility Conditions and Applica-
tions. Lecture Notes in Mathematics, vol. 1939. Springer-Verlag, Heidelberg (2008)
4. Borwein, J.M., Giladi, O.: Some remarks on convex analysis in topological groups.
J. Convex. Anal. 23, 313–332 (2016)
5. Deng, X.T., Li, Z.F., Wang, S.Y.: A minimax portfolio selection strategy with
equilibrium. Eur. J. Oper. Res. 166, 278–292 (2005)
6. Fan, K.: Minimax theorems. Proc. Nat. Acad. Sci. USA 39, 42–47 (1953)
7. Garralda Guillem, A.I., Ruiz Galán, M.: Mixed variational formulations in locally
convex spaces. J. Math. Anal. Appl. 414, 825–849 (2014)
8. Garralda Guillem, A.I., Ruiz Galán, M.: A minimax approach for the study of
systems of variational equations and related Galerkin schemes. J. Comput. Appl.
Math. 354, 103–111 (2019)
9. Gatica, G.N.: A Simple Introduction to the Mixed Finite Element Method Theory
and Applications. Springer Briefs in Mathematics. Springer, Cham (2014)
10. Grossmann, C., Roos, H.G., Stynes, M.: Numerical Treatment of Partial Differen-
tial Equations. Springer-Verlag, Heidelberg (2007)
11. Kassay, G., Kolumbán, J.: On a generalized sup-inf problem. J. Optim. Theory
Appl. 91, 651–670 (1996)
12. Khanh, P.Q., Quan, N.H.: General existence theorems, alternative theorems and
applications to minimax problems. Nonlinear Anal. 72, 2706–2715 (2010)
13. Kenmochi, N.: Monotonicity and compactness methods for nonlinear variational
inequalities, Handbook of differential equations: stationary partial differential
equations, IV, pp. 203–298. Elsevier/North-Holland, Amsterdam (2007)
14. Kunze, H., La Torre, D., Levere, K., Ruiz Galán, M.: Inverse problems via the “gen-
eralized collage theorem” for vector-valued Lax-Milgram-based variational prob-
lems. Math. Probl. Eng. 8 (2015). Article ID 764643
15. Polyanskiy, Y.: Saddle point in the minimax converse for channel coding. IEEE
Trans. Inf. Theory 59, 2576–2595 (2013)
16. Ricceri, B.: On a minimax theorem: an improvement, a new proof and an overview
of its applications. Minimax Theory Appl. 2, 99–152 (2017)
17. Ruiz Galán, M.: A concave-convex Ky Fan minimax inequality. Minimax Theory
Appl. 1, 11–124 (2016)
18. Ruiz Galán, M.: The Gordan theorem and its implications for minimax theory. J.
Nonlinear Convex Anal. 17, 2385–2405 (2016)
19. Ruiz Galán, M.: An intrinsic notion of convexity for minimax. J. Convex Anal. 21,
1105–1139 (2014)
20. Ruiz Galán, M.: A version of the Lax-Milgram theorem for locally convex spaces.
J. Convex Anal. 16, 993–1002 (2009)
21. Saint Raymond, J.: A new minimax theorem for linear operators. Minimax Theory
Appl. 3, 131–160 (2018)
22. Simons, S.: Minimax and Monotonicity. Lecture Notes in Mathematics, vol. 1693.
Springer-Verlag, Heidelberg (1998)
23. Simons, S.: Minimax theorems and their proofs. In: Minimax and applications.
Nonconvex Optimization and its Applications, pp. 1–23. Kluwer Academic Pub-
lishers, Dordrecht (1995)
Optimization of Real-Life Integrated Solar
Desalination Water Supply System
with Probability Functions

Bayrammyrat Myradov(&)

Ashgabat, Turkmenistan
bbmrdv@gmail.com

Abstract. The possibility of creating a sustainable living activity for a small
community in a remote area using a solar desalination unit is investigated via
stochastic programming problems. The Model of System consists of the development
of: (a) a stochastic simulation optimization model of the integrated solar
desalination water supply system, and (b) a financial–economic model of the System.
Three stochastic programming problems, namely (a) a classical stochastic optimization
problem with the objective function in mathematical expectation form, (b) a combined
chance constrained programming problem, and (c) a joint chance constrained
programming problem, are formulated, discussed and solved.
Essential distinctive peculiarities of the formulated chance constrained programming
problems are (a) the correlation between stochastic functions, and (b) logical
functions in the technological matrix. For the solution of the chance constrained
optimization problems, an approach based on a differential evolution algorithm along
with a Monte Carlo sampling technique for the chance constraint(s) evaluation is
proposed. The developed optimization models and the proposed optimization
approach can be used for making investment decisions under uncertainties.

Keywords: Stochastic programming  Global optimization  Solar desalination

1 Introduction

The United Nations General Assembly set Sustainable Development Goals in 2015
which cover economic, environmental and social development issues including but not
limited to water, energy and poverty.
One of the most important problems of modernity is the problem of reliable
water supply to consumers. This problem is especially critical in remote desert areas
with distributed low-power consumers. Underground saline water resources could be
the source of water for consumers in such areas, but it is necessary to desalinate such
saline water. An integrated solar desalination system can be one of the most suitable
technological and economic solutions for the production of potable water in remote
desert areas with distributed low-power consumers. So, it is necessary to investigate
the attractiveness of investment in an integrated solar desalination water supply system
in remote desert areas with distributed low-power consumers.

This requires the promotion of multidisciplinary research across different sectors
such as economics, water and energy. The core of this research should be modeling and
optimization of the real-life system.
The optimization model of the integrated solar desalination water supply system should
consider the stochasticity of the solar desalination unit's productivity. Therefore, the
optimization problems are formulated as Stochastic Programming Problems with
Probability Functions (SPP-PFs).
Thus, this paper is devoted to studying the attractiveness of investment in the integrated
solar desalination water supply system via formulating and solving SPP-PFs.

2 General Formulation of SPP-PF

In general, an SPP-PF can be formulated as:

  min_{x∈X} F(x, ω)                                                   (1)

where x = (x1, x2, . . . , xN)^T and X = {x | xn ∈ [xn^L, xn^U], n = 1, . . . , N}.

Subject to:

  P_j(x) ≝ P{ω ∈ Ω : a_{i,j} ≤ g̃_{i,j}(x, ω) ≤ b_{i,j}, i = 1, . . . , I_j};   P_j(x) ≥ α_j,  j = 1, . . . , J.   (2)

Here x ∈ R^N is the decision variable vector; ω : Ω → R^l is a random vector on the
probability space (Ω, F, P). The support of ω is defined as the smallest closed set N ⊆ R^l
having the property P(ω ∈ N) = 1; α_j ∈ (0, 1) are given probability levels; g̃_{i,j}(x, ω)
are functions; a_{i,j}, x_n^L and b_{i,j}, x_n^U are the lower and upper levels of g̃_{i,j}(x, ω) and
x, respectively; N is the number of decision variables; I_j is the number of chance constraints
joined in the jth individual chance constraint; J is the number of individual chance constraints;
P{·} is a probability function.
The objective function (1) could be deterministic or stochastic. Over more than
fifty years, studies of SPP-PFs have grown with theoretical developments. Research
on algorithms and applications of these problems has also been very active, especially
in recent years, because many practically significant optimization problems in business,
industry, engineering, economics, finance and other fields are naturally formulated
with probability functions. We refer to [1–6] as basic references on the theory, algorithms
and applications of SPP-PFs.

3 A Model of System

The Model of System consists of: (a) a simulation model of the water supply system (WSS),
and (b) a financial-economic model of the System (FEMS). The WSS and FEMS will be
considered in monthly (s) and annual (t) frames, respectively. The indexes s and t correspond to
the indexes i and j in Sect. 2.
3.1 Simulation Model of Integrated Solar Desalination Water Supply System

The real-life WSS requires the development of a simulation model.
The assumptions are: (1) there is some source of support of living activity (source of
revenue) in the small community; (2) the sources of water supply for consumers are:
(a) underground saline water, which should be desalinated, (b) collection of rainfall, and
(c) delivery of potable water by a water truck (WT) from an industrial zone; (3) there
are two kinds of consumers of potable water: Consumer A, for example, people; and
Consumer B, for example, a sheep farm; (4) only Consumer B can generate the revenue.
The decision variables are: x1, the area of the solar desalination unit (SDU), m²; x2, the
special area of rainfall collection not related to the decision variables x1, x3, x4, m²; x3,
the volume of the reservoir-storage, m³; x4, the initial number of Consumer B.
The WSS can produce potable water in the volume

  Q_{s,t}(x, ω) = (q_{s,t}(ω) + r_{s,t}(ω)) x1 + r_{s,t}(ω) ( ∑_{n=2}^N k_n x_n + k_A C_A ).     (3)

Here q_{s,t}(ω) is the productivity of the solar desalination unit; r_{s,t}(ω) is the rainfall;
C_A is the number of Consumer A; and k_n, k_A are the coefficients of collecting rainfall.
There are monthly correlations between q_{s,t}(ω) and r_{s,t}(ω).
The volume of potable water which can enter the reservoir-storage at demand b_{s,t} is

  z_{s,t}(x, ω) = max{0, min{Q_{s−1,t}(x, ω) − b_{s−1,t}, x3}},   s = 2, ..., 12, t = 1, ..., T.   (4)

Considering annual recurrence,

  z_{1,t}(x, ω) = max{0, min{Q_{12,t−1}(x, ω) − b_{12,t−1}, x3}},   t = 2, ..., T.   (5)

Initially, z_{1,1} = 0.
The number of runs by the water truck (WT) needed to satisfy the water demand is

  M_{s,t}(x, ω) = { 0,              if Q_{s,t}(x, ω) + z_{s,t}(x, ω) ≥ b_{s,t},
                    m_{s,t}(x, ω),  if Q_{s,t}(x, ω) + z_{s,t}(x, ω) < b_{s,t},       (6)

where m_{s,t}(x, ω) = ⌈(b_{s,t} − Q_{s,t}(x, ω) − z_{s,t}(x, ω))/V⌉ and V is the WT's tank volume.
Then the water remaining in the tank of the WT is

  zr_{s,t}(x, ω) = M_{s,t}(x, ω) · V − (b_{s,t} − Q_{s,t}(x, ω) − z_{s,t}(x, ω)),   s = 1, ..., 12, t = 1, ..., T.   (7)
Then the volume of water in the reservoir-storage will be

  Z_{s,t}(x, ω) = min{z_{s,t}(x, ω) + zr_{s,t}(x, ω), x3},   s = 1, ..., 12, t = 1, ..., T.   (8)


0 8 1
< dw in winter
bs;t ¼ Ds  @d A  C A þ ys;t  dsf in spring and fall A ð9Þ
:
dsa in summer

Here: Ds is number of days in each month; ys;t is quantity of Consumer B; d A ; dw; dsf
and dsa are demand per day per Consumer A; and per day per Consumer B in certain
season respectively.
The model (3)–(9) does not allow a deficiency of water, i.e. the demand will always be
satisfied thanks to the presence of the WT. This is very important in a remote arid area.
3.2 Financial–Economic Model of System

The initial investment (Inv0) in the projected system is

  Inv0 = c0 + ∑_{n=1}^N c_n x_n.                                      (10)

Here c0 is the initial investment not related to the x_n, and c_n are the initial
investments related to x_n.
The annual net cash flows (NCF_t(x, ω)) after taxes r_tax are paid are

  NCF_t(x, ω) = (1 − r_tax) · Rev_t(x, ω) − Exs_t(x, ω).               (11)

The annual revenue of the system (Rev_t(x, ω)) is defined as

  Rev_t(x, ω) = a4_{s,t}(ω) x4.                                        (12)

The annual expenditures (Exs_t(x, ω)) are defined as

  Exs_t(x, ω) = e_{0,t} + ∑_{n=1}^N e_{n,t} x_n + ∑_{s=1}^{12} e^{WT}_{s,t}(ω) M_{s,t}(x, ω).   (13)

Here e_{0,t} is the independent yearly expenditure; e_{n,t} is the yearly expenditure related
to x_n; e^{WT}_{s,t} is the expenditure related to runs of the WT.
The analysis and discussion of this model is the subject of a separate paper in economics.
4 The Optimization Problems

4.1 Case 1: Objective Function Is Mathematical Expectation

In this case an optimization problem with discount rate d_r is formulated as:

  max_{x∈X} F(x) = max_{x∈X} E_ω{NPV(x, ω)} = max_{x∈X} E_ω{ ∑_{t=1}^T NCF_t(x, ω)/(1 + d_r)^t } − Inv0   (14)

Subject to:

  0 ≤ x_n ≤ x_n^U.                                                    (15)

The objective function (14) is chosen as the maximization of the expected value E_ω of the
Net Present Value (NPV), which is the most widely applied decision-making criterion in
financial models.
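The sample-average form of (14) can be sketched as follows; simulate_ncf is a placeholder for the full WSS/FEMS simulation and the toy numbers are hypothetical.

import numpy as np

def expected_npv(x, simulate_ncf, inv0, dr, T, n_sims, rng):
    disc = (1.0 + dr) ** -np.arange(1, T + 1)          # discount factors 1/(1+dr)^t
    npvs = np.empty(n_sims)
    for s in range(n_sims):
        ncf = simulate_ncf(x, rng)                     # array of NCF_t, t = 1..T
        npvs[s] = disc @ ncf - inv0
    return npvs.mean(), npvs

# toy placeholder: i.i.d. noisy cash flows around 150 thousand $ per year
rng = np.random.default_rng(1)
mean_npv, _ = expected_npv(
    x=None, simulate_ncf=lambda x, r: 150e3 + 20e3 * r.standard_normal(10),
    inv0=600e3, dr=0.08, T=10, n_sims=10_000, rng=rng)
print(round(mean_npv))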
The problem (3)–(15) was solved by four algorithms: stochastic quasi-gradient
algorithm (SQA) [7], differential evolution (DE) [8], particle swarm optimization
(PSO) [9], random search (RS) [10].
Input data and relations are presented in Appendix.
The obtained results with the number of simulations equal to 10^4 are presented in Table 1.

Table 1. Results of optimization for Case 1.

Algorithm   x1     x2     x3      x4     F, $10^3
SQA         4788   2000   142.0   3500   921.5
PSO         5260   2000   143.2   3500   949.8
RS          5260   2000   159.6   3500   950.1
DE          5214   2000   145.6   3500   950.9

The results for the objective function (14) are close enough to each other, although the
structure of the solutions is different. This shows that such problems should be solved by
several algorithms and that a deep analysis of the results is then required.
The certain difference between the result of SQA and those of the other algorithms can be
explained by the flatness of the objective function in the zone of the extremum, by the
presence of ravines, and by many false local extrema caused by the empirical approximation
of the objective function (14). SQA also behaves "badly" if the objective function has such
"bad" peculiarities as flatness, ravines and false extrema. For instance, numerical experiments
with various initial values of x1 between 5000 m² and 6000 m² have shown that the objective
function (14) does not reduce its value. When the optimization process begins with x1 =
4500 m², it reaches 4788 m².
The best result is obtained by the DE algorithm (DEA). Therefore, some other
results obtained by using this algorithm will be discussed below.
The calculations show that the minimal value of (14) is $831077.8 and its maximal
value is $1076427.0. The spread of values of (14) is 29.5%, and 76.9% of the values of (14)
lie in [$930000; $1005000].
The numerical experiments also show that a surplus of water outside of the
optimized volume of the reservoir-storage (x3 = 145.6 m³) will occur in 22% of the simulations.
The maximal surplus of water in a month can reach 1092.54 m³. It is found that it
can happen in April with rainfall above 97 mm. The probability of such an event is
extremely small, 0.0008%. But it is still important to note that such an event can occur.
The probability of surpluses of water up to 60 m³ is 11.65%. The probability of
surpluses of water above 500 m³ is 0.77%.
Also, it is very important to know the number of runs by the WT needed to satisfy the demands.
These are minimally 47 runs and maximally 224 runs per year. The most probable
number of runs is from 100 to 180 runs per year, with probability 83%. In January the number of runs
can reach 34.

4.2 Case 2: Combined Chance Constrained Programming Problem

The frequent occurrence of a surplus of water outside of the reservoir-storage that was
demonstrated above requires a solution of this issue.
Therefore, the combined chance constraints

  P{ Q_{s,t}(x, ω) + Z_{s,t}(x, ω) − x3 ≤ b_{s,t}, s = 1, . . . , 12 } ≥ α,   t = 1, . . . , T   (16)

are added to the optimization problem (3)–(15).

Problem (3)–(16) is a chance constrained programming problem (CCPP) with the stochastic
objective function (14) in the form of the mathematical expectation E_ω. The chance constraints
(16) are combined: the constraints are joined by months and taken individually by years.
This problem becomes more complicated than a standard CCPP because (a) the
technological matrix (TM) cannot be presented in a classical form because of the
stochastic simulation model of the WSS, and (b) there is a correlation between the productivity
of the SDU and the rainfall.
Nowadays there is no algorithm for finding the global optimum of problem (3)–(16)
with a guarantee, even if such an optimum exists. Theoretically proving the existence of
such an extremum is also a very difficult task. If the problem has many extrema which are
close to each other, and moreover has many local, particularly false, extrema, then
solving such a problem is a challenge for the researcher. Therefore, obtaining a near-optimal
or even good-enough solution of complex problems such as problem (3)–(16) will often
be enough for decision makers. For example, in this case it is very important to
understand whether it is possible to create a business project attractive for investment or
not.
For the efficient solving of a CCPP, we should also correctly evaluate the chance constraint(s),
which is a quite difficult separate task.
The process of the chance constraint(s) evaluation may be presented as a Bernoulli
trial process.
Denote by n̂s the number of times the constraint g̃(x, ω) ≤ 0 is satisfied in a fixed number of
trials Ntr, and by α̂ an empirical estimate of α. Then:

  n̂s = ∑_{i=1}^{Ntr} 1_A(ω^i),                                        (17)

where 1_A is the characteristic function of the set A, the set of events for which the
constraint g̃(x, ω) ≤ 0 holds true. And

  α̂ = n̂s / Ntr.                                                      (18)

This means that P{g̃(x, ω) ≤ 0} ≥ α can be replaced by α̂ ≥ α because, according to the Law
of Large Numbers, α̂ tends to α with probability 1 as Ntr → ∞. Moreover, as is well
known, α̂ converges with probability 1 to α uniformly on any compact subset of X.
Denote by Nf the number of failures to satisfy the constraint g̃(x, ω) ≤ 0 in a
fixed number of trials Ntr. In order to have α̂ ≥ α, there is a need for:

  Nf ≤ ⌈Ntr(1 − α)⌉.                                                  (19)

This inequality is used for the evaluation of the chance constraint(s) by computer simulation
via a Monte Carlo sampling technique.
This approach to the chance constraint(s) evaluation has several advantages, such as
simplicity, independence from the distribution function of the random variables, and the capacity
to work with random variables in the TM and right-hand side (RHS) as well as with correlated
random variables. It is also computationally not so expensive, because we do not need
to simulate all Ntr trials continuously. If during the simulation process the number of failures
reaches Nf for a given Ntr and α, then the process of evaluation of the chance constraint(s)
is finished. Different approaches to the estimation of Ntr are well known. Some of
them are discussed in [11].
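The evaluation rule (17)-(19) can be sketched as follows; the constraint function and the sampler are placeholders, and the early exit stops as soon as the failure allowance of (19) is exceeded.

import math
import numpy as np

def chance_constraint_holds(g, sample_omega, x, alpha, n_trials, rng):
    """Empirical test of P{g(x, omega) <= 0} >= alpha following (17)-(19)."""
    max_fails = math.ceil(n_trials * (1.0 - alpha))   # failure allowance from (19)
    fails = 0
    for _ in range(n_trials):
        if g(x, sample_omega(rng)) > 0.0:             # this trial violates the constraint
            fails += 1
            if fails > max_fails:                     # allowance (19) exceeded: stop early
                return False
    return True

# toy illustration (hypothetical): g(x, omega) = omega - x with omega ~ N(0, 1),
# so P{g <= 0} = P{omega <= x}, which is about 0.977 for x = 2
rng = np.random.default_rng(0)
ok = chance_constraint_holds(g=lambda x, w: w - x,
                             sample_omega=lambda r: r.normal(),
                             x=2.0, alpha=0.95, n_trials=10_000, rng=rng)
print(ok)   # expected: True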
The classical DEA is often used for the solution of complex real-life optimization problems,
although it is still not theoretically proved that this algorithm has global convergence.
With the purpose of conducting a comparative analysis and providing additional
information to the decision maker, problem (3)–(16) has been solved for α = 0.95,
α = 0.99 and α = 0.999999 by the classical DEA. Some results of these calculations are
presented in Table 2. These results show the peculiarities introduced by the presence of the
chance constraint (16). Above all, these results convincingly show the importance of
considering the chance constraint (16).
Table 2. Basic results for Case 2.

Variables                                 α = 0.95   α = 0.99   α = 0.999999
x1                                        4799       4212       4083
x2                                        902        1064       535
x3                                        850        964        1535
x4                                        3500       3500       3500
Annual average of surplus water           8.8        1.2        0
Annual maximal amount of surplus water    1306.9     773.0      0
F, $10^3                                  774.3      663.7      420.6
Minimal F, $10^3                          360.8      248.6      14.7
Maximal F, $10^3                          1304.4     1253.0     1033.0

The results show that the minimal value of the objective function at α = 0.999999 is close to
zero, though its average value is more than 400 thousand US dollars.

4.3 Case 3: Joint Chance Constrained Programming Problem

In this case, instead of the combined chance constraint (16), the joint chance constraint

  P{ Q_{s,t}(x, ω) + Z_{s,t}(x, ω) − x3 ≤ b_{s,t}, s = 1, . . . , 12, t = 1, . . . , T } ≥ α   (20)

has been applied.
The problem (3)–(15), (20) is a joint chance constrained programming problem (JCCPP).
As a rule, this problem is more complicated than the previous one. But this problem is
more practice-relevant than the cases considered above, because satisfaction
of the chance constraint (20) is required over the whole time horizon T.
Using the classical DEA for solving this JCCPP was very time consuming and required
much computer time. At the same time, there is no guarantee that this algorithm has global
convergence. A modified DEA with global convergence in probability was proposed in
[11]. This algorithm was used for solving the problem (3)–(15), (20). It was proved in
[12] that DEA with modified mutation vectors based on a subspace clustering mutation
operator has global convergence.
Modified mutation vectors $v^{M}_{n,i_{NP},g}$ are generated as [11]:

$$
v^{M}_{n,i_{NP},g} =
\begin{cases}
v_{n,i_{NP},g} & \text{if } r_1 = \mathrm{rand}\!\left(1, \lfloor NP(1+R_b) \rfloor\right) \le NP \\
x_{n,R_b\mathrm{top},g} + \mathrm{rand}(0,1)\left(x_{n,b_1,g} - x_{n,b_2,g}\right) & \text{otherwise}
\end{cases}
\qquad (21)
$$

where NP is the population size; R_b is the increasing factor of the region of the random integer r_1; x_{n,R_btop,g} is an individual selected by random sampling from the top R_b fraction of the g-th population; x_{n,b_1,g} and x_{n,b_2,g} are two boundary individuals, each element of which is equal to the upper or lower boundary value with equal probability.
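A rough Python sketch of rule (21), under the reading that r_1 is a random integer drawn on {1, …, ⌊NP(1+R_b)⌋}, is given below; the handling of the top-R_b selection and of the boundary individuals is an assumption based on the description above, not the reference implementation of [11, 12].

import numpy as np

def modified_mutation(v_de, pop, fitness, rb, lower, upper, rng):
    # Keep the classical DE mutant v_de when r1 <= NP; otherwise combine a
    # top-Rb individual with a scaled difference of two boundary individuals.
    NP, dim = pop.shape
    r1 = rng.integers(1, int(np.floor(NP * (1.0 + rb))) + 1)
    if r1 <= NP:
        return v_de
    n_top = max(1, int(np.ceil(rb * NP)))                  # assumed size of the "top Rb" set
    x_top = pop[rng.choice(np.argsort(fitness)[:n_top])]   # sampled from the best individuals
    x_b1 = np.where(rng.random(dim) < 0.5, lower, upper)   # boundary individual 1
    x_b2 = np.where(rng.random(dim) < 0.5, lower, upper)   # boundary individual 2
    return x_top + rng.random() * (x_b1 - x_b2)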

Table 3. The last 5 best iterations of solving the JCCPP.

     x1     x2      x3      x4     α̂         F, $10³
1    4379   29.15   924.8   3500   0.99061   598.6
2    4950   40.58   1100    3500   0.99109   607.6
3    5024   0       1114    3500   0.99081   609.3
4    4973   0       1092    3500   0.99006   611.5
5    4998   0       1099    3500   0.99000   611.8

The last 5 results of solving the problem at α = 0.99 and N_tr = 10⁵ are presented in Table 3.
The classical approach has been used for the control parameters of the DEA: the population size NP, the mutation factor, and the crossover rate.
These results show that the structure of the decision variables changes considerably during the last 5 best iterations, although the objective function changes by only 2.2%. This means the decision maker should pay attention not only to the value of the objective function but also to the structure of the decision variables.

5 Conclusions

A model of the System, which consists of (a) a stochastic simulation model of the water supply system and (b) a financial-economic model of the System, is developed.
Three stochastic programming problems are formulated and solved.
The first of these problems is a classical stochastic programming problem with the objective function in mathematical expectation form. This problem was solved by four different algorithms because objective function (14) has flat regions, ravines, and many local, particularly false, extrema. This problem also helps to understand the necessity of formulating a chance constrained programming problem, because the numerical experiments show a surplus of water. At the same time, this problem has another practical importance: what can be done with the surplus of water; may it be another source of revenue? This is a task for another study.
But in this study we do not want to have a surplus of water with certain probabilities. Therefore, the second problem was formulated as a combined CCPP. This problem was solved by the classical DEA together with the Monte Carlo sampling technique for the evaluation of the chance constraint(s).
The third problem was formulated as a JCCPP with a toughened constraint on the surplus of water. This problem was solved by the modified DEA together with the Monte Carlo sampling technique for the evaluation of the chance constraint(s).
The results of the numerical experiments on these three problems demonstrate:
• Attractiveness of investment into business projects related to the organization of sustainable living activity for a small community in remote desert areas by using an integrated solar desalination system, which is important in accordance with the UN Sustainable Development Goals;
• Comparison of all three Cases makes it noticeable that the presence of chance constraints leads to a decreasing value of objective function (14). This means that Case 1 is more optimistic, and Case 3 is more realistic under the developed model;
• It is better to use several different algorithms, particularly derivative-free algorithms such as DEA, for solving real-life stochastic programming problems because, as a rule, the functions have flat regions, ravines, and many local, particularly false, extrema;
• The proposed modified DEA together with the Monte Carlo sampling technique for the evaluation of the chance constraint(s) is an effective approach for solving complex SPP-PFs such as combined and joint CCPPs;
• The developed model and the proposed approach for solving SPP-PFs are practice-relevant and useful to the decision maker.

Appendix

The codes have been written in FORTRAN 95. The calculations are executed on a PC with an Intel Core 2 Duo CPU 2.2 GHz and 3 GB RAM. The operating system is Windows 7 (32-bit). The input data and relations are as follows:
 
N = 4; T = 10; x^U_n = {7500 m², 2000 m², 2850 m², 3500}; k_n = {1.3, 1.0, 0.2, 1.1};
c₀ = $50000; d_r = 5%; r_tax = 2.5%; k^CB_s = {1, 1, 1.05, 1.1, 1.15, 1.2, 1.25, 1.3, 1.4, 1.3, 1.2, 1.1};
C^A = 10; c_n = {$105/m², $5/m², $0.25/L, $170}; d^A = 100 L; k^A = 100; V = 10 m³;
d_w = 3 L; d_sa = 4 L; d_s = 5 L; e_{n,t} = {$7.35/m², 0.15/m², $0.0125/L, $12.1 per Consumer B};
e_{0,t} = $2500 ∀t; e_{1,1} = $575; k^WT_t = {1.02, 1.01, 1.0, 1.01, 1.01, 1.01, 1.01, 1.01, 1.02}, s = 1,…,12, t = 2,…,T;
a4_{s,t} = 127.5·k^C_t ∀s; k^C_t = k^WT_t, t = 2,…,T; a4_{s,1} = 0 ∀s;
e^WT_{s,t} = k^WT_t · e^WT_{s,t−1}; y_{1,1} = x₄; y_{s,t} = k^CB_s · y_{s−1,t}, s = 2,…,12;
SRC_s = {0.92, 0.9, 0.87, 0.85, 0.81, 0, 0, 0, 0, 0.87, 0.91, 0.93};
D_s = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}.

The monthly productivity of the solar desalination unit is represented by a normal distribution: q_{s,t}(ω) ~ N(q̄_s, σ_s), with (q̄_s, σ_s) = {(8.73, 2.36); (8.28, 1.35); (18.52, 2.91); (63.1, 3.23); (122.8, 3.03); (139.4, 4.53); (167.6, 2.7); (113.1, 3.82); (82.67, 1.72); (44.68, 1.66); (19.63, 0.91); (10.46, 1.21)} L/m² per month.
The monthly rainfall in the local area, r_{s,t}(ω), is simulated using probabilistic modeling based on the study [13]. The monthly Spearman's rank correlations (SRCs) between q_{s,t}(ω) and r_{s,t}(ω) are determined on the basis of experimental data on the monthly productivity of the solar desalination unit and meteorological data on rainfall. The dependent bivariate random variables q_{s,t}(ω) and r_{s,t}(ω) are simulated knowing the SRCs [14].

References
1. Ermoliev, Y.: Methods of Stochastic Programming. Nauka, Moscow (1976). (In Russian)
2. Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht, Boston (1995)
3. Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer-Verlag, New York (1997)
4. Kall, P., Mayer, J.: Stochastic Linear Programming: Models, Theory, and Computation.
Springer, New York (2005)
5. Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on Stochastic Programming:
Modeling and Theory. MPS-SIAM Series on Optimization, Philadelphia (2009)
6. Kibzun, A.I., Kan, Yu.S.: Stochastic Programming Problems with Probability Criteria. Fizmatlit, Moscow (2009). (in Russian)
7. Mirzoahmedov, F., Uryasyev, S.P.: Adaptive stepsize rule for the stochastic optimization
algorithm. Comput. Math. Math. Phys. 23(6), 1314–1325 (1983). (in Russian)
8. Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to
Global Optimization. Springer, Berlin (2005)
9. Chen, X., Li, Y.: On convergence and parameter selection of an improved particle swarm
optimization. Int. J. Control Autom. Syst. 6(4), 559–770 (2008)
10. Zhigljavsky, A., Zilinskas, A.: Stochastic Global Optimization. Springer, Berlin (2008)
11. Myradov, B.: Optimization of stochastic problems with probability function via differential
evolution, http://www.optimization-online.org/DB_HTML/2017/11/6341.html
12. Hu, Z., Xiong, S., Wang, X., Su, Q., Liu, M., Chen, Z.: Subspace clustering mutation
operator for developing convergent differential evolution algorithm. Math. Probl. Eng.
Article ID 154626, 1–18 (2014)
13. Seyitkurbanov, S., Fateeva, G.S., Ryhlov, A.B., Sergeev, V.A.: Collection of atmospheric
precipitation in autonomous heliocomplex. Probl. Desert Dev. 4, 74–76 (1984)
14. Devroye, L.: Non-uniform Random Variate Generation. Springer, New York (1986)
Social Strategy of Particles in Optimization
Problems

Bożena Borowska(&)

Institute of Information Technology, Lodz University of Technology,


Wólczańska 215, 90-924 Lodz, Poland
bozena.borowska@p.lodz.pl

Abstract. The article presents a particle swarm optimization algorithm (SoPSO) in which a novel, effective acceleration coefficient is proposed. In the presented approach, the proposed acceleration coefficient is a nonlinear function that depends on the performance of the algorithm and is affected by the number of iterations. This strategy allows the search direction to be specified more precisely and the velocity with which the algorithm travels in the search space to be better controlled, in order to discover the best, optimal solution of the considered problem. The presented strategy was examined on a collection of benchmark functions described in the literature. The test results were compared with those achieved by the improved IPSO algorithm and the standard PSO (SPSO).

Keywords: Swarm intelligence · Social strategy · Particle swarm optimization · Optimization

1 Introduction

The PSO (Particle Swarm Optimization) is a stochastic computational method used in optimization. It was introduced in 1995 by Kennedy and Eberhart [1] as an alternative to existing evolutionary optimization methods such as genetic algorithms (GA) [2] or evolution strategies (ES). The inspiration for this method was the natural environment of insects and their social behavior [3]. Like other evolutionary algorithms, the PSO method is based on a population of individuals, which is called a swarm of particles. In an optimization task, each particle represents a point in the search space and can be one of the possible acceptable solutions of the considered optimization problem. Particles, imitating the behavior of animals, move within the search space to look for the optimal solution [4].
Simplicity, easy implementation (without any encoding process or evolutionary operators like crossover or mutation) and fast convergence to a reasonably good solution [5, 6] provide its wide application in theoretical and practical optimization problems such as fuzzy systems, pattern classification, artificial neural networks or parameter selection [7–13]. However, next to many benefits, the PSO method can encounter some problems connected with trapping in local optima, premature convergence or stopping the optimization near the optimum. Several studies have been undertaken to analyze the characteristics of this method in order to overcome


difficulties and improve the performance of SPSO (Standard Particle Swarm Optimization). Eberhart and Shi [14] investigated resources, applications and developments related to PSO and proposed a modification of the SPSO by incorporating an inertia weight factor [15–17]. Passive congregation introduced into particle swarm optimization was presented by He et al. [18]. A novel PSO method, in which a repair procedure was applied, has been described by Borowska [19]. Modifications of the velocity updating equation and of the basic control parameters of PSO have been described in [20–22]. When reaching the optimum is too time-consuming an operation, Fan [23] suggests adding an effective "speeding-up" procedure that relies on inserting an adaptively scaled parameter into the traditional PSO method. According to the author, this strategy increases the convergence rate and provides a satisfying solution with a smaller number of objective function evaluations. A division of the population into sub-swarms was recommended by Jiang et al. [24]. New effective optimization algorithms obtained by incorporating some decision criteria into PSO were described in [25]. Robinson et al. [26] developed hybrid algorithms as combinations of GA and PSO. Garg [27] proposed a hybrid PSO-GA but, in contrast to Robinson et al. [26], used GA only for some selected particles. PSO with genetic operators for randomly selected particles was also proposed by Dimopoulos [28]. A mutation operator and a hierarchical self-organizing particle swarm optimizer with the TVAC (time-varying acceleration coefficients) concept were introduced by Ratnaweera et al. [29]. PSO combined with GA or its genetic operators can also be found in [30–33]. A particle swarm optimization algorithm with the Nelder–Mead search algorithm was developed by Abdelhalim et al. [34]. To improve the performance of PSO, Wang et al. [35] proposed a hybrid algorithm consisting of PSO and SA (simulated annealing). Modified PSO algorithms with the simulated annealing method, mutation, elements of quantum behavior and co-evolution theory were applied by Liu and Zhou [36]. Another approach involving the use of fuzzy logic has been studied by Shi and Eberhart [37]. The use of fuzzy systems can also be found in [38, 39].
The PSO methods proposed in the literature do not always perform well and get the expected results. The troubles with performance occur especially for complex, multimodal problems containing many local optima. One of the reasons is poor supervision of particle behavior and, associated with this, the problem of maintaining the right diversity of the particle swarm. Important factors that affect particle behavior and diversity are the acceleration coefficients. Their proper values can improve the performance of PSO, whereas unsuitable ones can lead to a lack of particle diversity and cause premature convergence.
This article presents a social particle swarm optimization method (SoPSO) with an efficient strategy for the acceleration coefficient. In the proposed approach, the acceleration coefficient is a nonlinear function that depends on the performance of the algorithm and the number of iterations. This strategy allows the search direction and the velocity with which the algorithm travels in the search space to be specified more precisely, in order to discover the optimal solution of the considered problem. The proposed strategy was examined on a collection of nonlinear functions. Next, the test results were compared with those achieved by the standard PSO and the improved IPSO algorithm proposed by Jiang et al. [24].

2 The Standard PSO Model

The inspiration for the standard PSO model, created by Kennedy and Eberhart [1, 2], was the natural environment of a swarm of insects such as bees [3]. In practice, the PSO method works on a population (called a "swarm") of random individuals, each of which is represented as a point of the search space. Members of the population are called "particles". The particles search the space to find the optimum. In an n-dimensional search space, each particle is represented by two n-dimensional vectors Xj = (xj1, xj2, …, xjn) and Vj = (vj1, vj2, …, vjn). Vector Xj describes the location of the particle in the search space, whereas vector Vj represents the velocity of the particle, according to which the particle changes its position. Besides location and velocity, particles also possess memory. They remember their best previously visited locations as Pbj = (pbj1, pbj2, …, pbjn). The best location among all the particles of the swarm is remembered as Gb = (gb1, gb2, …, gbn). In each iteration, the velocities according to which the particles search the space are updated by the following formula:

$$ V_j(l+1) = w \cdot V_j(l) + c_1 r_1 \left(Pb_j - X_j(l)\right) + c_2 r_2 \left(Gb - X_j(l)\right) \qquad (1) $$

New particle locations are calculated according to the equation:

$$ X_j(l+1) = X_j(l) + V_j(l+1) \qquad (2) $$

where:
w Parameter called inertia weight,
Pbj The best location (of the particle j),
Gb The best location (in whole swarm),
r1,r2 Random values generated from (0,1),
c1, c2 Acceleration coefficients

The quality of the particles is assessed by means of an evaluation function of the optimization problem.
As shown in formula (1), the behaviour of the particles depends on three components: their previous velocity, a "cognitive" part and a "social" part [6]. The first part of formula (1) decides how much the current velocity of the particle is influenced by its previous velocity. The individual thinking of each particle is represented by the second part of (1), called the "cognitive" component. The role of this component is to encourage the particles to roam towards the regions in which they have found their best locations. The third part, called the "social" component, expresses the cooperation of the particles. The role of the social component is to encourage the particles to roam towards the region with the best location found in the swarm.
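For reference, a minimal NumPy sketch of one SPSO iteration implementing Eqs. (1)–(2) is shown below; the function and variable names are illustrative and do not reproduce the exact experimental setup used later.

import numpy as np

def pso_step(X, V, Pb, pb_val, gb, w, c1, c2, f, rng):
    # One SPSO iteration: X, V are (S, n) position/velocity arrays, Pb/pb_val the
    # personal bests and their values, gb the global best position, f the objective.
    S, n = X.shape
    r1, r2 = rng.random((S, n)), rng.random((S, n))
    V = w * V + c1 * r1 * (Pb - X) + c2 * r2 * (gb - X)    # velocity update, Eq. (1)
    X = X + V                                              # position update, Eq. (2)
    val = np.apply_along_axis(f, 1, X)                     # evaluate the swarm
    better = val < pb_val
    Pb[better], pb_val[better] = X[better], val[better]    # refresh personal bests
    gb = Pb[np.argmin(pb_val)]                             # refresh global best
    return X, V, Pb, pb_val, gb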

3 SoPSO Method

In the presented SoPSO method, particle swarm optimization with an efficient strategy for the acceleration coefficient is proposed. This is a nonlinear strategy. The novelty of this approach consists in the introduction of a new method for calculating the acceleration coefficients. In each iteration, while searching the space of possible solutions, the particles constantly change their locations. Each of their new positions is assessed, and the particles with the minimal and maximal fitness are remembered. On their basis, and based on the number of iterations, the new acceleration coefficient is calculated. According to this strategy, the acceleration coefficients change dynamically. This strategy allows the search direction and the velocity with which the algorithm travels in the search space to be specified more precisely, in order to discover the optimal solution of the considered problem. This strategy is described by the following Eqs. (3, 4):

$$ Par = g \cdot \left| f_{\min} - f\!\left(X_j(l)\right) \right| \cdot 100 \cdot Iter \,/\, \left( f_{\max} \cdot Iter\_max \right) \qquad (3) $$

$$ V_j(l+1) = w \cdot V_j(l) + c_1 r_1 \left(Pb_j - X_j(l)\right) + r_2 \cdot Par \left(Gb - X_j(l)\right) \qquad (4) $$

where Iter and Iter_max represent the current and maximal number of iterations, f_max (f_min) denotes the current maximal (minimal) fitness, and g is a random real number between 0 and 1.
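A short Python sketch of this velocity update, following the reading of Eqs. (3)–(4) given above (in particular the absolute value and the placement of the factor 100 are assumptions of that reading), is:

import numpy as np

def sopso_velocity(Xj, Vj, Pbj, gb, f_Xj, f_min, f_max, it, it_max, w, c1, rng):
    # The social acceleration coefficient Par of Eq. (3) replaces c2 in Eq. (4);
    # g is a random number in (0, 1), f_Xj the current fitness of particle j.
    g = rng.random()
    par = g * abs(f_min - f_Xj) * 100.0 * it / (f_max * it_max)     # Eq. (3)
    r1, r2 = rng.random(Xj.shape), rng.random(Xj.shape)
    return w * Vj + c1 * r1 * (Pbj - Xj) + r2 * par * (gb - Xj)     # Eq. (4)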

4 Test Results

The proposed method was tested on a set of ten benchmark functions, four of which are presented in this article and included in Table 1. The results of the tests were compared with the performance of the standard PSO and the improved IPSO described in [24].

Table 1. Optimization test functions.

Function     Formula                                                                                   Minimum   Range
Rosenbrock   $f_3 = \sum_{i=1}^{n-1} \left[ 100\left(x_{i+1} - x_i^2\right)^2 + \left(x_i - 1\right)^2 \right]$        0   (−30, 30)
Griewank     $f_4 = \frac{1}{4000}\sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n} \cos\!\left(\frac{x_i}{\sqrt{i}}\right) + 1$   0   (−600, 600)
Rastrigin    $f_5 = \sum_{i=1}^{n} \left[ x_i^2 - 10\cos(2\pi x_i) + 10 \right]$                                        0   (−5.12, 5.12)
Zakharov     $f_6 = \sum_{i=1}^{n} x_i^2 + \left(\sum_{i=1}^{n} \tfrac{i}{2} x_i\right)^2 + \left(\sum_{i=1}^{n} \tfrac{i}{2} x_i\right)^4$   0   (−10, 10)
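For completeness, the four test functions of Table 1 can be written, for instance, as the following Python functions (a straightforward transcription of the formulas above):

import numpy as np

def rosenbrock(x):                                   # f3
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (x[:-1] - 1.0) ** 2)

def griewank(x):                                     # f4
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0

def rastrigin(x):                                    # f5
    x = np.asarray(x, dtype=float)
    return np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)

def zakharov(x):                                     # f6
    x = np.asarray(x, dtype=float)
    s = np.sum(0.5 * np.arange(1, x.size + 1) * x)
    return np.sum(x ** 2) + s ** 2 + s ** 4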

The algorithms worked with a linearly decreasing inertia weight, which changed from w = 0.9 to 0.4. The experiments were run for three different dimension sizes n = 10, 20 and 30. The swarm consisted of S = 20, 40 or 80 particles. The acceleration coefficient used in the computations was c1 = 2.0. The total number of iterations depended on the dimension and was equal to 1000, 1500, and 2000, respectively. For each experiment, 50 runs were conducted.
The exemplary results for the Griewank, Rastrigin, Zakharov and Rosenbrock functions for S = 20, 40 and 80 particles in the swarm are shown in Tables 2–5. All settings regarding the IPSO algorithm were adopted from Jiang et al. [24].

Table 2. Mean function values of Griewank function (for 20, 40, 80 particles).
Population size Dimension Number of iterations Average
PSO IPSO SoPSO
20 10 1000 9.20e−002 7.84e−002 6.15e−002
20 1500 3.17e−002 2.36e−002 1.79e−002
30 2000 4.82e−002 1.65e−002 1.12e−002
40 10 1000 7.62e−002 6.48e−002 4.63e−002
20 1500 2.27e−002 1.82e−002 1.25e−002
30 2000 1.53e−002 1.51e−002 9.31e−003
80 10 1000 6.58e−002 5.94e−002 4.79e−002
20 1500 2.22e−002 9.10e−003 5.38e−003
30 2000 1.21e−002 4.00e−004 5.30e−005

Table 3. Mean function values of Rastrigin function (for 20, 40, 80 particles).
Population size Dimension Number of iterations Average
PSO IPSO SoPSO
20 10 1000 5.21e+000 3.29e+000 3.31e+000
20 1500 2.28e+001 1.64e+001 1.47e+001
30 2000 4.93e+001 3.50e+001 3.61e+001
40 10 1000 3.57e+000 2.62e+000 2.08e+000
20 1500 1.73e+001 1.49e+001 1.03e+001
30 2000 3.89e+001 2.78e+001 2.75e+001
80 10 1000 2.38e+000 1.71e+000 1.25e+000
20 1500 1.29e+001 7.67e+000 6.03e+000
30 2000 3.00e+001 1.39e+001 9.89e+000

The exemplary charts representing the mean best fitness in the following iterations for the SoPSO, IPSO and SPSO algorithms are depicted in Figs. 1–4.

Table 4. Mean function values of Rosenbrock function (for 20, 40, 80 particles).
Population size Dimension Number of iterations Average
PSO IPSO SoPSO
20 10 1000 4.26e+001 1.05e+001 9.67e+000
20 1500 8.73e+001 7.57e+001 7.34e+001
30 2000 1.33e+002 9.98e+001 1.02e+002
40 10 1000 2.44e+001 1.24e+000 8.12e−001
20 1500 4.77e+001 8.73e+000 6.49e+000
30 2000 6.66e+001 1.47e+001 1.28e+001
80 10 1000 1.53e+001 1.92e−001 1.43e−001
20 1500 4.06e+001 1.58e+000 1.21e+000
30 2000 6.34e+001 1.54e+000 1.15e+000

Table 5. Mean function values of Zakharov function (for 20, 40, 80 particles).
Population Dimension Number of Average
size iterations PSO IPSO SoPSO
20 10 1000 1.3499e−003 1.1410e−003 1.0026e−004
20 1500 2.3139e+002 2.1745e+002 1.8914e+002
30 2000 5.7685e+002 5.4051e+002 5.1132e+002
40 10 1000 1.4368e−005 1.2513e−005 1.1907e−006
20 1500 1.7789e+002 1.4526e+002 1.3215e+002
30 2000 3.9224e+002 3.3972e+002 3.0422e+002
80 10 1000 2.4349e−008 2.1905e−008 1.9418e−009
20 1500 8.7169e+001 4.9795e+001 2.7951e+001
30 2000 2.3817e+002 1.7842e+002 1.5376e+002

Fig. 1. The mean best fitness for Zakharov function (for 80 particles and 30 dimensions).

Preliminary outcomes of the experimental tests on the benchmark functions indicated that the SoPSO method with the proposed strategy could achieve superior performance over the traditional SPSO, as well as over the improved IPSO with sub-swarms. The introduced additional parameter helped determine the velocities of the particles according to which they were pulled towards the best particle found in the swarm, in order to make the algorithm more efficient.

Fig. 2. The mean best fitness for Griewank function (for 80 particles and 30 dimensions).


Fig. 3. The mean best fitness for Rastrigin function (for 80 particles and 30 dimensions).

Fig. 4. The mean best fitness for Rosenbrock function (for 80 particles and 30 dimensions).

In most cases of the studied functions, the mean function values achieved by SoPSO were lower than the results found by the remaining algorithms (Tables 2–5), and only in a few cases were they comparable to those obtained by IPSO. The SPSO algorithm obtained worse results compared to IPSO and more often had problems with trapping into local minima.
In the standard PSO, the particles of the swarm are exclusively focused on sharing knowledge about their personal best found positions and the best position discovered by the swarm. Other information on the particles' achievements is irrelevant and not taken

into consideration in this approach. Consequently, some useful knowledge is lost, which leads to the loss of swarm diversity and premature convergence.
The higher performance of the IPSO method is the result of the division of the swarm into several smaller ones, which are periodically shuffled to share information. This approach allows for better searching of the space of possible solutions; however, the knowledge about some particles' achievements is still not considered and is lost. The IPSO algorithm, despite operating better than SPSO, performed worse than the SoPSO algorithm, but only in a few cases had problems with premature convergence.
In the proposed SoPSO method, information about the current minimal and maximal fitness of the particles and the number of iterations is taken into account. This approach allowed for better control of the particle velocities and increased the performance of SoPSO compared to the other methods.
The proposed SoPSO algorithm converged faster than the other tested algorithms, was more stable and achieved higher quality solutions. In a few cases, SoPSO initially converged slightly slower than IPSO, but only during the first three hundred iterations. After 300 iterations, SoPSO converged faster than the rest of the methods.

5 Conclusion

In this article, a particle swarm optimization algorithm called SoPSO is presented. The novelty of the algorithm lies in its new way of determining the acceleration coefficient. In the proposed approach, the acceleration coefficient is a nonlinear function that depends on the performance of the algorithm and is affected by the number of iterations. For each particle, in every iteration, different new acceleration coefficients are calculated. The efficiency of the SoPSO method was examined on the benchmark set; then the outcomes of the simulations were compared with the results achieved by the SPSO as well as the improved IPSO algorithm.
Preliminary outcomes of the experimental tests have indicated that the SoPSO method is faster, more resistant to the problem of stopping at a non-optimal point of the search space, and achieves better performance than the other algorithms. It can be used for solving high dimensional optimization problems.

References
1. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: IEEE International Conference
on Neural Networks, pp. 1942–1948. Perth, Australia (1995)
2. Chomatek, L., Duraj, A.: Multiobjective genetic algorithm for outliers detection. In: IEEE
International Conference on Innovations in Intelligent Systems and Applications (INISTA),
pp. 379–384. Gdynia, Poland (2017)
3. Robinson, J., Rahmat-Samii, Y.: Particle swarm optimization in electromagnetic. IEEE
Transact. Anten. Propag. 52(2), 397–407 (2004)

4. Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm Intelligence. Morgan Kaufmann Publishers, San
Francisco (2001)
5. Borowska, B.: Nonlinear inertia weight in particle swarm optimization. In: 12th International
Scientific and Technical Conference on Computer Sciences and Information Technologies
(CSIT), pp. 296–299. IEEE, Ukraine (2017)
6. Ratnaweera, A., Halgamuge, S.K., Watson, H.C.: Self-organizing hierarchical particle swarm
optimizer with time-varying acceleration coefficients. IEEE Trans. Evol. Comput. 8(3), 240–
255 (2004)
7. Mashhadban, H., Kutanaei, S.S., Sayarinejad, M.A.: Prediction and modeling of mechanical
properties in fiber reinforced self-compacting concrete using particle swarm optimization
algorithm and artificial neural network. Constr. Build. Mater. 119, 277–287 (2016)
8. Soszyński, F., Wołowski, J., Stasiak, B.: Music games: as a tool supporting music education.
In: Proceedings of the Conference on Game Innovations, CGI 2016, pp. 116–132 (2016)
9. Yang, X., Yuan, J., Yuan, J., Mao, H.: A modified particle swarm optimizer with dynamic
adaptation. Appl. Mathemat. Computat. 189, 1205–1213 (2007)
10. Nouaouria, N., Boukadoum, M., Proulx, R.: Particle swarm classification: a survey and positioning. Pattern Recognit. 46, 2028–2044 (2013)
11. Kiranyaz, S., Ince, T., Gabbouj, M.: Multidimensional Particle Swarm Optimization for Machine Learning and Pattern Recognition. Adapt. Learn. Optim., vol. 15. Springer-Verlag, Berlin (2014)
12. Ling, S.H., Chan, K.Y., Leung, F.H., Jiang, F., Nguyen, H.: Quality and robustness
improvement for real world industrial systems using a fuzzy particle swarm optimization.
Eng. Appl. Artif. Intell. 47, 68–80 (2016)
13. Jordehy, A.R., Jasni, J.: Parameters selection in particle swarm optimisation: a survey.
J. Exp. Theor. Artif. Intell. 25, 527–542 (2013)
14. Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and
resources. In: Proceedings of IEEE International Conference on Evolutionary Computation,
pp. 81–86 (2001)
15. Shi, Y., Eberhart, R.C.: A modified particle swarm optimizer. In: Proceedings of IEEE
International Conference on Evolutionary Computation, pp. 69–73 (1998)
16. Shi, Y., Eberhart, R.C.: Parameter selections in particle swarm optimization. In: Proceedings
of the 7th International Conference on Evolutionary Programming, pp. 591–600. New York
(1998)
17. Shi, Y., Eberhart, R.C.: Empirical study of particle swarm optimization. In: Proceedings of the Congress on Evolutionary Computation, vol. 3, pp. 1945–1950 (1999)
18. He, S., Wu, Q.H., Wen, J.Y., Saunders, J.R., Paton, R.C.: A particle swarm optimizer with passive congregation. Biosystems 78, 135–147 (2004)
19. Borowska, B.: An improved particle swarm optimization algorithm with repair procedure. Adv. Intell. Syst. Comput. 512, 1–16. Springer International Publishing (2017)
20. Eberhart, R.C., Shi, Y.: Evolving artificial neural networks. In: Proceedings of the International Conference on Neural Networks and Brain, pp. 5–13. Beijing, China (1998)
21. Borowska, B.: Exponential inertia weight in particle swarm optimization. Adv. Intell. Syst. Comput. 524, 265–275. Springer International Publishing (2017)
22. Clerc, M., Kennedy, J.: The particle swarm—explosion, stability, and convergence in a
multidimensional complex space. IEEE Transact. Evolut. Comput. 6, 58–73 (2002)
23. Fan, H.Y.: A modification to particle swarm optimization algorithm. Engineering
Computation 19(8), 970–989 (2002)
24. Jiang, Y., Hu, T., Huang, C., Wu, X.: An improved particle swarm optimization algorithm.
Appl. Mathemat. Computat. 193, 231–239 (2007)

25. Borowska, B.: Novel algorithms of particle swarm optimization with decision criteria.
J. Exp. Theor. Artif. Intell. 30, 615–635 (2018)
26. Robinson, J., Sinton, S., Rahmat-Samii, Y.: Particle swarm, genetic algorithm, and their
hybrid: optimizations of a profiled corrugated horn antenna. In: Proceedings of IEEE
International Symposium on Antennas and Propagation, vol. 1, pp. 314–317. San Antonio,
USA (2002)
27. Garg, H.: A hybrid PSO-GA algorithm for constrained optimization problems. Appl.
Mathem. Computat. 274, 292–305 (2016)
28. Dimopoulos, G.G.: Mixed-variable engineering optimization based on evolutionary and
social metaphors. Comput. Method Appl. Mechan. Eng. 196, 803–817 (2007)
29. Ratnaweera, A., Halgamuge, S.K., Watson, H.C.: Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients. IEEE Trans. Evol. Comput. 8(3), 240–255 (2004)
30. Liu, Y., Niu, B., Luo, Y.: Hybrid learning particle swarm optimizer with genetic disturbance.
Neurocomputing 151, 1237–1247 (2015)
31. Sheikhalishahi, M., Ebrahimipour, V., Shiri, H., Zaman, H., Jeihoonian, M.: A hybrid GA-
PSO approach for reliability optimization in redundancy allocation problem. Int. J. Adv.
Manuf. Technol. 68, 317–338 (2013)
32. Liu, L., Hu, R.S., Hu, X.P., Zhao, G.P., Wang, S.: A hybrid PSO-GA algorithm for job shop
scheduling in machine tool production. Int. J. Prod. Res. 53, 5755–5781 (2015)
33. Lim, W.H., Isa, N.A.M.: Particle swarm optimizations with dual-level task allocation. Eng.
Appl. Artif. Intell. 38, 88–110 (2015)
34. Abdelhalim, A., Nakata, K., El-Alem, M., Eltawil, A.: Guided particle swarm optimization method to solve general nonlinear optimization problems. Eng. Comput. 50, 568–583 (2017)
35. Wang, L., Li, L., Liu, L.: An effective hybrid PSO–SA strategy for optimization and its application to parameter estimation. Appl. Math. Comput. 179, 135–146 (2006)
36. Liu, F., Zhou, Z.: An improved QPSO algorithm and its application in the high-dimensional complex problems. Chemometr. Intell. Lab. Syst. 132, 82–90 (2014)
37. Shi, Y., Eberhart, R.C.: Fuzzy adaptive particle swarm optimization. In: Proceedings of the IEEE Congress on Evolutionary Computation, vol. 1, pp. 101–106. IEEE, South Korea (2001)
38. Khan, S.A., Engelbrecht, A.P.: A fuzzy particle swarm optimization algorithm for computer communication network topology design. Appl. Intell. 36, 161–177 (2012)
39. Nobile, M., Cazzaniga, P., Besozzi, D., et al.: Fuzzy self-tuning PSO: a settings-free algorithm for global optimization. Swarm Evol. Comput. 39, 70–85 (2018)
Statistics of Pareto Fronts

Mohamed Bassi1(&), Emmanuel Pagnacco1, Eduardo Souza de Cursi1, and Rachid Ellaia2
1
LMN, INSA Rouen Normandie, Normandie Université, 76801 St-Etienne du
Rouvray, France
bassi.mohamed@gmail.com, {emmanuel.pagnacco,souza}
@insa-rouen.fr
2
LERMA Laboratory, Engineering for Smart & Sustainable Systems Research
Center, E3S, Mohammed V University of Rabat, EMI, 10050 Rabat, Agdal,
Morocco
ellaia@emi.ac.ma

Abstract. We consider multiobjective optimization problems affected by


uncertainty, where the objective functions or the restrictions involve random
variables. We are interested in the evaluation of statistics such as medians,
quantiles and confidence intervals for the Pareto front. We present a method for
the determination of such statistics which is independent of the representation
used to describe the Pareto front. In a second step, we start from a sample of
Pareto fronts and we use a Generalized Fourier Series approach to generate a
larger sample of about 10^5 Pareto fronts with a reasonable computational cost. These large samples are used to obtain more accurate statistics. Examples show that the method is effective.

Keywords: Optimization under uncertainty  Uncertainty quantification 


Multiobjective optimization

1 Introduction

Real phenomena are affected by variabilities and uncertainties, so that a description con-
sidering uncertainties is more realistic than the one provided by deterministic models. In
practice, even numeric, model or implementation errors and inaccuracies are sources of
uncertainty and variability. Thus, Uncertainty Quantification (UQ) is a field of increasing
interest, namely in design, especially for the optimization of systems. In the context of
multiobjective optimization, UQ must manipulate objects such as curves, surfaces or, more
generally, varieties - which are objects belonging to infinite dimensional spaces [1]. The
designer may be interested in statistics of the Pareto fronts (such as, for instance, mean,
median, variance, confidence intervals) or their probability distribution: in any case, a sample
formed by a significant number of fronts must be generated to produce statistically reliable
results. Thus, many calculations are requested: it is necessary to generate a large number of
fronts, while each front is the result of an optimization procedure - which must be restarted at
each time. We present here a procedure for significantly reducing the computational effort,
based on the use of Hilbert approximations of the Pareto fronts, typical of chaos expansions.
We illustrate the approach by using 2D fronts, but it generalizes to higher dimension.

In Sect. 2, we state the difficulties concerning statistics of infinite dimensional


objects by using examples of 2D curves, propose a solution using the Hausdorff
distance, and apply it to two multiobjective basic problems. In Sect. 3, we propose an
original approach to generate Pareto fronts using Generalized Fourier Series.

2 Statistics of Families of Curves


2.1 Problem Position
When considering statistics of curves, such as the determination of the mean curve E(X(U)), we face difficulties. The first one is the manipulation of probabilities in infinite dimensional spaces: in order to properly give a signification to E(X), we must define a probability on the space V, which is an infinite dimensional one. Such a difficulty may be solved by defining probabilities on the coefficients x_i of the representation of X: then, we may estimate the mean of the family by estimating the means of the coefficients. Nevertheless, this approach has severe limitations: the result may be a curve which is not a member of the family, and it may depend on U. For instance, let us consider U = (u_1, u_2), u_1 ~ U[1,3], u_2 ~ U[0,2π], i = 1, 2, …, ns, and the families of curves X(t|U) = (X_1, X_2)^T given by:

$$
\begin{cases} X_1 = u_1\cos(t) \\ X_2 = u_1\sin(t) \end{cases},\quad
\begin{cases} X_1 = u_1\cos(t+u_2) \\ X_2 = u_1\sin(t+u_2) \end{cases},\quad
\begin{cases} X_1 = u_{1,i}\cos\!\left((-1)^i t\right) \\ X_2 = u_{1,i}\sin\!\left((-1)^i t\right) \end{cases}. \qquad (1)
$$

All these parameterizations generate the same set of circles but, as shown on Fig. 1, the mean E(X(t|U)) leads to different results:

$$
\begin{cases} E(X_1) \to E(u_1)\cos(t) \\ E(X_2) \to E(u_1)\sin(t) \end{cases},\quad
\begin{cases} E(X_1) \to 0 \\ E(X_2) \to 0 \end{cases},\quad
\begin{cases} E(X_1) \to 0 \\ E(X_2) \to E(u_1)\sin(t) \end{cases}. \qquad (2)
$$

Fig. 1. Punctual means (red) furnished by three different representations of green circles

2.2 An Approach Independent from the Representation


An alternative approach is proposed by [5]: we can determine the element of the family that minimizes the sum of the distances to the other elements, which corresponds to a median. When the distance is independent of the parameterization, we obtain a result that is also independent. Similarly, we can determine quantiles corresponding to the elements closest to it, which allows the construction of confidence intervals. An example of a distance independent from the representation is the Hausdorff distance, which is a distance between sets of points, and these sets are identical whatever the representation.
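A minimal Python sketch of the Hausdorff distance between two sampled curves, used below as the parameterization-independent distance, could be:

import numpy as np

def hausdorff(A, B):
    # Hausdorff distance between two curves given as point sets A (m x d), B (k x d);
    # the value does not depend on the parameterization used to generate the points.
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)   # all pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())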

2.3 Application to Pareto Fronts


Let us consider the model problem:

$$
\underset{x \in \mathbb{R}^d}{\text{Minimize}} \; F(x|U) = \left(F_1(x|U), F_2(x|U)\right)
\quad \text{under the restrictions} \quad
G(x|U) = \left(G_1(x|U), \ldots, G_p(x|U)\right) \le 0,\;\;
(x_l, x_u) \in \mathbb{R}^d \times \mathbb{R}^d \text{ and } x_l \le x \le x_u. \qquad (3)
$$

This problem involves only inequality restrictions, but the approach applies also to the general situation involving mixed equality/inequality restrictions. In order to evaluate statistics of the Pareto front associated to this problem, we generate ns variates from the random vector U: u_1, u_2, …, u_ns. For each variate u_i, we find the Pareto front associated to:

$$
\underset{x \in \mathbb{R}^d}{\text{Minimize}} \; F(x|U=u_i) = \left(F_1(x|U=u_i), F_2(x|U=u_i)\right)
\quad \text{u.r.} \quad
G(x|U=u_i) = \left(G_1(x|U=u_i), \ldots, G_p(x|U=u_i)\right) \le 0,\;\; x_l \le x \le x_u. \qquad (4)
$$

In our experiments, we used the variational approach introduced in [6, 7] to determine the Pareto front P_i, but any available method may be used instead. A polynomial family Φ is used, but any total family may be considered. Each u_i generates a Pareto front, and we obtain ns fronts P = {P_1, P_2, …, P_ns}. Then, we determine the median front P_m corresponding to the minimal sum of Hausdorff distances d_H to the other fronts in the set P: P_m is considered as the most representative member of the family P, its median. The determination of P_m allows us to organize the elements of P as a function of their distance to it, so that we may find an ω-quantile P_ω of a given confidence level ω. Finally, we can define a confidence interval at the same level ω by:

$$ I_\omega = \left\{ P_i \;|\; d_H(P_i, P_m) \le d_H(P_i, P_\omega) \right\}. \qquad (5) $$

It is interesting to notice that our experiments with other distances, such as, for instance, d_{L2} and the modified Hausdorff distance d_{HM} [6], led to the same results.
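A sketch of the corresponding selection of the median front and of the confidence interval from a sample of fronts is given below; it assumes each front is stored as an array of points and reuses a set distance such as the Hausdorff distance above (names are illustrative):

import numpy as np

def median_and_interval(fronts, dist, level=0.90):
    # fronts: list of sampled Pareto fronts, each an array of points;
    # dist: a set distance such as the Hausdorff distance above.
    ns = len(fronts)
    D = np.zeros((ns, ns))
    for i in range(ns):
        for j in range(i + 1, ns):
            D[i, j] = D[j, i] = dist(fronts[i], fronts[j])
    m = int(np.argmin(D.sum(axis=1)))               # median: minimal sum of distances
    order = np.argsort(D[m])                        # fronts sorted by distance to the median
    inside = order[: int(np.ceil(level * ns))]      # indices inside the confidence interval
    return m, inside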

As an example, let us consider U = (u_1, u_2, u_3), with three independent variables uniformly distributed on [0, 0.1], and the Fonseca–Fleming problem under uncertainty given by:

$$
\begin{aligned}
\underset{x \in \mathbb{R}^3}{\text{Minimize}} \quad
& f_1(x) = 1 - \exp\!\left( - \sum_{i=1}^{3} \left( x_i - \tfrac{1}{\sqrt{3}} + u_i \right)^2 \right) \\
& f_2(x) = 1 - \exp\!\left( - \sum_{i=1}^{3} \left( x_i + \tfrac{1}{\sqrt{3}} - u_i \right)^2 \right) \\
\text{under the restrictions} \quad & -4 \le x_i \le 4, \; i \in \{1, 2, 3\}.
\end{aligned} \qquad (6)
$$

The results are exhibited in Fig. 2, where 200 Pareto fronts corresponding to a sample size ns = 200 are plotted. In Fig. 2, the median appears in red and the 90% confidence interval appears in cyan, while blue curves lie outside the confidence interval.

Fig. 2. Pareto fronts for the Fonseca–Fleming problem under uncertainty. The sample has size ns = 200. The median appears in red, the 90% confidence interval in cyan. Blue curves lie outside the confidence interval.

As a second example, we consider the ZDT3 problem under uncertainty: U = (u_1, u_2), with two independent variables uniformly distributed on [0, 0.1] and [−0.15, 0.15], respectively:

$$
\begin{aligned}
\underset{x \in \mathbb{R}^n}{\text{Minimize}} \quad
& f_1(x) = x_1 + u_1 \\
& f_2(x) = g(x)\left( 1 + u_2 - \sqrt{\tfrac{f_1(x)}{g(x)}} - \tfrac{f_1(x)}{g(x)} \sin\!\left(10\pi f_1(x)\right) \right) \\
\text{under the restrictions} \quad & 0 \le x_i \le 1, \; i \in \{1, 2\}, \\
\text{for} \quad & g(x) = 1 + 9(x_1 + x_2).
\end{aligned} \qquad (7)
$$

In Fig. 3, the median Pareto front is the red curve and the 90% confidence interval is the set of cyan fronts, while the blue fronts are the remaining 10% of the elements of the set, which are the farthest ones from the median front in the sense of the Hausdorff distance.

Fig. 3. Pareto fronts for the ZDT3 problem under uncertainty. The sample has size ns = 200. The median appears in red, the 90% confidence interval in cyan. Blue curves lie outside the confidence interval.

3 Statistics of Pareto Fronts by Using Generalized Fourier


Series

The preceding examples show that the construction of the median and of the confidence
interval may request a large number of Pareto fronts, namely if a high confidence is
requested. But each Pareto front results from an optimization procedure and the whole
process may be expensive in terms of computational cost. To accelerate the procedure,
we may consider the use of Generalized Fourier Series (GFS): given a relatively small
sample of exact Pareto fronts, we may determine an expansion of the functions FðtjU Þ
corresponding to the Pareto fronts and use it to generate a much larger sample of
approached Pareto fronts.

The process involves two steps. The first one consists in approximating each exact Pareto front by a polynomial of degree N − 1, whose coefficients are exact too. In the second step, another approximation, of the random vectors of the exact coefficients, is made by using GFS. In the sequel, "ex" refers to "exact" and "app" refers to "approached".
For instance, let us consider the Pareto front as given by the equation F(t|U) = (f_1^ex(t|U), f_2^ex(t|U)), t ∈ (0, 1). We may consider the expansion:

$$ f_i^{ex}(t|U) \approx \sum_{j=1}^{N} c_{ij}^{ex}(U)\, \Psi_j(t), \qquad \Psi_j(t) = t^{j-1}. \qquad (8) $$

The coefficients c_ij^ex depend on the variate and, so, are random variables. We may determine a representation of c_ij^ex by GFS (see Appendix A):

$$ c_{ij}^{ex}(U) \approx c_{ij}^{app}(U) = \sum_{k=0}^{n_c} d_{ijk}\, \Phi_k(U), \qquad (9) $$

involving a degree of approximation n_c and a polynomial family Φ, and use this approximation to obtain more accurate statistics of the Pareto fronts.
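A compact Python sketch of the two steps for one objective f_i is given below; the polynomial family used for the variable U (poly_basis) and all helper names are illustrative assumptions, not the exact family used in our computations.

import numpy as np

def poly_basis(u, nc):
    # simple stand-in for the family Phi: monomials of each component of u up to degree nc
    u = np.atleast_1d(u)
    return np.concatenate([[1.0]] + [u ** k for k in range(1, nc + 1)])

def gfs_front_model(t_grid, exact_fronts, U_samples, N=7, nc=3):
    # Step 1: fit each exact front f_i^ex(t|u_k) by a polynomial of degree N-1 in t.
    C = np.array([np.polynomial.polynomial.polyfit(t_grid, f, N - 1) for f in exact_fronts])
    # Step 2: regress the exact coefficients c_ij on the family Phi(u) (least squares).
    Phi = np.array([poly_basis(u, nc) for u in U_samples])
    D, *_ = np.linalg.lstsq(Phi, C, rcond=None)            # the d_ijk of Eq. (9)
    def approached_front(u_new, t=t_grid):                 # Eq. (10) for a fresh variate of U
        c_app = poly_basis(u_new, nc) @ D
        return np.polynomial.polynomial.polyval(t, c_app)
    return approached_front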
Let us illustrate the approach by using the Fonseca–Fleming problem: we generate ns = 1000 exact Pareto fronts and we approach each of them as shown in Eq. (8) with N = 7; we then obtain 1000 instances of the random variables c_ij^ex, 1 ≤ i ≤ 2, 1 ≤ j ≤ 7.
Let us denote by e_m the mean error between the approximation of the Pareto fronts using c_ij^ex and the exact ones. We have:

$$
e_m = \begin{pmatrix}
E\left[ \left\| f_1^{ex}(t|U) - \sum_{j=1}^{N} c_{1j}^{ex}(U)\, t^{j-1} \right\|_\infty \right] \\
E\left[ \left\| f_2^{ex}(t|U) - \sum_{j=1}^{N} c_{2j}^{ex}(U)\, t^{j-1} \right\|_\infty \right]
\end{pmatrix}
= \begin{pmatrix} 6.4 \times 10^{-3} \\ 6.4 \times 10^{-3} \end{pmatrix}
$$

where ‖·‖_∞ refers to the norm defined on R^n by ‖x‖_∞ = sup{|x_i|, 1 ≤ i ≤ n}.
Now, we generate a sample of 1000 values of c_ij^app resulting from the GFS expansion of c_ij^ex and we construct 1000 approached Pareto fronts given by:

$$ f_i^{app}(t|U) = \sum_{j=1}^{N} c_{ij}^{app}(U)\, t^{j-1}, \qquad t \in (0, 1). \qquad (10) $$

Now, after having compared two samples of the same size, according to the values in Tables 1 and 2, we generate a 10^5-size sample of c_ij^app that allows us to build 10^5 approached Pareto fronts (f_1^app(t|U), f_2^app(t|U)), t ∈ (0, 1).
In Fig. 4, we present in cyan the new set of 10^5 approached Pareto fronts of Fonseca and Fleming and, in black, the mean Pareto front P_m^app resulting from the means of the c_ij^app, 1 ≤ i ≤ 2, 1 ≤ j ≤ 7, such that:

$$ P_m^{app} = \left\{ \left( \sum_{j=1}^{N} \bar{c}_{1j}^{\,app}\, t^{j-1},\; \sum_{j=1}^{N} \bar{c}_{2j}^{\,app}\, t^{j-1} \right);\; t \in (0, 1) \right\}. $$
app
Table 1. Relative errors between the correlation coefficients of 1000 values of cex
ij and cij .
0 −5,31835E −4,9891E −4,47895E −2,87411E −2,48081E −2,47322E −5,15009E −3,45621E −1,99118E −6,84894E −3,09873E −2,46507E −2,47294E
−10 −08 −08 −08 −08 −08 −10 −09 −09 −08 −08 −08 −08
−5,31835E 0 −4,44304E −3,84528E −2,35116E −1,98608E −1,97139E −1,35588E −5,64872E −9,6811E −8,32324E −2,50653E −1,95474E −1,97111E
−10 −08 −08 −08 −08 −08 −09 −09 −10 −08 −08 −08 −08
−4,9891E−08 −4,44304E 0 −1,36275E −4,58613E −6,31702E −6,63513E −4,02546E −5,75829E −3,2258E −2,38264E −5,82029E −7,04689E −6,63732E
−08 −09 −09 −09 −09 −08 −08 −08 −07 −09 −09 −09
−4,47895E −3,84528E −1,36275E 0 −1,83409E −3,0819E −3,19753E −3,71717E −5,43515E −2,76388E −2,35483E −1,93134E −3,36212E −3,19907E
−08 −08 −09 −09 −09 −09 −08 −08 −08 −07 −09 −09 −09
−2,87411E −2,35116E −4,58613E −1,83409E 0 −1,61921E −1,98718E −2,34201E −3,79573E −1,54351E −1,98891E −2,90146E −2,64535E −1,99221E
−08 −08 −09 −09 −10 −10 −08 −08 −08 −07 −10 −10 −10
−2,48081E −1,98608E −6,31702E −3,0819E −1,61921E 0 −7,22795E −2,02017E −3,39841E −1,26035E −1,89211E −4,6475E −3,62969E −7,40883E
−08 −08 −09 −09 −10 −12 −08 −08 −08 −07 −10 −11 −12
−2,47322E −1,97139E −6,63513E −3,19753E −1,98718E −7,22773E 0 −2,03137E −3,41113E −1,2561E −1,89763E −4,203E−10 −1,14511E −1,16684E
−08 −08 −09 −09 −10 −12 −08 −08 −08 −07 −11 −13
−5,15009E −1,35588E −4,02546E −3,71717E −2,34201E −2,02017E −2,03137E 0 −4,57646E −1,32851E −8,20559E −2,66403E −2,04664E −2,03133E
−09 −09 −08 −08 −08 −08 −08 −09 −09 −08 −08 −08 −08
−3,45621E −5,64872E −5,75829E −5,43515E −3,79573E −3,39841E −3,41113E −4,57646E 0 −7,5895E −6,87324E −4,15572E −3,41819E −3,40736E
−09 −08 −08 −08 −08 −08 −08 −09 −09 −08 −08 −08 −08
−1,99118E– −9,6811E −3,2258E −2,76388E −1,54351E −1,26035E −1,2561E −1,32851E −7,5895E 0 −1,00573E −1,72107E −1,25239E −1,25595E
08 −10 −08 −08 −08 −08 −08 −09 −09 −07 −08 −08 −08
−6,84894E −8,32324E −2,38264E −2,35483E −1,98891E −1,89211E −1,89763E −8,20559E −6,87324E −1,00573E 0 −2,10434E −1,90461E −1,89766E
−08 −08 −07 −07 −07 −07 −07 −08 −08 −07 −07 −07 −07
−3,09873E −2,50653E −5,82029E −1,93134E −2,90146E −4,64749E −4,203E−10 −2,66403E −4,15572E −1,72107E −2,10434E 0 −3,83583E −4,19007E
−08 −08 −09 −09 −10 −10 −08 −08 −08 −07 −10 −10
−2,46507E −1,95474E −7,04689E −3,36212E −2,64535E −3,6297E −1,14512E −2,04664E −3,41819E −1,25239E −1,90461E −3,83583E 0 −1,09187E
−08 −08 −09 −09 −10 −11 −11 −08 −08 −08 −07 −10 −11
−2,47294E −1,97111E −6,63732E −3,19907E −1,99221E −7,40883E −1,16684E −2,03133E −3,40736E −1,25595E −1,89766E −4,19007E −1,09187E 0
−08 −08 −09 −09 −10 −12 −13 −08 −08 −08 −07 −10 −11

Table 2. Relative errors between the 4 first moments of 1000 values of c_ij^ex and c_ij^app.

Mean 2,41.10−15 2,78.10−16 −3,4.10−16 1,79.10−15 −2,3.10−15 −1,6.10−15 −4,2.10−16


5,92.10−16 1,83.10−15 1,06.10−15 −6,4.10−16 1.10−15 0 −1,4.10−16
Variance 8,86.10−09 4,96.10−09 5,22.10−08 4,27.10−08 2,14.10−08 1,67.10−08 1,65.10−08
6,55.10−09 2,21.10−08 9,64.10−10 2,32.10−07 2,4.10−08 1,64.10−08 1,65.10−08
Kurtosis −2,8.10−05 −3,1.10−06 −1,1.10−04 −7,3.10−05 −3,3.10−05 −2,1.10−05 −1,5.10−05
3,03.10−05 5,69.10−05 5,67.10−06 3,22.10−04 1,13.10−05 −7,2.10−06 −1,5.10−05
Skewness −5,6.10−06 4,97.10−06 −2,9.10−05 −1,3.10−05 −2,8.10−06 2,5.10−06 9,66.10−06
8,94.10−06 1,67.10−05 2,07.10−06 11,3.10−05 −4,2.10−05 3,55.10−05 9,66.10−06

The method presented here allows computing very large samples of a given random object at a very low computational cost, and thus leads to a better estimation of their statistical characteristics. Note that the calculation of a 10^5-size sample of exact Fonseca and Fleming Pareto fronts lasts about 215 h, i.e. about 9 days of calculation, while the sample in Fig. 4 took a few seconds to generate.

Fig. 4. 10^5 approached Fonseca and Fleming Pareto fronts (cyan) and their mean (black) obtained by using GFS expansions

4 Concluding Remarks

We considered multiobjective optimization problems involving random variables. In such a situation, the Pareto front is a random curve, which is an object belonging to an infinite dimensional space, so that the evaluation of statistics faces some difficulties. At first, a precise definition of the mean and the variance involves the definition of probabilities in an infinite dimensional space. In addition, the practical evaluation requests the numerical manipulation of those probabilities. In order to solve these difficulties, we introduced a procedure which is based on the manipulation of samples and which is independent from the basis used to describe the space. The accuracy of the statistics requests large samples, so that we introduced a method for the generation of large samples by using a relatively smaller, medium-size sample and a representation of the Pareto fronts by Generalized Fourier Series. The tests performed with a large number of classical multiobjective test problems led to excellent results. Further development will concern the use of small samples and robust multiobjective optimization.

Appendix A: Generalized Fourier Series

In the framework of UQ, we are interested in the representation of random variables: let us consider a couple of random variables (U, X), such that X = X(U), that is, X is a function of U. If X ∈ V, where V is a separable Hilbert space associated to the scalar product (·,·), we may consider a convenient Hilbert basis (or total family) Φ = {φ_i}_{i∈N} and look for a representation of X given by [2]:

$$ X = X(U) = \sum_{i \in \mathbb{N}} x_i\, \varphi_i(U). \qquad (11) $$

 
If the family is orthonormal, (φ_i, φ_j) = δ_ij and the coefficients of the expansion are given by x_i = (X, φ_i(U)). Otherwise, we may consider the approximations of X by finite sums:

$$ X \approx P_n X = \sum_{1 \le i \le n} x_i\, \varphi_i(U). \qquad (12) $$

In this case, the coefficients x_i are the solutions of the linear system A x = B, where A_ij = (φ_i, φ_j) and B_i = (X, φ_i). We have:

$$ \lim_{n \to \infty} P_n X = X. \qquad (13) $$

In UQ, the Hilbert space V is mainly L²(Ω, P), where Ω ⊂ R^n and P is a probability measure, with (Y, Z) = E(YZ). Classical families Φ are formed by polynomials, trigonometric functions, splines or finite element approximations. Examples of approximations may be found in the literature (see, for instance, [2, 3]). When X is a function of a second variable, for instance t, we denote the function X(t|U) and we have:

$$ X(t|U) = \sum_{i \in \mathbb{N}} x_i(t)\, \varphi_i(U) \approx P_n X(t|U) = \sum_{1 \le i \le n} x_i(t)\, \varphi_i(U). \qquad (14) $$

The reader may refer to [4] for more information and MATLAB codes for the evaluation of the coefficients x_i, namely in multidimensional situations. In practice, we use a sample X(t|U_1), …, X(t|U_ns) of X(t|U) in order to evaluate the means forming A and B.
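As an illustration, a minimal Python sketch of this sample-based evaluation of A and B and of the solution of A x = B, for a function X(t|U) given on a common t-grid, could be (the basis function passed in is an assumption):

import numpy as np

def gfs_coefficients(U_sample, X_sample, basis):
    # X_sample[k] contains X(t_j | U_k) on a common t-grid; basis(u) returns
    # the vector (phi_1(u), ..., phi_n(u)).  A and B are approximated by sample
    # means, then the system A x = B is solved for the coefficient functions x_i(t_j).
    Phi = np.array([basis(u) for u in U_sample])            # (ns, n)
    Xs = np.asarray(X_sample)                               # (ns, nt)
    A = Phi.T @ Phi / len(U_sample)                         # A_ij ~ E(phi_i phi_j)
    B = Phi.T @ Xs / len(U_sample)                          # B_i(t_j) ~ E(phi_i X(t_j|.))
    return np.linalg.solve(A, B)                            # row i: x_i on the t-grid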

References
1. Croquet, R., Souza de Cursi, E.: Statistics of uncertain dynamical systems. In: Topping, B.H.
V., Adam, J.M., Pallarés, F.J., Bru, R., Romero, M.L. (eds.) Proceedings of the Tenth
International Conference on Computational Structures Technology, Paper 173, Civil-Comp
Press, Stirlingshire, UK (93), pp. 541–561. https://doi.org/10.4203/ccp.93.173 (2010)
2. Bassi, M., Souza de Cursi, E., Ellaia, R.: Generalized fourier series for representing random
variables and application for quantifying uncertainties in optimization. In: 3rd International
Symposium on Uncertainty Quantification and Stochastic Modeling, Maresias, SP, Brazil,
15–19 February (2016). http://www.swge.inf.br/PDF/USM-2016-0037_027656.PDF
3. Bassi, M.: Quantification d’Incertitudes et Objets en Dimension Infinie. Ph.D. Thesis, INSA
Rouen Normandie, Normandie Université, Saint-Etienne du Rouvray (2019)
4. Souza de Cursi, E., Sampaio, R.: Uncertainty Quantification and Stochastic Modelling with
Matlab. ISTE Press, London, UK (2015)
5. Bassi, M., Souza de Cursi, E., Pagnacco, E., Ellaia, R.: Statistics of the pareto front in multi-
objective optimization under uncertainties. Lat. Am. J. Solids Struct. 15(11), e130. Epub
November 14, 2018. https://doi.org/10.1590/1679-78255018 (2018)
6. Dubuisson, M., Jain, A.K.: A modified hausdorff distance for object matching. In:
Proceedings of 12th International Conference on Pattern Recognition, October 9–13,
Jerusalem, pp. 566–568, https://doi.org/10.1109/icpr.1994.576361 (1994)
Uncertainty Quantification in Optimization

Eduardo Souza de Cursi1 and Rafael Holdorf Lopez2(&)


1
LMN, INSA Rouen Normandie, Normandie Université, 76801 St-Etienne du
Rouvray, France
souza@insa-rouen.fr
2
UFSC, Campus Trindade, Florianopolis SC 88040-900, Brazil
rafael.holdorf@ufsc.br

Abstract. We consider constrained optimization problems affected by uncer-


tainty, where the objective function or the restrictions involve random variables
u. In this situation, the solution of the optimization problem is a random variable
xðuÞ: we are interested in the determination of its distribution of probability. By
using Uncertainty Quantification approaches, we may find an expansion of xðuÞ
in terms of a Hilbert basis F ¼ fui : i 2 N g. We present some methods for the
determination of the coefficients of the expansion.

Keywords: Optimization under uncertainty  Uncertainty quantification 


Constrained optimization

1 Introduction

Uncertainties are a key issue in engineering design: optimal solutions usually imply no
safety margins, but real systems involve uncertainty, variability and errors. For
instance, geometry, material parameters, boundary conditions, or even the model itself
include uncertainties. To provide safe designs, uncertainty must be considered in the
design procedure – so, in optimization procedures.
There are different ways to introduce uncertainty in design: the most popular ones
are interval methods, fuzzy variables and probabilistic modeling. Each approach has its
particularities, advantages and inconveniences. Here, we focus on the probabilistic
approach, which is used in the situations where quantitative statistical information
about the variability is available - fuzzy approaches are often used when the infor-
mation about uncertainty is qualitative and interval approaches do not require infor-
mation about statistical properties of the uncertainties.
When using the probabilistic approach, the variability is modeled by random variables. In general, the only assumption about the distributions of the random parameters is the existence of a mean and a variance, id est, that the random variables are square summable. The distribution is generally calculated: it is one of the unknowns to be determined.
For instance, let us consider the model problem

$$ x = \operatorname{Arg\,Min} \{ F(y, u) : y \in S(u) \}, \qquad (1) $$


where U is a random vector. Thus, the optimal solution x may be sensitive to the variations of U. In the case of a significant variability of u, standard optimization procedures cannot ensure a requested safety level: for each possible value u = u(ω), the solution takes the value x(ω) = x(u(ω)), so that x is a random variable. The determination of its statistical properties and its distribution is requested in order to control statistical properties or the probabilities of some crucial events, such as failure.
The reader may find in the literature different approaches used to guarantee the safety of the solution: sensitivity analysis, robust optimization approaches, reliability optimization, chance-constrained optimization, value-at-risk analysis. None of these approaches furnishes the distribution of x: it is necessary to use Monte Carlo simulation or Uncertainty Quantification (UQ) approaches; in general, Monte Carlo requires a larger computational effort, while UQ approaches are more economical.
In a preceding work [1], we considered the determination of the distribution of x in unconstrained optimization. The results extend straightforwardly to the situation where S is defined by inequalities:

$$ S = \{ y \in \mathbb{R}^n : g_i(y, u) \le 0,\; 1 \le i \le p;\; h_i(y, u) = 0,\; 1 \le i \le q \}. \qquad (2) $$

2 UQ Methods for the Determination of the Unknown Coefficients

Applying the UQ approach, we look for a representation of x = x(u), given by

$$ x = \sum_{i \in \mathbb{N}} x_i\, \varphi_i(u), $$

where F = {φ_i : i ∈ N} is a Hilbert basis or a total family. In practice, we consider the approximations of x by finite sums:

$$ x \approx Px = \sum_{1 \le i \le N_x} x_i\, \varphi_i(u). $$

Let us introduce

$$ \varphi(u) = \left( \varphi_1(u), \ldots, \varphi_{N_x}(u) \right) $$

and a matrix X = (X_ij : 1 ≤ i ≤ N_X, 1 ≤ j ≤ n) such that its line i contains the components of x_i:

$$ X_{ij} = (x_i)_j, \quad \text{i.e.,} \quad x_i = (X_{i1}, \ldots, X_{in}). $$

Then, Px = φ(u)·X. The unknowns to be determined are the elements of the matrix X. In the sequel, we examine some methods for the determination of X.

2.1 Collocation
  
When a sample {(x^k, u^k) : 1 ≤ k ≤ ns} of ns variates from the pair (x, u) is available, we may consider the system of linear equations given by:

$$ \varphi(u^k)\, X = x^k. $$

This system involves ns × n equations for N_X × n unknowns. For ns ≥ N_X, it is overdetermined and admits a generalized solution, such as a least-squares one, which may be determined in order to furnish X and, so, Px.
We may illustrate this approach by using the simple situation where x is the solution of the Rosenbrock problem (assume that both u_1, u_2 are independent and uniformly distributed on (1, 2)) (see Figs. 1, 2, 3, 4 and 5):

$$ x = \operatorname{Arg\,Min} \left\{ (1 - u_1 y_1)^2 + 100\left( (u_1 + u_2)\, y_2 - u_1 y_1 \right)^2 : y \in \mathbb{R}^2 \right\}. $$

Fig. 1. Results for a random sample of 8  8 values of u.

The solution is

X1 ¼ 1=u1 ; X2 ¼ 1=ðu1 þ u2 Þ

Let us consider a polynomial approximation of order 2:

u1 ðuÞ ¼ 1; u2 ðuÞ ¼ u1 ; u3 ðuÞ ¼ u2 ; u4 ðuÞ ¼ u21 ; u5 ðuÞ ¼ u1 u2 ; u6 ðuÞ ¼ u22
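A minimal Python sketch of the collocation computation on this example is given below. It is not the authors' code: the sample size, the random seed and the use of numpy's least-squares solver are assumptions made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed, for reproducibility
ns = 64                                   # assumed sample size (8 x 8 variates)
u = rng.uniform(1.0, 2.0, size=(ns, 2))   # u1, u2 ~ U(1, 2), independent

# Exact optimal point of the randomized Rosenbrock problem for each variate
x = np.column_stack((1.0 / u[:, 0], 1.0 / (u[:, 0] + u[:, 1])))

def phi(u):
    """Polynomial basis of order 2: 1, u1, u2, u1^2, u1*u2, u2^2."""
    u1, u2 = u[:, 0], u[:, 1]
    return np.column_stack((np.ones_like(u1), u1, u2, u1**2, u1 * u2, u2**2))

# Collocation: solve the overdetermined system phi(u_k) X = x_k in the least-squares sense
A = phi(u)                                   # shape (ns, Nx)
X, *_ = np.linalg.lstsq(A, x, rcond=None)    # coefficient matrix, shape (Nx, n)

# Px approximates x; check the relative error on a fresh sample
u_test = rng.uniform(1.0, 2.0, size=(1000, 2))
x_test = np.column_stack((1.0 / u_test[:, 0], 1.0 / (u_test[:, 0] + u_test[:, 1])))
Px = phi(u_test) @ X
print("relative error:", np.linalg.norm(Px - x_test) / np.linalg.norm(x_test))
```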

2.2 Variational Approximation


The variational approach solves the orthogonal projection equation
E(y^t Px) = E(y^t x),   ∀ y = φ(u)·Y

Fig. 2. Results for a uniform mesh of 8 × 8 values of u.
Fig. 3. Results for an error of 5% in the values of x_k.
Fig. 4. Results for an error in the distribution of u (N(1.5, 0.25)).
Fig. 5. Exact values.

We have

E(y^t Px) = Y^t E( φ(u)^t φ(u) ) X,   E(y^t x) = Y^t E( φ(u)^t x ).

Thus, the coefficients X are the solution of the linear system (see Figs. 6, 7 and 8):

E( φ(u)^t φ(u) ) X = E( φ(u)^t x ).

Fig. 6. Results for n = 2 and a random sample of 8 × 8 values of u.
Fig. 7. Results for an error of 5% in the values of x_k.
Fig. 8. Results for an error in the distribution of u (N(1.5, 0.25)).
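A short sketch of this variational (projection) computation, with the expectations replaced by sample averages, might look as follows; the sample size and the basis are assumptions of the illustration, and the exact Rosenbrock solution is used to generate the data.

```python
import numpy as np

rng = np.random.default_rng(1)
ns = 64
u = rng.uniform(1.0, 2.0, size=(ns, 2))
x = np.column_stack((1.0 / u[:, 0], 1.0 / (u[:, 0] + u[:, 1])))

def phi(u):
    u1, u2 = u[:, 0], u[:, 1]
    return np.column_stack((np.ones_like(u1), u1, u2, u1**2, u1 * u2, u2**2))

P = phi(u)
# Sample estimates of E[phi(u)^t phi(u)] and E[phi(u)^t x]
G = P.T @ P / ns              # Gram matrix, shape (Nx, Nx)
b = P.T @ x / ns              # right-hand side, shape (Nx, n)
X = np.linalg.solve(G, b)     # coefficients of the orthogonal projection Px
```

With the expectations estimated on the same sample, this projection coincides with the least-squares collocation solution up to the 1/ns scaling of the normal equations.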

2.3 Moment Matching


An alternative solution consists in determining the unknown coefficients in order to fit the empirical moments of the data. Let us denote by

M^e_{k1…kn} = (1/ns) Σ_{i=1}^{ns} (x_i)_1^{k1} … (x_i)_n^{kn} ≈ E( x_1^{k1} … x_n^{kn} )

the empirical moment of order k = (k1, …, kn) of the components x_j associated to the sample. When considering 0 ≤ ki ≤ KM moments, we set M^e = ( M^e_{k1…kn}, 0 ≤ ki ≤ KM ). In an analogous way, we may generate M(X) = ( M_{k1…kn}(X), 0 ≤ ki ≤ KM ) such that

M_{k1…kn}(X) = (1/ns) Σ_{i=1}^{ns} (Px_i)_1^{k1} … (Px_i)_n^{kn} ≈ E( (Px)_1^{k1} … (Px)_n^{kn} ),   Px_i = φ(u_i)·X.

Then, we may look for the coefficients verifying

M(X) = M^e,  i.e.,  M_{k1…kn}(X) = M^e_{k1…kn},  0 ≤ ki ≤ KM.

These equations form a nonlinear system of (KM + 1)^n equations which must be solved for the n × Nx unknowns X by an appropriate method. If the number of equations exceeds the number of unknowns, an alternative consists in minimizing a pseudo-distance dist(M(X), M^e). The main difficulty in this approach is obtaining a good quality numerical solution: due to the lack of convexity, the minimization of dist(M(X), M^e) is a global optimization problem.
Let us illustrate this approach by using Rosenbrock's function. Let n = 2 and consider a sample of 64 values of u, corresponding to 8 random values of each variable u_i. We consider KM = 5 and we minimize the mean square norm ||M(X) − M^e||. For exactly determined values of x, we obtain a relative error of 1.0%. By using a uniform grid of 8 × 8 values of u, the relative error is 0.8%. When 5% errors are introduced in the values of x_k, the relative error is 1.0% for a sample of random values and 2.0% for a uniform grid. When considering u as a pair of independent normal variables having mean 1.5 and standard deviation 0.25, the relative error is 1.6%. An example of result is shown in Fig. 9.

Fig. 9. Results for an error in the distribution of u (N(1.5, 0.25)).
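A hedged sketch of the moment-matching computation is given below. The sample size, moment order and optimizer choice are assumptions; a local Nelder-Mead search is used only for brevity, whereas, as noted above, the mismatch is non-convex and in general calls for a global method (and for a better initial guess, e.g. the collocation estimate).

```python
import numpy as np
from itertools import product
from scipy.optimize import minimize

rng = np.random.default_rng(2)
ns, KM = 64, 5                                   # assumed sample size and moment order
u = rng.uniform(1.0, 2.0, size=(ns, 2))
x = np.column_stack((1.0 / u[:, 0], 1.0 / (u[:, 0] + u[:, 1])))

def phi(u):
    u1, u2 = u[:, 0], u[:, 1]
    return np.column_stack((np.ones_like(u1), u1, u2, u1**2, u1 * u2, u2**2))

P = phi(u)
orders = list(product(range(KM + 1), repeat=2))  # multi-indices (k1, k2), 0 <= ki <= KM

def moments(values):
    """Empirical moments M_{k1 k2} of a (ns, 2) array of values."""
    return np.array([np.mean(values[:, 0]**k1 * values[:, 1]**k2) for k1, k2 in orders])

M_emp = moments(x)

def objective(Xflat):
    Px = P @ Xflat.reshape(6, 2)
    return np.sum((moments(Px) - M_emp)**2)      # squared mismatch ||M(X) - M^e||^2

X0 = np.zeros(12)                                # naive initial guess (assumption)
res = minimize(objective, X0, method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8})
X = res.x.reshape(6, 2)
```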

2.4 Adaptation of an Iterative Method


Assume that the deterministic optimization problem (1) for a fixed u can be solved by an iterative numerical method having iteration function Ψ: a sequence of values { x^(p) : p ≥ 0 } is generated, starting from an initial guess x^(0), by the iterations

x^(p+1) = Ψ( x^(p) ).

Introducing the finite approximations, we have

Px^(p+1) ≈ Ψ( Px^(p) ).

Since

Px^(p) = φ(u)·X^(p),   Px^(p+1) = φ(u)·X^(p+1),

we solve the variational equations

E( y^t Px^(p+1) ) = E( y^t Ψ( Px^(p) ) ),   ∀ y = φ(u)·Y.

Thus

E( φ(u)^t φ(u) ) X^(p+1) = E( φ(u)^t Ψ( φ(u)·X^(p) ) ).
The solution of this linear system determines X^(p+1) and, thus, Px^(p+1). A particularly useful situation concerns iterations where

Ψ(x) = x + Φ(x).

In this case, the iterations read as

X^(p+1) = X^(p) + ΔX^(p),

where

E( φ(u)^t φ(u) ) ΔX^(p) = E( φ(u)^t Φ( φ(u)·X^(p) ) ).

This approach may be used, for instance, when an implementation of a descent method
for the problem (1) is available – such as, for example, a code implementing the
projected gradient descent. Then, the code furnishes Ψ and we may adapt it to
uncertainty quantification (See Figs. 10, 11 and 12).

Fig. 10. Results for stochastic descent with an error in the distribution of u (N(1.5, 0.25))

Fig. 11. Results for Robbins-Monro iterations with an error in the distribution of u (N(1.5, 0.25)), degree 4.
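As an illustration of the adaptation idea, the sketch below adapts a plain (unprojected) gradient-descent iteration, Φ(x) = −α ∇_y F(x, u), to the Rosenbrock example. The step size, iteration count and initial coefficients are assumptions of the illustration, not values from the paper; convergence of plain descent on this ill-conditioned problem is slow.

```python
import numpy as np

rng = np.random.default_rng(3)
ns = 64
u = rng.uniform(1.0, 2.0, size=(ns, 2))

def phi(u):
    u1, u2 = u[:, 0], u[:, 1]
    return np.column_stack((np.ones_like(u1), u1, u2, u1**2, u1 * u2, u2**2))

def grad_F(y, u):
    """Gradient of F(y, u) = (1 - u1 y1)^2 + 100((u1 + u2) y2 - u1 y1)^2 w.r.t. y."""
    u1, u2 = u[:, 0], u[:, 1]
    resid = (u1 + u2) * y[:, 1] - u1 * y[:, 0]
    g1 = -2.0 * u1 * (1.0 - u1 * y[:, 0]) - 200.0 * u1 * resid
    g2 = 200.0 * (u1 + u2) * resid
    return np.column_stack((g1, g2))

P = phi(u)
G = P.T @ P / ns                           # E[phi^t phi], estimated on the sample
X = np.zeros((6, 2))                       # assumed initial coefficients
alpha = 1e-4                               # assumed step size

for p in range(20000):                     # fixed number of iterations, for brevity
    Px = P @ X
    increment = -alpha * grad_F(Px, u)     # Phi(Px) evaluated for each variate
    rhs = P.T @ increment / ns             # E[phi^t Phi(phi(u) X)]
    X = X + np.linalg.solve(G, rhs)        # X^(p+1) = X^(p) + Delta X^(p)
```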

2.5 Optimality Equations


If the solution x verifies optimality equations:

E(x) = 0,

such as, for instance, ∇F(x) = 0, then we may use methods for uncertainty quantification of algebraic equations (see, for instance, [2, 3]). This approach may be applied when q = 0 (only inequalities, no equalities). In this case, we may consider

m(t, u) = min_{y ∈ R^n} { r(y, t, u) },   r(y, t, u) = max{ F(y, u) − t, g_1(y, u), …, g_p(y, u) }.

Assume the continuity of F. Then, on the one hand,

y ∉ S(u)  ⟹  r(y, t, u) > 0;

on the other hand, for y ∈ S(u):

t > F(x, u)  ⟹  r(y, t, u) < 0,    t < F(x, u)  ⟹  r(y, t, u) > 0.

It results from these inequalities that m has a zero at the point t = F(x, u). Thus, an alternative approach consists in determining a zero t* of m (i.e., m(t*, u) = 0). Then x = arg min_{y ∈ R^n} { r(y, t*, u) }.
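A rough sketch of this procedure for a single fixed variate of u is given below. The inequality constraint g1 is purely hypothetical (it is not taken from the paper), and the bracketing interval for the root search is an assumption; with these choices the constrained minimum is known to be 0.0625, which the recovered zero t* should approach.

```python
import numpy as np
from scipy.optimize import minimize, brentq

u1, u2 = 1.5, 1.5                  # one fixed variate of u (assumption for the illustration)

def F(y):
    return (1.0 - u1 * y[0])**2 + 100.0 * ((u1 + u2) * y[1] - u1 * y[0])**2

def g1(y):
    return y[0] - 0.5              # hypothetical constraint y1 <= 0.5, only for the example

def r(y, t):
    return max(F(y) - t, g1(y))

def m(t):
    # Inner minimization of the (non-smooth) function r(., t) by Nelder-Mead
    res = minimize(lambda y: r(y, t), x0=np.zeros(2), method="Nelder-Mead",
                   options={"xatol": 1e-9, "fatol": 1e-9, "maxiter": 5000})
    return res.fun

# m(t) changes sign on [0, 10] for this instance; its zero is t* = F(x, u)
t_star = brentq(m, 0.0, 10.0, xtol=1e-8)
x_star = minimize(lambda y: r(y, t_star), x0=np.zeros(2), method="Nelder-Mead").x
print(t_star, x_star)              # expect t* close to 0.0625 and x* close to (0.5, 0.25)
```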

Fig. 12. Results furnished by adaptation of Newton's iterations of the optimality equations: (a) approximated values, (b) exact values.



3 Concluding Remarks

We have considered optimization under uncertainty modeled by random variables, which corresponds to situations where statistical models of the uncertain parameters are available. In such a situation, the distribution of the solution x may be determined by different methods, such as, for instance, collocation, moment matching, variational, adaptation or algebraic equation ones. The main difficulty in the practical implementation is the significant increase in the number of variables: when considering n unknowns, we must determine n × Nx coefficients. This is a limitation, notably for large n.
The methods have been tested in a simple but significant situation involving a non-convex objective function and numerical difficulties in the optimization itself. In this simple situation, the methods have been shown to be effective. The simplest approach is furnished by collocation, but it is more sensitive to measurement errors and may require stabilization or regularization (such as Tikhonov's). Moment matching leads to global optimization problems, but is less sensitive to measurement errors, including those on the distribution of the random variables. Variational and adaptation approaches are stable, but less efficient than collocation in situations where the data is of good quality. The adaptation of deterministic procedures has led to the best results, namely the algebraic approaches for the solution of the optimality equations. Questions for future work involve the development of methods for large values of n, the generation of samples for arbitrarily distributed u, namely for specified correlations between their components, and high-order expansions in a large number of variables.

References
1. Lopez, R.H., De Cursi, E.S., Lemosse, D.: Approximating the probability density function of
the optimal point of an optimization problem. Eng. Optim. 43(3), 281–303 (2011) https://doi.
org/10.1080/0305215x.2010.489607
2. Lopez, R.H., Miguel, L.F.F., De Cursi, E.S.: Uncertainty quantification for algebraic systems
of equations. Comput. Struct. 128, 189–202 (2013) https://doi.org/10.1016/j.compstruc.2013.
06.016
3. De Cursi, E.S., Sampaio, R.: Uncertainty quantification and stochastic modelling with matlab.
ISTE Press, London, UK (2015)
Uncertainty Quantification in
Serviceability of Impacted Steel Pipe

Renata Troian(B) , Didier Lemosse, Leila Khalij, Christophe Gautrelet,


and Eduardo Souza de Cursi

Normandie universite, LMN/INSA de ROUEN, 76000 Rouen, France


renata.troian@insa-rouen.fr
https://www.insa-rouen.fr/recherche/laboratoires/lmn

Abstract. The problem of the vulnerability of structures facing explosions has come to the front line of the scientific scene in the last decades. Structural debris usually presents a dangerous potential hazard, e.g. a domino accident. Deterministic models are not sufficient for the reliability analysis of structures impacted by debris: the uncertainty of the environmental conditions and of the material properties has to be taken into account. The proposed research is devoted to the analysis of the behavior of a pipeline under a variable impact loading. A Bernoulli beam model is used as the structural model of the pipeline for simplicity, while different formulations of the impact itself are studied to simulate the wide range of possible types of debris. The model sensitivity is studied first: the influence on the structural behavior of the input parameters, namely the impact force, duration and position, as well as the beam material, is considered. An uncertainty analysis of several impacts is then presented. The obtained insights can provide guidelines for the optimization of the structure under explosive loading, taking the uncertainties into account.

Keywords: Impact · Rigid · Soft · Sensitivity · Uncertainty

1 Introduction

To assure urban security in the case of an explosion, efforts have to be made in developing reliability analysis and design methods. Research efforts, stimulated by industrial needs, are still required to achieve this goal. Zhu et al. [21] analyzed an oil vapor explosion accident in which various causes led to the explosion, high casualties and severe damage. They mention that debris usually presents a dangerous potential hazard, e.g. a domino accident. Among the possibly affected structures, pipelines can play a major role in the domino effect. This consideration defines the object of the present research. Prediction of a debris


impact on pipes needs a detailed understanding of impact phenomena, of the structural and material response of pipes and, of course, consideration of uncertainties.
Structural design by safety factors using nominal values without considering
uncertainties may lead to designs that are either unsafe, or too conservative and
thus not efficient.
The existing literature treats various aspects of the risks connected to the
effects of explosions on structural integrity. For example, Kelliher and Sutton-
Swaby in [7] present a model for a stochastic representation of blast induced
damage in a concrete building. In the research of Hadianfard et al. [6] a steel
column under blast load is investigated using the Monte-Carlo method to take
into account the uncertainties of the blast and the material. Nevertheless, works
concerning variability, reliability or uncertainty remain rare. Regarding the spe-
cific analysis of pipes or, more generally, cylindrical structures destined to the
transportation of fluids, we may find works on the same topics, analogously
limited to deterministic situations.
When modeling the pipe's dynamic response, one can no longer use the material models developed for isotropic materials, as pipe construction materials have evolved considerably in recent years. Economic studies have shown that the development of oil and gas transportation over long distances requires the use of high-strength grade steels, because their tensile properties allow a substantial increase of the internal pressure for a given pipe thickness [15]. In order to obtain high strength, these
materials are produced using complex Thermo-Mechanical-Control-Processing
(TMCP) which introduces preferred orientations within the steel and leads to
anisotropic plastic properties in higher grades. Rupture properties may also be
anisotropic [14]. Fully understanding and describing the material behavior is
needed to produce safe and cost-effective pipelines.
However, the pipe material properties can be controlled better than the possible debris characteristics; see for instance works on statistical debris distribution [19,21]. Thus attention has to be paid to the modeling of variable impactors. Villavicencio and Soares [18] also showed that the dynamic response of a clamped steel beam struck transversely at the center by a mass is very sensitive to the way in which the supports are modelled, so the boundary conditions of a structure have to be studied as well.
Numerous experimental and numerical studies of an explosion or an impact on a cylinder have been carried out, but very few studies concern stochastic systems. Among them, the recent study of Wagner et al. [20] on a robust design criterion for axially loaded cylindrical shells can be mentioned. Alizadeh et al. [2] studied a pipe conveying fluid with stochastic structural and fluid parameters. Concerning impact problems in general, existing studies consider the structure geometry, the material and the impactor velocity. Li and Liu [8] studied elastic–plastic beam dynamics keeping the geometrical and material parameters of the beam fixed and applying uncertainty to the pressure amplitude, i.e. to an impact characteristic. Riha et al. [13] proposed a model to predict the penetration depth of a projectile in a two- and three-layer target both deterministically and probabilistically; material properties and projectile velocity are considered in the probabilistic analysis. Antoine and Batra [4] analyzed the influence of the impact speed, the laminate material and geometrical parameters on the plate response during a low velocity impact by a rigid hemispherical-nosed cylinder. Fyllingen et al. [5] conducted stochastic numerical and experimental tests of square aluminium tubes subjected to axial crushing; the considered uncertain parameters were the geometry of the tube (extrusion length and wall thickness) and the impact velocity of the impactor. Lönn et al. [9] presented an approach to robust optimization of an aluminium extrusion with quadratic cross-section subjected to axial crushing, taking into account the geometry uncertainty.

Fig. 1. Beam subjected to a single impact: (a) general view, (b) cross-section.

The present research aims to demonstrate the influence of the structure and impactor parameters on the structural response. The Young modulus is chosen to characterize the material, the position of the impact on the structure represents the structure-impactor interaction, and special attention is given to the impactor characteristics: not only its velocity and mass are considered, but also its material. For this we propose a simplified model of a pipe under impact loading suitable for stochastic simulations. The impact is introduced into the model as a pulse of sinusoidal shape. First, the sensitivity of the model with respect to the loading parameters and the pipe material is studied. Then the dynamic response of the pipe to several impacts is considered.

2 Numerical Model of a Pipe Under Variable Impactors


In this research we are interested in the response of a pipeline to an impact, taking uncertainties into account. We propose a simplified model that includes the structure and numerous random impactors and does not demand excessive computational time. The choice is to represent the impactors by a contact-force history acting on the pipe.

2.1 Pipe Modeling


For the pipe simulation an elastic Bernoulli beam finite-element model was developed in Matlab with a Newmark time integration scheme.

The perfectly clamped hollow cylindrical steel beam is shown in Fig. 1 (a, b). Its characteristics are the length L = 1 m, the diameter d = 0.1 m, the thickness r = 0.02 m, the Young modulus E = 2.158e11 Pa, the density ρ = 7966 kg/m3 and the yield stress σy = 2.5e8 Pa.
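The paper's finite-element code is in Matlab and is not reproduced here; as a rough sketch of the time-stepping idea only, a constant-average-acceleration Newmark update for a generic linear system M a + C v + K u = f(t) could look as follows (matrices, load and initial conditions are placeholders, not the beam model of the paper).

```python
import numpy as np

def newmark(M, C, K, f, u0, v0, dt, nsteps, beta=0.25, gamma=0.5):
    """Newmark-beta integration of M a + C v + K u = f(t) (average acceleration by default)."""
    u, v = u0.copy(), v0.copy()
    a = np.linalg.solve(M, f(0.0) - C @ v - K @ u)          # initial acceleration
    # Effective stiffness matrix (constant for a linear system and fixed time step)
    Keff = K + gamma / (beta * dt) * C + 1.0 / (beta * dt**2) * M
    history = [u.copy()]
    for k in range(1, nsteps + 1):
        t = k * dt
        feff = (f(t)
                + M @ (u / (beta * dt**2) + v / (beta * dt) + (0.5 / beta - 1.0) * a)
                + C @ (gamma / (beta * dt) * u + (gamma / beta - 1.0) * v
                       + dt * (gamma / (2.0 * beta) - 1.0) * a))
        u_new = np.linalg.solve(Keff, feff)                 # displacement at t_{k}
        a_new = (u_new - u) / (beta * dt**2) - v / (beta * dt) - (0.5 / beta - 1.0) * a
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, v, a = u_new, v_new, a_new
        history.append(u.copy())
    return np.array(history)
```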

2.2 Limit State Criterion


Depending on the failure cause, different limit states are associated with the structural elements of the linear pipeline parts [17]. Parameters such as the depletion of strength under a force impact and the depletion of pipe material plasticity can be taken into account for the risk assessment of impacted pipelines. In the present study the elastic limit is chosen as the serviceability limit state for the analyses. Although too conservative for the design of most pipelines, owing to the capacities of the elastic–plastic range, it can be reasonable for dangerous sites where a domino effect is highly possible. According to these considerations, only stresses in the elastic domain are calculated, and stresses with σ > σy are considered as a failure.

2.3 Impactor Modeling

The interest of the work is to study the response of a structure to a stochastic


impact loading, including the impact of solid and soft debris, as both types of the
debris can be produced during an accident and can affect the structure’s integrity.
The characteristics of the loading force history, such as shape, force and duration, are thus crucial. For a realistic estimation of the contact force of a solid debris impact, the paper of Perera et al. [11] can be considered. The model presented in that paper enables the value of the peak contact force generated by the impact of a piece of debris to be predicted. Results of calculations employing the derived relationships have been verified by comparison with experimental results across a wide range of impact scenarios. The observed contact load has a sinusoidal shape, which will be considered in the present article.
Following [12], when the load shape is defined by its effective load (amplitude) h and duration l, the corresponding sinusoidal impulse shape, which will provide the same structural response according to the pulse approximation method, is given by

load(t) = (π − 2) h sin(2ωt),   for 0 < t < π/(2ω),   with ω = (π − 2)/l.
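A small sketch of this pulse, written as a force-history function that could be fed to a beam model, is given below; the formula is used exactly as written above, and the amplitude and duration values are only examples taken from the ranges of Table 1.

```python
import numpy as np

def sinusoidal_pulse(h, l):
    """Equivalent sinusoidal impulse of effective amplitude h (N) and duration l (s)."""
    omega = (np.pi - 2.0) / l                  # as defined in the text above
    t_end = np.pi / (2.0 * omega)              # pulse is applied for 0 < t < pi/(2*omega)
    def load(t):
        t = np.asarray(t, dtype=float)
        return np.where((t > 0.0) & (t < t_end),
                        (np.pi - 2.0) * h * np.sin(2.0 * omega * t), 0.0)
    return load

# Example: amplitude 1.0e5 N, duration 0.002 s
force = sinusoidal_pulse(1.0e5, 0.002)
t = np.linspace(0.0, 0.004, 9)
print(force(t))
```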
The duration time is the main difference between rigid and soft contacts, as formulated in [3]. A comprehensive overview of soft impact models is given by Abrate [1] for three types of soft projectiles: liquid, bird and hailstone. Thus the rigid and soft impact impulses are introduced into the model by considering impulses of different time durations.
The developed numerical model of a beam under impact is valid only for elastic material behavior, so the impactor characteristics have to be within ranges that induce stresses in the beam not exceeding the yield stress σy = 2.5e8 Pa. A parametric study was conducted with varying amplitude and duration. The obtained maximum stresses σx for each pair of parameters (amplitude, duration) = (h, l) are presented in Fig. 2. Only stress values in the elastic domain are marked with color. When the impact position is p = 0.1 m, the normal stresses are smaller than for p = 0.5 m, as expected. The input parameters of the system chosen to keep it in the elastic domain follow the values in Fig. 2 (colored stresses) and are given in Table 1.

Fig. 2. Maximum stress values: (a) impact applied at mid-span, p = 0.5 m; (b) impact applied at p = 0.1 m. Stress values in the elastic domain are given by a colorbar; the white area corresponds to the plasticity domain.

Table 1. Numerical values of parameters used in the simulation

Input parameters        | Val. min | Val. max
Impact amplitude h (N)  | 0        | 1.5e5
Impact duration l (s)   | 0.0001   | 0.004
Impact position p (m)   | 0.05     | 0.5

3 Sensitivity Analysis of Impactor and Pipe Characteristics

The study is concentrated on the stochastic nature of the impact. The charac-
teristics of the impact that will be studied are the impact force, duration and
position. The variability of the structure material will be considered through
variation of the Young modulus E.
The parameters are supposed to be independent and to have a uniform distribution within the limits given in Table 1 for the impulse characteristics. Concerning the Young modulus, the material of the pipe is assumed to be known, but it can vary slightly due to manufacturing or aging. To take this into account, E has a uniform distribution in the range [1.95e11; 2.2e11] Pa. A sample of size N = 1400 is obtained with Latin hypercube sampling (LHS) [10].
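A possible way to generate such a design in Python, assuming a recent scipy with the quasi-Monte Carlo module, is sketched below; the bounds reproduce Table 1 and the Young modulus range quoted above, while the seed is arbitrary.

```python
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=4, seed=0)          # variables: h, l, p, E
unit_sample = sampler.random(n=1400)               # N = 1400 points in [0, 1)^4
lower = [0.0,   0.0001, 0.05, 1.95e11]             # Val. min for h (N), l (s), p (m), E (Pa)
upper = [1.5e5, 0.004,  0.5,  2.20e11]             # Val. max
sample = qmc.scale(unit_sample, lower, upper)      # uniform LHS design in physical units
```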
The output parameters of the model that are influenced by the impact char-
acteristics and are important for the evaluation of the structure integrity are
the maximum beam deflection wmax and maximum stresses σmax together with
deflection wp and stresses σp at the impact position.
While there are many methods available for analyzing the decomposition of variance as a sensitivity measure, the method of Sobol [16] is one of the most established and widely used; it is capable of computing the Total Sensitivity Indices (TSI), which measure the main effects of a given parameter and all the interactions (of any order) involving that parameter. Sobol's method uses the decomposition of variance to calculate the sensitivity indices. The basis of the method is the decomposition of the model output function y = f(x) into summands of variance using combinations of input parameters of increasing dimensionality.
The first-order index Si represents the share of the output variance that is explained by the considered parameter alone. The most important parameters therefore have a high index, but a low one does not mean the parameter has no influence, as it can be involved in interactions. The total index SItot is a measure of the share of the variance that is removed from the total variance when the considered parameter is fixed to its reference value. Therefore parameters with a low SItot can be considered as non-influential.
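As an illustration only, first-order and total Sobol indices for a generic model can be estimated with the SALib package roughly as follows, assuming SALib is available; the model function here is a crude placeholder, not the beam code of the paper.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 4,
    "names": ["h", "l", "p", "E"],
    "bounds": [[0.0, 1.5e5], [0.0001, 0.004], [0.05, 0.5], [1.95e11, 2.20e11]],
}

def model(x):
    # Placeholder response; in the study this would be the maximum stress or deflection
    h, l, p, E = x
    return h * l * (0.5 + p) / E

X = saltelli.sample(problem, 1024)          # Saltelli design for Sobol estimation
Y = np.array([model(row) for row in X])
Si = sobol.analyze(problem, Y)
print(Si["S1"])                             # first-order indices
print(Si["ST"])                             # total-order indices
```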

Fig. 3. Indices using Sobol's method for the maximum beam deflection wmax, maximum stresses σmax, deflection wp and stresses σp at the impact position: (a) first order effects, (b) total order effects.

3.1 Obtained Results


First order and total effects of the stresses and beam deflection due to changes in the loading characteristics were calculated. Numerical values are presented in Fig. 3. It can be seen that the variation of the Young modulus does not play a major role in the obtained values. On the contrary, the position of the impact relative to the boundary conditions and the impact amplitude, as well as the impact duration, strongly influence the beam response.

4 Uncertainty Analysis of a Model


This section is devoted to the uncertainty analysis of the structural response of
an impacted beam. The realistic situation that can lead to the Domino effect is
modeled. The possible values of impactor properties are unknown and the aim
is to compute the structural response. The uniform distribution of the impulse
characteristics is used.

4.1 Stresses Variability Under Multiple Impacts


Situations of a single impact, two impacts with different intervals between them, and three impacts with variable timing were considered. The impulse characteristics, such as impact amplitude, duration and position (see Table 1), are considered as random variables with uniform distribution, sampled by Latin hypercube sampling (LHS). The sample size is N = 1000 for a single impact, N = 2000 for two impacts and N = 3000 for three. The obtained stress distributions are presented in Figs. 4 and 5. The numbers of tests in which the stresses exceeded the plastic limit σy are marked in red.
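Given a vector of simulated maximum stresses, the exceedance fraction and the empirical CDF shown in Fig. 6 can be obtained with a few lines; the stress values used below are synthetic placeholders, not the results of the paper.

```python
import numpy as np

sigma_y = 2.5e8                                               # yield stress (Pa), as in Sect. 2.1
rng = np.random.default_rng(0)
sigma_max = rng.lognormal(mean=18.5, sigma=0.6, size=1000)    # placeholder stress sample (Pa)

p_fail = np.mean(sigma_max > sigma_y)                         # fraction of tests beyond the elastic limit
print(f"P(sigma > sigma_y) = {p_fail:.3f}")

# Empirical CDF of the stresses
s_sorted = np.sort(sigma_max)
cdf = np.arange(1, len(s_sorted) + 1) / len(s_sorted)
```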

Fig. 4. Probability of possible stresses under impact: (a) single impact, (b) three impacts with an interval of 0.001 s.

Figure 6 represents the Cumulative Distribution Functions (CDF) of the stresses for one, two and three safe impactors. The same tendency can be noted: the more impactors fall on the pipe, the higher the damage probability.
Fig. 5. Probability of possible stresses under two impacts with different intervals: (a) 0.001 s, (b) 0.0005 s.

Fig. 6. The cumulative distribution functions (CDFs) of the stress distributions. (1) corresponds to the case of one impact, (2.1) to two impacts with an interval of 0.001 s, (2.2) to two impacts with an interval of 0.0005 s and (3) to three impacts.

According to Figs. 4, 5 and 6, when the interval between two impacts is 0.001 s, more than 5% of the impacts can provoke plastic deformations. If the impacts occur almost simultaneously, with an interval of 0.0005 s, this number increases up to 8%, and in the case of three impacts it becomes almost 20%. Thus, even if the impactor characteristics do not provoke plastic deformation in the case of a single impact, two or more impactors of the same size and velocity can damage the impacted structure.

5 Conclusion
The paper proposes a stochastic analysis of the structural response under a random impact. A steel pipeline is simulated with a Bernoulli beam model and the impactors are introduced into the system by impulses of rectangular or sinusoidal shape. The present research shows the need to consider not only big/heavy impactors or impactors with high velocity: relatively small and slow impactors can cause plastic deformations and lead to rupture of a pipe and a subsequent domino effect. The proposed analysis is conducted on a simplified model. Nevertheless, the conclusions on the parameter sensitivity give insights into the problem modeling. It was shown that the impactor properties and the impact position are more important than the variation of the structure material for the structural dynamic response. Moreover, the proposed approach, in which the impactor is introduced into the system by its time-force history, can save time for more complex numerical models. Further studies with detailed 3D modeling are ongoing to detect rupture modes for different kinds of impactors.

Acknowledgment. This research is a part of a project AMED, that has been funded
with the support from the European Union with the European Regional Development
Fund (ERDF) and from the Regional Council of Normandie.

References
1. Abrate, S.: Soft impacts on aerospace structures. Prog. Aerosp. Sci. 81, 1–17 (2016)
2. Alizadeh, A.A., Mirdamadi, H.R., Pishevar, A.: Reliability analysis of pipe convey-
ing fluid with stochastic structural and fluid parameters. Eng. Struct. 122, 24–32
(2016)
3. Andreaus, U., Casini, P.: Dynamics of sdof oscillators with hysteretic motion-
limiting stop. Nonlinear Dyn. 22(2), 145–164 (2000)
4. Antoine, G., Batra, R.: Sensitivity analysis of low-velocity impact response of lam-
inated plates. Int. J. Impact Eng. 78, 64–80 (2015)
5. Fyllingen, Ø., Hopperstad, O., Langseth, M.: Stochastic simulations of square alu-
minium tubes subjected to axial loading. Int. J. Impact Eng. 34(10), 1619–1636
(2007)
6. Hadianfard, M.A., Malekpour, S., Momeni, M.: Reliability analysis of h-section
steel columns under blast loading. Struct. Saf. 75, 45–56 (2018)
7. Kelliher, D., Sutton-Swaby, K.: Stochastic representation of blast load damage in
a reinforced concrete building. Struct. Saf. 34(1), 407–417 (2012)
8. Li, Q., Liu, Y.: Uncertain dynamic response of a deterministic elastic-plastic beam.
Int. J. Impact Eng. 28(6), 643–651 (2003)
9. Lönn, D., Fyllingen, Ø., Nilssona, L.: An approach to robust optimization of impact
problems using random samples and meta-modelling. Int. J. Impact Eng. 37(6),
723–734 (2010)
10. McKay, M.D., Beckman, R.J., Conover, W.J.: A comparison of three methods for
selecting values of input variables in the analysis of output from a computer code.
Technometrics 42(1), 55–61 (2000)

11. Perera, S., Lam, N., Pathirana, M., Zhang, L., Ruan, D., Gad, E.: Deterministic
solutions for contact force generated by impact of windborne debris. Int. J. Impact
Eng. 91, 126–141 (2016)
12. Ren, Y., Qiu, X., Yu, T.: The sensitivity analysis of a geometrically unstable struc-
ture under various pulse loading. Int. J. Impact Eng. 70, 62–72 (2014)
13. Riha, D., Thacker, B., Pleming, J., Walker, J., Mullin, S., Weiss, C., Rodriguez, E.,
Leslie, P.: Verification and validation for a penetration model using a deterministic
and probabilistic design tool. Int. J. Impact Eng. 33(1–12), 681–690 (2006)
14. Shinohara, Y., Madi, Y., Besson, J.: A combined phenomenological model for the
representation of anisotropic hardening behavior in high strength steel line pipes.
Eur. J. Mech.A Solids 29(6), 917–927 (2010)
15. Shinohara, Y., Madi, Y., Besson, J.: Anisotropic ductile failure of a high-strength
line pipe steel. Int. J. Fract. 197(2), 127–145 (2016)
16. Sobol, I.M.: Global sensitivity indices for nonlinear mathematical models and their
monte carlo estimates. Math. Comput. Simul. 55(1), 271–280 (2001)
17. Timashev, S., Bushinskaya, A.: Methods of assessing integrity of pipeline systems
with different types of defects. In: Diagnostics and Reliability of Pipeline Systems,
pp. 9–43. Springer (2016)
18. Villavicencio, R., Soares, C.G.: Numerical modelling of the boundary conditions
on beams stuck transversely by a mass. Int. J. Impact Eng. 38(5), 384–396 (2011)
19. Van der Voort, M., Weerheijm, J.: A statistical description of explosion produced
debris dispersion. Int. J. Impact Eng. 59, 29–37 (2013)
20. Wagner, H., Hühne, C., Niemann, S., Khakimova, R.: Robust design criterion
for axially loaded cylindrical shells-simulation and validation. Thin-Walled Struct.
115, 154–162 (2017)
21. Zhu, Y., Qian, X.m., Liu, Z.y., Huang, P., Yuan, M.q.: Analysis and assessment of
the qingdao crude oil vapor explosion accident: lessons learnt. J. Loss Prev. Process
Ind. 33, 289–303 (2015)
Multiobjective Programming
A Global Optimization Algorithm for the
Solution of Tri-Level Mixed-Integer
Quadratic Programming Problems

Styliani Avraamidou and Efstratios N. Pistikopoulos(B)

Texas A&M Energy Institute, Texas A&M University,


College Station, TX 77843, USA
{styliana,stratos}@tamu.edu

Abstract. A novel algorithm for the global solution of a class of tri-level


mixed-integer quadratic optimization problems containing both integer
and continuous variables at all three optimization levels is presented.
The class of problems we consider assumes that the quadratic terms
in the objective function of the second level optimization problem do
not contain any third level variables. To our knowledge, no other solu-
tion algorithm can tackle the class of problems considered in this work.
Based on multi-parametric theory and our earlier results for tri-level lin-
ear programming problems, the main idea of the presented algorithm is
to recast the lower levels of the tri-level optimization problem as multi-
parametric programming problems, in which the optimization variables
(continuous and integer) of all the upper level problems, are considered
as parameters at the lower levels. The resulting parametric solutions are
then substituted into the corresponding higher-level problems sequen-
tially. Computational studies are presented to asses the efficiency and
performance of the presented algorithm.

Keywords: Tri-level optimization · Multi-parametric programming ·


Mixed-integer optimization

1 Introduction
Optimization problems that involve three decision makers at three different deci-
sion levels are referred to as tri-level optimization problems. The first decision
maker, also referred to as the leader, solves an optimization problem which
includes in its constraint set another optimization problem solved by a second decision maker, which is in turn constrained by a third optimization problem solved by the third decision maker.
A tri-level problem formulation can be applied to many different applications
in different fields including operations research, process engineering, and man-
agement. Moreover, tri-level problems can involve both discrete and continuous
Supported by Texas A&M Energy Institute, RAPID SYNOPSIS Project (DE-
EE0007888-09-03) and National Science Foundation grant [1739977].

decision variables, as they have been used to formulate supply chain manage-
ment problems [23], safety and defense [1,6,24] or robust optimization [7,13,14]
problems. Mixed-integer tri-level problems have the general form of (1), where
x is a vector of continuous variables, and y is a vector of discrete variables.
min_{x1,y1}  F1(x, y)
s.t.  G1(x, y) ≤ 0
      min_{x2,y2}  F2(x, y)
      s.t.  G2(x, y) ≤ 0                                                          (1)
            min_{x3,y3}  F3(x, y)
            s.t.  G3(x, y) ≤ 0
x = [x1^T x2^T x3^T]^T,  y = [y1^T y2^T y3^T]^T
x ∈ R^n,  y ∈ Z^p
This manuscript is organized as follows. The following sub-section presents pre-
vious work on solution algorithms for tri-level problems, Sect. 2 presents the
class of problems considered and presents the proposed algorithm. In Sect. 3
computational studies are presented and Sect. 4 concludes this manuscript.

1.1 Previous Work


Tri-level programming problems are very challenging to solve even when consid-
ering continuous linear problems [5]. Therefore, solution approaches presented in
the literature for tri-level problems are sparse, addressed a very restricted class
of problems, and most do not guarantee global optimality.
Table 1 summarizes key solution methods for mixed-integer multi-level prob-
lems. It is worth noting here that all the presented approaches are only applicable
to linear tri-level problems.
Table 2 presents an indicative list of solution methods for non-linear multi-
level problems. None of the approaches presented here is able to tackle integer
variables at any decision level. It is therefore evident that general strategies for
the solution of mixed-integer non-linear tri-level problems are still lacking.

Table 1. Indicative list of previous work on mixed-integer multi-level optimization problems with three or more optimization levels.

Class                | Algorithm                     | Reference | Note
Integer linear       | Tabu search                   | [20]      | Sub-optimal solutions
                     | Genetic algorithms            | [21]      |
Mixed-integer linear | Decomposition algorithm       | [24]      | Exact and global, applicable only for min-max-min problems
                     | Multi-parametric programming  | [4]       | Exact and global

Table 2. Indicative list of previous work on non-linear multi-level optimization problems with three or more optimization levels.

Class                 | Algorithm                           | Reference | Note
Continuous quadratic  | Multi-parametric programming        | [9]       | Exact and global
Continuous non-linear | Particle swarm optimization         | [10]      | Sub-optimal solutions
                      | Evolutionary algorithm              | [22]      | Sub-optimal solutions
                      | Multi-parametric programming (B&B)  | [11]      | Approximate global optimum

2 Tri-Level Mixed-Integer Quadratic Optimization Algorithm

In this work, we consider the tri-level problem presented as (2). The problem contains linear constraints and quadratic objective functions in all three optimization levels. The quadratic term in the second level problem's objective function does not contain any third level variables.

min_{x1,y1}  ω^T Q1 ω + c1^T ω + cc1
s.t.  A1 x + E1 y ≤ b1
      min_{x2,y2}  [x1^T y1^T x2^T y2^T] Q2 [x1^T y1^T x2^T y2^T]^T + c2^T ω + cc2
      s.t.  A2 x + E2 y ≤ b2                                                      (2)
            min_{x3,y3}  ω^T Q3 ω + c3^T ω + cc3
            s.t.  A3 x + E3 y ≤ b3
x ∈ R^n,  y ∈ {0, 1}^m
x = [x1^T x2^T x3^T]^T,  y = [y1^T y2^T y3^T]^T,  ω = [x1^T y1^T x2^T y2^T x3^T y3^T]^T

where ω is the vector of all decision variables of all decision levels, xi are the continuous bounded decision variables and yi the binary decision variables of optimization level i, Qi ⪰ 0, ci and cci are constant coefficient matrices in the objective function of optimization level i, Ai, Ei are constant coefficient matrices multiplying the decision variables of level i in the constraint set, and bi is a constant vector.
Faisca et al. [9] presented an algorithm for the solution of continuous tri-level
programming problems using multi-parametric programming [15]. Avraamidou
and Pistikopoulos [4] expanded on that approach and presented an algorithm for
the solution of mixed-integer linear tri-level problems. The approach presented
here is an extension to these algorithms and tackles the more general mixed-
integer quadratic tri-level problem.
The proposed algorithm will be introduced through the general form of the
tri-level mixed-integer quadratic programming problem (2) and then illustrated
through a numerical example in Subsect. 2.1.

The first step of the proposed algorithm is to recast the third level optimization problem as a multi-parametric mixed-integer quadratic programming problem, in which the optimization variables of the second and first level problems are considered as parameters (3).

min_{x3,y3}  ω^T Q3 ω + c3^T ω + cc3
s.t.  A3 x + E3 y ≤ b3                                                            (3)
      x^L ≤ x ≤ x^U

Problem (3) is then solved using the multi-parametric mixed-integer quadratic (mp-MIQP) algorithms in the POP toolbox [16], allowing the solution to contain envelopes of solutions [8,17].

Remark 1. Envelopes of solutions contain multiple candidate parametric solutions and require a comparison procedure for the determination of the optimal solution. Envelopes of solutions can appear when solving mp-MIQP problems as
a result of multiple parametric solutions for different realizations of the binary
variables. POP toolbox follows a comparison procedure to get the exact and
optimal critical regions in the space of the parameters and discard sub-optimal
solutions in envelopes. This comparison procedure is not performed at this step
of the presented tri-level algorithm, as a comparison procedure at the end of the
algorithm is more computationally efficient.

The solution of problem (3) results in the parametric solution (4), which consists of the complete profile of optimal solutions of the third level variables, x3 and y3, as explicit functions of the decision variables of optimization levels one and two (x1, y1, x2, y2):

        ⎧ ξ1 = p1 + q1 [x1^T y1^T x2^T y2^T]^T   if H1 [x1^T y1^T x2^T y2^T]^T ≤ h1, y3 = r1
x3 =    ⎨ ξ2 = p2 + q2 [x1^T y1^T x2^T y2^T]^T   if H2 [x1^T y1^T x2^T y2^T]^T ≤ h2, y3 = r2        (4)
        ⎪ ...
        ⎩ ξk = pk + qk [x1^T y1^T x2^T y2^T]^T   if Hk [x1^T y1^T x2^T y2^T]^T ≤ hk, y3 = rk

where ξi is the affine function of the third level continuous variables in terms of the first and second level decision variables, Hi [x1^T y1^T x2^T y2^T]^T ≤ hi, y3 = ri is referred to as critical region i, CRi, and k denotes the number of computed critical regions.
The next step is to recast the second level optimization problem into k mp-
MIQP problems, by considering the optimization variables of the first level prob-
lem, x1 , y1 , as parameters and substituting in the corresponding functions ξi of
x3 and y3 . Also, the corresponding critical region definitions are added to the
existing set of second level constraints, as an additional set of constraints for
each problem.
The k formulated problems are solved using POP toolbox, providing the
complete profile of optimal solutions of the second level problem (for an optimal

third level problem), as explicit functions of the decision variables of the first
level problem, x1 and y1 .
The computed parametric solution is in turn used to formulate single-level
reformulations of the upper level problem by substituting the derived critical
region definitions and affine functions of the variables in the leader problem,
forming a single level mixed-integer quadratic programming (MIQP) problem for
each critical region. The single-level MIQP problems are solved with appropriate algorithms (CPLEX® if convex, and either BARON® [19] or ANTIGONE® [12] if not convex).
The final step of the algorithm is a comparison procedure to select the global
optimum solution. This is done by solving the mixed-integer linear problem (5).

z* = min_{α,γ}  α
s.t.  α = Σ_{i,j} γ_{i,j} z_{i,j}
      Σ_{i,j} γ_{i,j} = 1                                                         (5)
      γ_{i,j} u_{i,j} ≤ γ_{i,j} u_{p,q}    ∀ i, j, p ≠ i, q
      γ_{i,j} v_i ≤ γ_{i,j} v_r            ∀ i, j, r ≠ i
      γ_{i,j} ∈ {0, 1}

where z* is the exact global optimum of problem (2), γ_{i,j} are binary variables corresponding to each CR_{i,j}, z_{i,j} are the objective function values obtained when solving the problems in Step 6, u_{i,j} are the objective function values obtained when solving the problems in Step 4, and v_i are the objective function values obtained when solving the problems in Step 2.
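Since problem (5) only selects a single critical region, small instances can also be handled by direct enumeration. A hedged Python sketch of such a comparison step is given below; the dictionaries of objective values are placeholders for the quantities computed in the earlier steps, and the toy data reuse the two regions reported later in Table 5.

```python
def compare(z, u, v):
    """Select the critical region CR_{i,j} solving the comparison problem (5) by enumeration.

    z[(i, j)]: first level objective values (Step 6),
    u[(i, j)]: second level objective values (Step 4),
    v[i]:      third level objective values (Step 2).
    """
    best, best_key = None, None
    for (i, j), z_ij in z.items():
        # Feasibility of gamma_{i,j} = 1: u_{i,j} minimal among the other regions' values
        # and v_i minimal among the other third level values (constraints of (5)).
        ok_u = all(u[(i, j)] <= u[(p, q)] for (p, q) in u if p != i)
        ok_v = all(v[i] <= v[r] for r in v if r != i)
        if ok_u and ok_v and (best is None or z_ij < best):
            best, best_key = z_ij, (i, j)
    return best_key, best

# Toy usage with the two regions reported in Table 5 (values copied from the example)
z = {(1, 1): 35.6995, (6, 1): -9.3512}
u = {(1, 1): 30.4913, (6, 1): 2.7583}
v = {1: 0.0, 6: -43.0470}
print(compare(z, u, v))   # expected to select CR_{6,1}
```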

2.1 Numerical Example

Consider the following tri-level mixed-integer quadratic problem (6).

min_{x1,y1}  z = 5x1^2 + 6x2^2 + 3y1 + 3y2 − 3x3
    min_{x2,y2}  u = 4x1^2 + 6y1 − 2x2 + 10y2 − x3 + 5y3
        min_{x3,y3}  v = 4x3^2 + y3^2 + 5y4^2 + x2 y3 + x2 y4 − 10x3 − 15y3 − 16y4
        s.t.  6.4x1 + 7.2x2 + 2.5x3 ≤ 11.5
              −8x1 − 4.9x2 − 3.2x3 ≤ 5
              3.3x1 + 4.1x2 + 0.02x3 + 0.2y1 + 0.8y2 + 4y3 + 4.5y4 ≤ 1
              y1 + y2 + y3 + y4 ≥ 1
              −10 ≤ x1, x2 ≤ 10
              x1, x2, x3 ∈ R,  y1, y2, y3, y4 ∈ {0, 1}                            (6)
Step 1: The third level problem is reformulated as a mp-MIQP problem (7),
in which all decision variables of the first and second level problems (x1 , y1 , x2 , y2 )
are considered as parameters.

min_{x3,y3}  v = 4x3^2 + y3^2 + 5y4^2 + x2 y3 + x2 y4 − 10x3 − 15y3 − 16y4
s.t.  6.4x1 + 7.2x2 + 2.5x3 ≤ 11.5
      −8x1 − 4.9x2 − 3.2x3 ≤ 5
      3.3x1 + 4.1x2 + 0.02x3 + 0.2y1 + 0.8y2 + 4y3 + 4.5y4 ≤ 1                    (7)
      y1 + y2 + y3 + y4 ≥ 1
      −10 ≤ x1, x2 ≤ 10
      x1, x2, x3 ∈ R,  y1, y2, y3, y4 ∈ {0, 1}

Step 2: Problem (7) is solved using the mp-MIQP solver in the POP® toolbox [16]. The multi-parametric solution of problem (7) consists of 14 critical regions. A subset of them is presented in Table 3.

Table 3. Numerical example: a sub-set of the multi-parametric solution of the third level problem

CR1
  Definition: 0.6236x1 + 0.7759x2 + 0.0960y2 ≤ 0.1743;  −0.6644x1 − 0.7474x2 ≤ −0.2206;  −y1 − y2 ≤ −1;  x1 ≤ 10;  y1, y2 ∈ {0, 1}
  3rd level objective: v1 = 13.1072x1^2 + 16.5888x2^2 + 29.4912x1x2 − 8.7040x1 − 9.7920x2 − 26.6800
  3rd level variables: x3 = −2.56x1 − 2.88x2 + 4.6;  y3 = 0;  y4 = 0

CR6
  Definition: 0.62119x1 + 0.7778x2 + 0.0956y2 ≤ −0.5674;  −0.6242x1 − 0.7755x2 − 0.0946y2 ≤ 0.5816;  y1 + y2 ≤ 1;  x1 ≤ 10;  y1, y2 ∈ {0, 1}
  3rd level objective: v6 = 54450x1^2 + 84050x2^2 + 1250y2^2 + 16.5888x2^2 + 135300x1x2 + 16500x1y2 + 20500x2y2 + 101475x1 + 126076x2 + 15375y2 + 15375
  3rd level variables: x3 = −165x1 − 205x2 − 25y2 − 150;  y3 = 1;  y4 = 0

CR14
  Definition: 0.6212x1 + 0.7778x2 + 0.0956y2 ≤ 0.1743;  0.8528x1 + 0.5223x2 ≤ −0.2206;  0.0444x1 + 0.9990x2 ≤ −0.2206;  y1 + y2 ≤ 1;  −x1 ≥ 10;  −x2 ≥ 10;  y1, y2 ∈ {0, 1}
  3rd level objective: v14 = 12.5x1^2 + 4.6895x2^2 + 15.3125x1x2 + 53.1250x1 + 32.5391x2 − 9.7920x2 + 0.3203
  3rd level variables: x3 = −2.5x1 − 1.5312x2 − 1.5625;  y3 = 1;  y4 = 1

Step 3: The multi-parametric solution of problem (7), partially presented in


Table 3, is used to formulate 14 mp-MIQP second level problems. The critical
region definitions are added to the reformulated second level problems as a new

set of constraints. The affine functions of the third level variables, x3 , are sub-
stituted in the problems, along with the value of the binary third level variables.
Finally, decision variables of the first level problem are considered as parame-
ters. The first mp-MIQP formulated corresponds to CR1 and is presented as (8).
Similar problems are formulated for the rest of the critical regions.

min_{x2,y2}  u1,1 = 4x1^2 + 4y2^2 + 6y1 − 2x2 + 6y2 − (2.56x1 − 2.88x2 + 4.6) + 5(0)
s.t.  0.6236x1 + 0.7759x2 + 0.0960y2 ≤ 0.1743
      −0.6644x1 − 0.7474x2 ≤ −0.2206
      −y1 − y2 ≤ −1
      x1 ≤ 10
      y1, y2 ∈ {0, 1}                                                             (8)
Step 4: All the problems formulated in Step 3 are solved using the mp-MIQP solvers in the POP® toolbox [16]. The resulting solutions consist of a total of 22 critical regions. The critical regions corresponding to CR1 and CR6 are presented in Table 4.

Table 4. Numerical example: a sub-set of the multi-parametric solution of the second level problems

CR1,1
  Definition: 2.2792 ≤ x1 ≤ 10;  y1 ∈ {0, 1}
  2nd level objective: u1,1 = 4x1^2 + 1.7778x1 + 6y1 + 3.6597
  2nd level variables: x2 = −0.8889x1 + 0.2951;  y2 = 1

CR6,1
  Definition: −3.2852 ≤ x1 ≤ 10;  y1 ∈ {0, 1}
  2nd level objective: u6,1 = 4x1^2 + 1.6098x1 + 6y1 + 2.75
  2nd level variables: x2 = −0.8049x1 − 0.75;  y2 = 0

Step 5: The parametric solutions for the second level problem obtained in
Step 4 are used to formulate 22 single level deterministic MIQP problems, each
corresponding to a critical region of the second level problem. Each critical region
definition is added to the first level problem as a new set of constraints and the
affine functions of the second and third level decision variables are substituted in
the objective, resulting into MIQP problems that involve only first level variables
x1 , y1 . The MIQPs formulated from CR1,1 and CR6,1 are presented below as (9)
and (10) respectively.
min_{x1,y1}  z1,1 = 5x1^2 + 6(−0.8889x1 + 0.2951)^2 + 3y1 + 3(1)
             − 3(−2.56x1 − 2.88(−0.8889x1 + 0.2951) + 4.6)                        (9)
s.t.  2.2792 ≤ x1 ≤ 10
      y1 ∈ {0, 1}
min_{x1,y1}  z6,1 = 5x1^2 + 6(−0.8049x1 − 0.75)^2 + 3y1 + 3(0)
             − 3(−165x1 − 205(−0.8049x1 − 0.75) − 25)                             (10)
s.t.  −3.2852 ≤ x1 ≤ 10
      y1 ∈ {0, 1}
Step 6: The 22 single level MIQP problems formulated in Step 5 are solved using the CPLEX® mixed-integer quadratic programming solver. The resulting solutions from problems (9) and (10) are presented in Table 5.

Table 5. Numerical example: a sub-set of the first level problem solutions

CR1,1
  Objectives: z1,1 = 35.6995,  u1,1 = 30.4913,  v = 0
  Decision variables: x1 = 2.2792, y1 = 0, x2 = −1.7308, y2 = 1, x3 = 3.7500, y3 = 0, y4 = 0

CR6,1
  Objectives: z6,1 = −9.3512,  u6,1 = 2.7583,  v = −43.0470
  Decision variables: x1 = −0.4076, y1 = 0, x2 = −0.4220, y2 = 0, x3 = 3.7500, y3 = 1, y4 = 0

Step 7: The comparison optimization problem (5) is then solved using the information in Tables 3, 4 and 5. The exact global optimum lies in CR6,1, with optimal decisions x1 = −0.4076, y1 = 0, x2 = −0.4220, y2 = 0, x3 = 3.7500, y3 = 1 and y4 = 0.
The computational performance of the algorithm for this numerical example
is presented in Table 6.

3 Computational Studies
A small set of tri-level mixed-integer quadratic problems of different sizes was
solved to investigate the capabilities of the proposed algorithm. The randomly
generated problems have the general mathematical form of (2) and all variables
appear in all three optimization levels. Table 6 presents the studied problems,
where XT denotes the total number of continuous variables of the tri-level prob-
lem, YT denotes the total number of binary variables of the tri-level problem,
X1 , X2 and X3 denote the number of continuous decision variables of the first,
second and third optimization level respectively, Y1 , Y2 and Y3 denote the num-
ber of binary decision variables of the first, second and third optimization level
respectively, C denotes the total number of constraints in the first, second and
third optimization level, L1, L2, and L3 denote the time required to solve each
optimization level, Com denotes the time required to solve the comparison prob-
lem and CPU denotes the total computational time for each test problem in
seconds.
The computations were carried out on a 2-core machine with an Intel Core
i7 at 3.1 GHz and 16 GB of RAM, MATLAB R2016a, and IBM ILOG CPLEX
Optimization Studio 12.6.3. The test problems presented in Table 6 can be found
in parametric.tamu.edu website as ‘BPOP TMIQP’.

Table 6. Computational results of the presented algorithm for tri-level MIQP problems
of the general form (2)

Problem XT YT X1 X2 X3 Y1 Y2 Y3 C L3(s) L2(s) L1(s) Com(s) CPU(s)


(6) 3 4 1 1 1 1 1 2 4 4.4278 1.7434 0.1291 0.0125 6.3128
TMIQP1 13 7 3 5 5 2 2 3 4 2.3567 0.5560 0.0042 0.0361 2.9530
TMIQP2 15 6 5 5 5 2 2 2 5 1.2253 0.1884 0.0021 0.0004 1.4161
TMIQP3 16 5 1 1 14 1 1 3 6 0.7164 0.8937 0.0018 0.0004 1.6123
TMIQP4 18 7 1 2 15 1 1 5 3 6.3971 1.9648 0.0081 0.0026 8.3726

4 Conclusions

An algorithm for the solution of tri-level mixed-integer quadratic problems is


introduced, as an extension to the global solution algorithms recently developed
for different classes of mixed-integer multi-level problems using multi-parametric
programming [2–4,18]. The problem under consideration involves both integer
and continuous variables at all optimization levels and has linear constraints and
quadratic objective functions, with the quadratic terms in the objective function
of the second level problem not containing third level variables.
This is the only algorithm, to our knowledge, that can handle mixed-integer
non-linear tri-level problems. The algorithm has been implemented in a MAT-
LAB based toolbox, and its performance and efficiency were assessed through a
set of randomly generated test problems. The limiting step of the proposed algo-
rithm was shown to be the solution of the multi-parametric problem in Step 2.
Future work will involve the use of the presented algorithm to solve a robust
optimization problem case study. The presented procedure will also be expanded
for the solution of more general classes of mixed-integer tri-level problems.

References
1. Alguacil, N., Delgadillo, A., Arroyo, J.: A trilevel programming approach for elec-
tric grid defense planning. Comput. Oper. Res. 41(1), 282–290 (2014)
2. Avraamidou, S., Pistikopoulos, E.N.: B-POP: Bi-level parametric optimization
toolbox. Comput. Chem. Eng. 122, 193–202 (2018)
3. Avraamidou, S., Pistikopoulos, E.N.: A Multi-Parametric optimization approach
for bilevel mixed-integer linear and quadratic programming problems. Comput.
Chem. Eng. 122, 98–113 (2019)
4. Avraamidou, S., Pistikopoulos, E.N.: Multi-parametric global optimization app-
roach for tri-level mixed-integer linear optimization problems. J. Global Optim.
(2018)
5. Blair, C.: The computational complexity of multi-level linear programs. Ann. Oper.
Res. 34(1), 13–19 (1992)
6. Brown, G., Carlyle, M., Salmerón, J., Wood, K.: Defending critical infrastructure.
Interfaces 36(6), 530–544 (2006)

7. Chen, B., Wang, J., Wang, L., He, Y., Wang, Z.: Robust optimization for trans-
mission expansion planning: minimax cost vs. minimax regret. IEEE Trans. Power
Syst. 29(6), 3069–3077 (2014)
8. Dua, V., Bozinis, N., Pistikopoulos, E.: A multiparametric programming approach
for mixed-integer quadratic engineering problems. Comput. Chem. Eng. 26(4–5),
715–733 (2002)
9. Faisca, N.P., Saraiva, P.M., Rustem, B., Pistikopoulos, E.N.: A multi-parametric
programming approach for multilevel hierarchical and decentralised optimisation
problems. Comput. Manag. Sci. 6, 377–397 (2009)
10. Han, J., Zhang, G., Hu, Y., Lu, J.: A solution to bi/tri-level programming problems
using particle swarm optimization. Inf. Sci. 370–371, 519–537 (2016)
11. Kassa, A., Kassa, S.: A branch-and-bound multi-parametric programming app-
roach for non-convex multilevel optimization with polyhedral constraints. J. Global
Optim. 64(4), 745–764 (2016)
12. Misener, R., Floudas, C.: Antigone: algorithms for continuous/integer global opti-
mization of nonlinear equations. J. Global Optim. 59(2–3), 503–526 (2014)
13. Moreira, A., Street, A., Arroyo, J.: An adjustable robust optimization approach
for contingency-constrained transmission expansion planning. IEEE Trans. Power
Syst. 30(4), 2013–2022 (2015)
14. Ning, C., You, F.: Data-driven adaptive nested robust optimization: general mod-
eling framework and efficient computational algorithm for decision making under
uncertainty. AIChE J. 63, 3790–3817 (2017)
15. Oberdieck, R., Diangelakis, N., Nascu, I., Papathanasiou, M., Sun, M., Avraami-
dou, S., Pistikopoulos, E.: On multi-parametric programming and its applications
in process systems engineering. Chem. Eng. Res. Design 116, 61–82 (2016)
16. Oberdieck, R., Diangelakis, N., Papathanasiou, M., Nascu, I., Pistikopoulos, E.:
Pop-parametric optimization toolbox. Ind. Eng. Chem. Res. 55(33), 8979–8991
(2016)
17. Oberdieck, R., Pistikopoulos, E.: Explicit hybrid model-predictive control: the
exact solution. Automatica 58, 152–159 (2015)
18. Oberdieck, R., Diangelakis, N.A., Avraamidou, S., Pistikopoulos, E.N.: On
unbounded and binary parameters in multi-parametric programming: Applications
to mixed-integer bilevel optimization and duality theory. J. Glob. Optim. 69(3),
587–606 (2017)
19. Sahinidis, N.: BARON 17.8.9: Global Optimization of Mixed-Integer Nonlinear
Programs, User’s Manual
20. Sakawa, M., Matsui, T.: Interactive fuzzy stochastic multi-level 0–1 programming
using tabu search and probability maximization. Expert Syst. Appl. 41(6), 2957–
2963 (2014)
21. Sakawa, M., Nishizaki, I., Hitaka, M.: Interactive fuzzy programming for multi-
level 0–1 programming problems through genetic algorithms. Eur. J. Oper. Res.
114(3), 580–588 (1999)
22. Woldemariam, A., Kassa, S.: Systematic evolutionary algorithm for general mul-
tilevel Stackelberg problems with bounded decision variables (SEAMSP). Ann.
Oper. Res. (2015)
23. Xu, X., Meng, Z., Shen, R.: A tri-level programming model based on conditional
value-at-risk for three-stage supply chain management. Comput. Ind. Eng. 66(2),
470–475 (2013)
24. Yao, Y., Edmunds, T., Papageorgiou, D., Alvarez, R.: Trilevel optimization in
power network defense. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 37(4),
712–718 (2007)
A Method for Solving Some Class of
Multilevel Multi-leader Multi-follower
Programming Problems

Addis Belete Zewde1 and Semu Mitiku Kassa1,2(B)


1
Department of Mathematics, Addis Ababa University,
P.O.Box 1176, Addis Ababa, Ethiopia
addisbele@gmail.com
2
Department of Mathematics and Statistical Sciences, Botswana International
University of Science and Technology, P/Bag 16, Palapye, Botswana
kassas@biust.ac.bw

Abstract. Multiple leaders with multiple followers games serve as an


important model in game theory with many applications in economics,
engineering, operations research and other fields. In this paper, we have
reformulated a multilevel multi-leader multiple follower (MLMLMF) pro-
gramming problem into an equivalent multilevel single-leader multiple fol-
lower (MLSLMF) programming problem by introducing a suppositional
(or dummy) leader. If the resulting MLSLMF programming problem consists of separable terms and parameterized common terms across all the followers, then the problem is transformed into an equivalent multilevel program having a single leader and a single follower at each level of the hierarchy. The proposed solution approach can solve multilevel multi-
leader multi-follower problems whose objective values in both levels have
common, but having different positive weights of, nonseparable terms.

Keywords: Multilevel multi-leader multi-follower programming ·


Multilevel programming · Multi-parametric programming ·
Branch-and-bound

1 Introduction
Multi-leader-follower games are a class of hierarchical games in which a collection
of leaders compete in a Nash game constrained by the equilibrium conditions of
another Nash game amongst the followers. Generally, in a game, when several
players take the position as leaders and the rest of players take the position
as followers, it becomes a multi-leader-follower game. The leader-follower Nash
equilibrium, a solution concept for the multi-leader-follower game, can be defined
as a set of leaders’ and followers’ strategies such that no player (leader or follower)
can improve his status by changing his own current strategy unilaterally.
The early study associated with the multi-leader-follower game and the equilibrium problem with equilibrium constraints (EPEC) dates back to 1984, to

Sherali [14], where a multi-leader-follower game was called a multiple Stackelberg


model. While multi-leader generalizations were touched upon by Okuguchi [11],
Sherali [14] presented amongst the first models for multi-leader-follower games in
a Cournot regime. Sherali [14] established existence of an equilibrium by assum-
ing that each leader can exactly anticipate the aggregate follower reaction curve.
He also showed the uniqueness of equilibrium for a special case where all leaders
share an identical cost function and make identical decisions. As Ehrenmann
[2,3] pointed out, the assumption that all leaders make identical decisions is
essential for ensuring the uniqueness result. In addition, Su [16] considered a
forward market equilibrium model that extended the existence result of Sher-
ali [14] under some weaker assumptions. Pang and Fukushima [13] considered
a class of remedial models for the multi-leader-follower game that can be for-
mulated as a generalized Nash equilibrium problem (GNEP) with convexified
strategy sets. Based on the strong stationarity conditions of each leader in a
multi-leader-follower game, Leyffer and Munson [10] derived a family of non-
linear complementarity problem, nonlinear program, and MPEC formulations of
the multi-leader multi-follower games. In [15], Su proposed a sequential nonlinear
complementarity problem (NCP) approach for solving EPECs.
There are several instances of EPECs for which equilibria have been shown to
exist, but there are also fairly simple EPECs which admit no equilibria as shown
in [12]. Definitive statements on the existence of equilibria have been obtained
mainly for two level multi-leader-follower games with specific structure. In the
majority of these settings, the uniqueness of the follower-level equilibrium is
assumed to construct an implicit form (such as problems with convex strat-
egy sets) which allows for the application of standard fixed-point theorems of
Brouwer and Kakutani [1,4]. Indeed when the feasible region of the EPEC is con-
vex and compact, the multi-leader multi-follower game can be thought of as a
conventional Nash game or a generalized Nash game and the existence of a global
equilibrium follows from classical results. But the equilibrium constraint in an
EPEC is known for being nonconvex and for lacking the continuity properties
required to apply fixed point theory. Consequently, most standard approaches fail
to apply to EPECs and there currently exists no general mathematical paradigm
that could be built upon to make a theory for general EPECs.
In [8], Kulkarni identified subclasses of the non-shared constraint multi-leader
multi-follower games for which the existence of equilibria can be guaranteed and
showed that when the leader-level problem admits a potential function, the set
of global minimizers of the potential function over the shared constraint are the
equilibria of the multi-leader multi-follower game; in effect, this reduces to a
question of the existence of an equilibrium to that of the existence of a solution
to an MPEC. The above result is extended later in [9] to work for quasi-potential
functions.
In [17], Sun has reformulated the generalized Nash equilibrium problem
into an equivalent bilevel programming problem with one leader and multiple
followers. In particular, if the followers' problems are separable, then it has been
shown that the generalized Nash equilibrium problem is equivalent to a bilevel
programming problem having a single decision maker at each level.

In [7], Kassa and Kassa have reformulated the class of multilevel programs
with a single leader and multiple followers, which consist of separable terms and
parameterized common terms across all the followers, into equivalent multilevel
programs having a single follower at each level. The resulting (non-convex)
multilevel problem is then solved by a solution strategy, called the branch-and-bound
multi-parametric programming approach, which they developed in [6].
In most of the literature reviewed above, the existence of equilibria has been
established mainly for multi-leader-follower games with specific structure (such as
the bilevel case or the single-leader case) and with constraint sets and/or objective
functions assumed to have nice properties (such as linearity, convexity, differentiability,
or separability). In this paper we consider an equivalent reformulation of
a multilevel multi-leader multi-follower programming problem into a multilevel
single-leader multi-follower programming problem with one more level of hierarchy.
Then, for some special classes of problems, the reformulated problem is
transformed into an equivalent multilevel program having only a single follower
at each level of the hierarchy, thereby yielding a solution approach for
multilevel multi-leader-follower games.

2 Problem Formulation

Multilevel programs involving multiple decision makers at each level of the
hierarchy are called multilevel multi-leader multi-follower (MLMLMF) programming
problems.
A general k-level multi-leader multi-follower programming problem involving
N leaders and multiple followers at each level can be described by:

  min_{y_1^n ∈ Y_1^n}  F_1^n(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}),   n ∈ {1, ..., N}
  s.t.  G_1^n(y_1^n, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0
        H_1(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0
        min_{y_2^i ∈ Y_2^i}  f_2^i(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}),   i ∈ {1, ..., I}
        s.t.  g_2^i(y_1^n, y_1^{-n}, y_2^i, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0
              h_2(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0
              min_{y_3^j ∈ Y_3^j}  f_3^j(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}),   j ∈ {1, ..., J}          (1)
              s.t.  g_3^j(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, ..., y_k^l, y_k^{-l}) ≤ 0
                    h_3(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0
                    ...
                    min_{y_k^l ∈ Y_k^l}  f_k^l(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}),   l ∈ {1, ..., L}
                    s.t.  g_k^l(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l) ≤ 0
                          h_k(y_1^n, y_1^{-n}, y_2^i, y_2^{-i}, y_3^j, y_3^{-j}, ..., y_k^l, y_k^{-l}) ≤ 0

where y_1^n ∈ Y_1^n is the decision vector of the nth leader's optimization problem
and y_1^{-n} is the vector of the decision variables of all leaders except the nth leader,
i.e., y_1^{-n} = (y_1^1, ..., y_1^{n-1}, y_1^{n+1}, ..., y_1^N), for n = 1, 2, ..., N. The shared
constraint H_1 is the leaders' common constraint set, whereas the constraint G_1^n
applies only to the nth leader. Similarly, y_m^c ∈ Y_m^c is the decision vector of the
cth follower at level m, and y_m^{-c} = (y_m^1, ..., y_m^{c-1}, y_m^{c+1}, ...) collects the
decision variables of all the other followers at that level, where c = i, j, ..., l and
m ∈ {2, 3, ..., k}. The shared constraint h_m is the mth-level followers' common
constraint set, whereas the constraint g_m^c applies only to the cth follower of the
mth-level optimization problem.

3 Equivalent Formulation

In this section, we consider the equivalent reformulation of multilevel programs
with multiple leaders and multiple followers into multilevel programs having a
single leader and multiple followers. For the sake of clarity in the presentation,
the methodology is described using a bilevel program with multiple leaders and
multiple followers; however, it can be extended to the general k-level case.
Consider a bilevel multi-leader multi-follower (BLMLMF) programming
problem involving N leaders in the upper-level problem and M followers in the
lower-level problem, defined as:

  min_{x^i ∈ X^i}  F_i(x^i, x^{-i}, y^j, y^{-j})
  s.t.  G_i(x^i, y^j, y^{-j}) ≤ 0
        H(x^i, x^{-i}, y^j, y^{-j}) ≤ 0
        min_{y^j ∈ Y^j}  f_j(x^i, x^{-i}, y^j, y^{-j})                               (2)
        s.t.  g_j(x^i, x^{-i}, y^j) ≤ 0
              h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0

Let us assume that F_i, G_i, H, h, f_j, g_j, for i = 1, 2, ..., N and j = 1, 2, ..., M,
are twice continuously differentiable functions and that the followers' constraint
functions satisfy the Guignard constraint qualification, and let us define some
relevant sets related to problem (2) as follows:

(i) The feasible set of problem (2) is given by:

    A = {(x^i, x^{-i}, y^j, y^{-j}) : g_j(x^i, x^{-i}, y^j) ≤ 0, h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0,
         G_i(x^i, y^j, y^{-j}) ≤ 0, H(x^i, x^{-i}, y^j, y^{-j}) ≤ 0, i = 1, ..., N, j = 1, ..., M}.

(ii) The feasible set for the jth follower (for any leaders' strategy x = (x^i, x^{-i})) can be defined as

    A_j(x^i, x^{-i}, y^{-j}) = {y^j ∈ Y^j : g_j(x^i, x^{-i}, y^j) ≤ 0, h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0}.

(iii) The Nash rational reaction set for the jth follower is defined by the set of parametric solutions,

    B_j(x^i, x^{-i}, y^{-j}) = {ȳ^j ∈ Y^j : ȳ^j ∈ argmin{f_j(x^i, x^{-i}, y^j, y^{-j}) :
                               y^j ∈ A_j(x^i, x^{-i}, y^{-j})}},  j = 1, ..., M.

(iv) The feasible set for the ith leader is defined as

    A_i(x^{-i}) = {(x^i, y^j, y^{-j}) ∈ X^i × Y^j × Y^{-j} : G_i(x^i, y^j, y^{-j}) ≤ 0, H(x^i, x^{-i}, y^j, y^{-j}) ≤ 0,
                   g_j(x^i, x^{-i}, y^j) ≤ 0, h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0, y^j ∈ B_j(x^i, x^{-i}, y^{-j}), j = 1, ..., M}.

(v) The Nash rational reaction set for the ith leader is defined as

    B_i(x^{-i}) = {(x^i, y^j, y^{-j}) ∈ X^i × Y^j × Y^{-j} : x^i ∈ argmin{F_i(x^i, x^{-i}, y^j, y^{-j}) :
                   (x^i, y^j, y^{-j}) ∈ A_i(x^{-i})}},  i = 1, ..., N.

(vi) The set of Nash equilibrium points (optimal solutions) of problem (2) is given by

    S = {(x^i, x^{-i}, y^j, y^{-j}) : (x^i, x^{-i}, y^j, y^{-j}) ∈ A, (x^i, y^j, y^{-j}) ∈ B_i(x^{-i}), i = 1, ..., N}.

Equivalent TLSLMF Form for BLMLMF

We now formulate an equivalent trilevel single-leader multi-follower (TLSLMF)
programming problem for (2) and show their equivalence. Let us add an upper-level
decision maker, a suppositional (or dummy) leader, to problem (2) with the
corresponding decision variable z, where z = (x, y) = (x^1, x^2, ..., x^N, y^1, y^2, ..., y^M),
and objective function 0. Then the multiple leaders in the upper-level problem of (2)
become the middle-level followers and the multiple followers in the lower-level
problem of (2) become the bottom-level followers (the second level of followers),
and we obtain the following TLSLMF program:
  min_z  0
  s.t.  z = (x, y),
        min_{x^i}  F_i(x^i, x^{-i}, y^j, y^{-j})
        s.t.  G_i(x^i, y^j, y^{-j}) ≤ 0,  i = 1, ..., N
              H(x^i, x^{-i}, y^j, y^{-j}) ≤ 0                                        (3)
              min_{y^j}  f_j(x^i, x^{-i}, y^j, y^{-j})
              s.t.  g_j(x^i, x^{-i}, y^j) ≤ 0,  j = 1, ..., M
                    h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0.
Let us assume that each objective function is convex with respect to its own
decision variables for the second- and third-level followers, and that the Guignard
constraint qualification holds for the followers' constraints. Moreover, related to
problem (3) we shall denote

(i) the feasible set for the third level followers problem by Ω3 (xi , x−i , y −j );
(ii) the rational reaction set for the third level followers problem by
Ψ3 (xi , x−i , y −j );
(iii) the feasible set for the second level problem by Ω2 (x−i );
(iv) the rational reaction set for the second level followers problem by Ψ2 (x−i );
(v) the feasible set of problem (3) is given by:

    Φ = {(z, x^i, x^{-i}, y^j, y^{-j}) : z = (x, y), g_j(x^i, x^{-i}, y^j) ≤ 0, h(x^i, x^{-i}, y^j, y^{-j}) ≤ 0,
         G_i(x^i, y^j, y^{-j}) ≤ 0, H(x^i, x^{-i}, y^j, y^{-j}) ≤ 0, i = 1, ..., N, j = 1, ..., M};

(vi) and the inducible region of problem (3) is given by:

    IR = {(z, x^i, x^{-i}, y^j, y^{-j}) : (z, x^i, x^{-i}, y^j, y^{-j}) ∈ Φ, (x^i, y^j, y^{-j}) ∈ Ψ_2(x^{-i})}.

With these notations and definitions, problem (3) could be rewritten as:

    min_z  0
    s.t.  (z, x^i, x^{-i}, y^j, y^{-j}) ∈ IR                                         (4)

Since every feasible point of (4) is an optimal point, the optimal set of (4) is given by

    S* = IR = {(z, x^i, x^{-i}, y^j, y^{-j}) : (z, x^i, x^{-i}, y^j, y^{-j}) ∈ Φ, (x^i, y^j, y^{-j}) ∈ Ψ_2(x^{-i})}.

Having established the relation between the BLMLMF problem (2) and the
TLSLMF problem (3), we now describe their equivalence with the following result.

Theorem 3.1. A point (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) is an optimal solution to (2) if and
only if (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) is an optimal solution to (4).

Proof. Suppose that (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) is an optimal solution to (2),
i.e., (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ S, which implies that (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ A
and (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ B_i(x^{*,-i}), i = 1, ..., N. This implies

    (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ Ψ_2(x^{*,-i}), g_j(x^{*,i}, x^{*,-i}, y^{*,j}) ≤ 0, h(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0,
    G_i(x^{*,i}, y^{*,j}, y^{*,-j}) ≤ 0, H(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0, i = 1, ..., N, j = 1, ..., M.

Then for any point (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) with z^* = (x^*, y^*) and
(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ S, we have

    (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ Ψ_2(x^{*,-i}), z^* = (x^*, y^*), g_j(x^{*,i}, x^{*,-i}, y^{*,j}) ≤ 0, h(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0,
    G_i(x^{*,i}, y^{*,j}, y^{*,-j}) ≤ 0, H(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0, i = 1, ..., N, j = 1, ..., M.

This implies that (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ Φ and (x^{*,i}, y^{*,j}, y^{*,-j}) ∈
Ψ_2(x^{*,-i}). Therefore (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ IR = S^* and hence
(z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) is an optimal solution to (4).

Conversely, suppose that (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) is an optimal solution to
(4), i.e., (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ S^*. Then we have (z^*, x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ Φ
and (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ Ψ_2(x^{*,-i}). This implies the following:

    (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ B_i(x^{*,-i}), g_j(x^{*,i}, x^{*,-i}, y^{*,j}) ≤ 0, h(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0,
    G_i(x^{*,i}, y^{*,j}, y^{*,-j}) ≤ 0, H(x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ≤ 0, i = 1, ..., N, j = 1, ..., M.

This implies that (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ A and (x^{*,i}, y^{*,j}, y^{*,-j}) ∈ B_i(x^{*,-i}), i =
1, ..., N. Therefore (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j}) ∈ S and hence (x^{*,i}, x^{*,-i}, y^{*,j}, y^{*,-j})
is an optimal solution to (2). □
Remark 1. The idea described above can be extended to any finite k-level multi-leader
multi-follower programming problem. By adding an upper-level decision maker,
problem (1) can be equivalently reformulated as a (k+1)-level MLSLMF programming
problem. As a result, the leaders in the upper-level problem of (1) become followers at
the second level, and the followers at the mth level of (1) become followers at the
(m+1)th level, where m ∈ {2, ..., k}.

4 Solution Approach For Special Problems


In this section we suggest an appropriate solution method for multilevel
programs with multiple leaders and multiple followers at each decision level, and
we introduce an algorithmic approach for solving some classes of such problems.
The basic steps of the proposed algorithm are as follows:
(1) Reformulate the given multilevel program with multiple leaders and multiple
    followers into an equivalent multilevel program with a single leader and multiple
    followers, as discussed in Sect. 3.
(2) If the resulting problem in step (1) has the property that, at all levels in the
    hierarchy, each follower's objective function consists of separable terms and
    parameterized common terms across all followers of the same level, then it can
    be reformulated into an equivalent multilevel program having a single follower
    at each level of the hierarchy, as discussed in Ref. [7].
(3) To solve the resulting problem in step (2), we can apply the following approaches:
    (i) Multi-parametric programming approach for the convex case: for
    multilevel programming problems having convex quadratic objective functions
    and affine constraints at each decision level, we apply the multi-parametric
    programming (MPP) approach suggested in [5] for multilevel hierarchical
    optimization problems. The approach starts by rewriting the innermost
    optimization problem as a multi-parametric programming problem, in which the
    upper-level optimization variables are considered as parameters. The resulting
    problem can be solved globally and the solution can be substituted into the
    nearest upper-level optimization problem.

    (ii) Branch-and-bound and MPP approach for the non-convex case:
    when a multilevel programming problem contains a special non-convexity
    structure in the objectives at each decision level and the constraints at each
    level are polyhedral, we apply the branch-and-bound multi-parametric
    programming approach proposed in [6]. The approach starts by convexifying the
    inner-level problems to underestimate them by convex functions, while the
    variables from upper-level problems are considered as parameters. Then the
    resulting convex parametric under-estimator problem is solved using the
    multi-parametric programming approach.

5 Example
Consider the following bilevel multi-leader multi-follower programming problem:

  min_{x1}  F1(x1, x2, y1, y2) = (1/2)x1 − y1
  min_{x2}  F2(x1, x2, y1, y2) = −(1/2)x2 − y2
  s.t.  0 ≤ x1, x2 ≤ 1
        min_{y1}  f1(x1, x2, y1, y2) = y1(−1 + x1 + x2) + (1/2)y1²                   (5)
        min_{y2}  f2(x1, x2, y1, y2) = y2(−1 + x1 + x2) + (1/2)y2²
        s.t.  y1 ≥ 0, y2 ≥ 0

An equivalent tri-level single-leader multi-follower problem for (5) is given by:

  min_z  0
  s.t.  z = (x, y)
        min_{x1}  F1(x1, x2, y1, y2) = (1/2)x1 − y1
        min_{x2}  F2(x1, x2, y1, y2) = −(1/2)x2 − y2
        s.t.  0 ≤ x1, x2 ≤ 1                                                          (6)
              min_{y1}  f1(x1, x2, y1, y2) = y1(−1 + x1 + x2) + (1/2)y1²
              min_{y2}  f2(x1, x2, y1, y2) = y2(−1 + x1 + x2) + (1/2)y2²
              s.t.  y1 ≥ 0, y2 ≥ 0

Then (6) is transformed into the following tri-level programming problem with a single
follower at each level:

  min_z  0
  s.t.  z = (x, y)
        min_{x1,x2}  F(x1, x2, y1, y2) = (1/2)x1 − (1/2)x2 − y1 − y2
        s.t.  0 ≤ x1, x2 ≤ 1                                                          (7)
              min_{y1,y2}  f(x1, x2, y) = (1/2)y1² + (1/2)y2² + y1(−1 + x1 + x2) + y2(−1 + x1 + x2)
              s.t.  y1 ≥ 0, y2 ≥ 0
Then the third-level problem in (7) can be considered as an MPP problem with parameter
x = (x1, x2):

  min_{y1,y2}  f(x1, x2, y) = (1/2)y1² + (1/2)y2² + y1(−1 + x1 + x2) + y2(−1 + x1 + x2)          (8)
  s.t.  0 ≤ x1, x2 ≤ 1,  y1 ≥ 0,  y2 ≥ 0

The Lagrangian of the problem is L(x, y) = (1/2)y1² + (1/2)y2² + y1(−1 + x1 + x2) +
y2(−1 + x1 + x2), and the KKT points are characterized by

  y1 ∂L/∂y1 = y1(y1 − 1 + x1 + x2) = 0,   ∂L/∂y1 = y1 − 1 + x1 + x2 ≥ 0,   y1 ≥ 0,
  y2 ∂L/∂y2 = y2(y2 − 1 + x1 + x2) = 0,   ∂L/∂y2 = y2 − 1 + x1 + x2 ≥ 0,   y2 ≥ 0.
Therefore, the parametric solutions with the corresponding critical regions are given by
(Fig. 1):

Fig. 1. Critical regions for the second level problem of (7)

  CR1:  y*(x) = (1 − x1 − x2, 1 − x1 − x2)   for  x1 + x2 ≤ 1,  0 ≤ x1, x2 ≤ 1,
  CR2:  y*(x) = (0, 0)                        for  x1 + x2 ≥ 1,  0 ≤ x1, x2 ≤ 1,
which can be incorporated into the second-level followers' problem of (7); after
solving the resulting problems in each critical region we have the following solutions:

– In CR1, the optimal solution is (x1, x2, y1, y2) = (0, 0, 1, 1) with the corresponding
  second-level follower problem objective value F = −2.
– In CR2, the optimal solution is (x1, x2, y1, y2) = (0, 1, 0, 0) with the corresponding
  second-level follower problem objective value F = 0.

Since the objective value obtained in CR1 is better, we take (x1, x2, y1, y2) =
(0, 0, 1, 1) as an optimal solution of the second-level followers' problem of (7). Therefore,
the optimal solution of the bilevel multi-leader multi-follower programming problem (5)
is (x1, x2, y1, y2) = (0, 0, 1, 1) with the corresponding objective values F1 = −1, F2 =
−1, f1 = −0.5 and f2 = −0.5.
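
As a quick numerical cross-check of this example (a minimal sketch, not part of the authors' procedure), the closed-form follower response y1 = y2 = max(0, 1 − x1 − x2) derived from the KKT analysis above can be substituted into the combined second-level objective and minimized over a grid of (x1, x2); all names below are illustrative only.

```python
import numpy as np

# Follower best response obtained from the KKT analysis of problem (8):
# y1 = y2 = max(0, 1 - x1 - x2) for 0 <= x1, x2 <= 1.
def follower_response(x1, x2):
    y = max(0.0, 1.0 - x1 - x2)
    return y, y

# Combined second-level objective of problem (7).
def F(x1, x2, y1, y2):
    return 0.5 * x1 - 0.5 * x2 - y1 - y2

# Brute-force grid search over the leaders' box [0, 1]^2.
grid = np.linspace(0.0, 1.0, 101)
best = None
for x1 in grid:
    for x2 in grid:
        y1, y2 = follower_response(x1, x2)
        val = F(x1, x2, y1, y2)
        if best is None or val < best[0]:
            best = (val, x1, x2, y1, y2)

print(best)  # recovers F = -2 at (x1, x2, y1, y2) = (0, 0, 1, 1), as reported above
```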

6 Conclusion
In a multilevel multi-leader-follower programming problem, the various relationships
among the multiple leaders at the upper level and the multiple followers at the lower
levels generate different decision processes. To support decisions in such problems, this
work reformulated a class of multilevel multi-leader multi-follower programming problems,
consisting of separable terms and parameterized common terms across all objective
functions of the followers and leaders, into a multilevel single-leader multi-follower
programming problem. The reformulated problem is then transformed into an equivalent
multilevel program having only a single follower at each level of the hierarchy.
Finally, this single-leader hierarchical problem is solved using the solution procedures
proposed in [5, 7]. The proposed solution approach can solve multilevel multi-leader
multi-follower problems whose objectives at all levels share common nonseparable terms
with different positive weights, and whose constraints at each level are polyhedral.
However, much more research is needed in order to provide algorithmic tools that solve
such problems effectively, and we feel this deserves further investigation.

References
1. Başar, T., Olsder, G.: Dynamic Noncooperative Game Theory. Classics in Applied
Mathematics. SIAM, Philadelphia (1999)
2. Ehrenmann, A.: Equilibrium problems with equilibrium constraints and their appli-
cations in electricity markets. Dissertation, Judge Institute of Management, Cam-
bridge University, Cambridge, UK (2004)
3. Ehrenmann, A.: Manifolds of multi-leader Cournot equilibria. Oper. Res. Lett. 32,
121–125 (2004)
4. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Com-
plementarity Problems. Springer Series in Operations Research, vol. I, 1st edn.
Springer, New York (2003)
5. Faı́sca, N.P., Saraiva, M.P., Rustem, B., Pistikopoulos, N.E.: A multi-parametric
programming approach for multilevel hierarchical and decentralised optimisation
problems. Comput. Manag. Sci. 6, 377–397 (2009)
6. Kassa, A.M., Kassa, S.M.: A branch-and-bound multi-parametric programming
approach for general non-convex multilevel optimization with polyhedral con-
straints. J. Glob. Optim. 64(4), 745–764 (2016)

7. Kassa, A.M., Kassa, S.M.: Deterministic solution approach for some classes of
nonlinear multilevel programs with multiple follower. J. Glob. Optim. 68(4), 729–
747 (2017)
8. Kulkarni, A.A.: Generalized Nash games with shared constraints: existence, effi-
ciency, refinement and equilibrium constraints. Ph.D. dissertation, Graduate Col-
lege of the University of Illinois, Urbana, Illinois (2010)
9. Kulkarni, A.A., Shanbhag, U.V.: An existence result for hierarchical Stackelberg
v/s Stackelberg games. IEEE Trans. Autom. Control 60(12), 3379–3384 (2015)
10. Leyffer, S., Munson, T.: Solving multi-leader-common-follower games. Optim.
Methods Softw. 25(4), 601–623 (2010)
11. Okuguchi, K.: Expectations and stability in oligopoly models. In: Lecture Notes in
Economics and Mathematical Systems, vol. 138. Springer, Berlin (1976)
12. Pang, J.S., Fukushima, M.: Quasi-variational inequalities, generalized Nash equi-
libria, and multi-leader-follower games. Comput. Manag. Sci. 2(1), 21–56 (2005)
13. Pang, J.S., Fukushima, M.: Quasi-variational inequalities, generalized Nash equi-
libria, and multi-leader-follower games. Comput. Manag. Sci. 6, 373–375 (2009)
14. Sherali, H.D.: A multiple leader Stackelberg model and analysis. Oper. Res. 32(2),
390–404 (1984)
15. Su, C.L.: A sequential NCP algorithm for solving equilibrium problems with equi-
librium constraints. Technical report, Department of Management Science and
Engineering, Stanford University (2004)
16. Su, C.L.: Analysis on the forward market equilibrium model. Oper. Res. Lett.
35(1), 74–82 (2007)
17. Sun, L.: Equivalent bilevel programming form for the generalized Nash equilibrium
problem. J. Math. Res. 2(1), 8–13 (2010)
A Mixture Design of Experiments
Approach for Genetic Algorithm Tuning
Applied to Multi-objective Optimization

Taynara Incerti de Paula1(B), Guilherme Ferreira Gomes2, José Henrique de
Freitas Gomes1, and Anderson Paulo de Paiva1

1 Institute of Industrial Engineering, Federal University of Itajubá, Itajubá, Brazil
taynaraincerti@gmail.com
2 Mechanical Engineering Institute, Federal University of Itajubá, Itajubá, Brazil

Abstract. This study applies mixture design of experiments combined


with process variables in order to assess the effect of the genetic algorithm
parameters in the solution of a multi-objective problem with weighted
objective functions. The proposed method allows defining which com-
bination of parameters and weights should be assigned to the objective
functions in order to achieve target results. A case study of a flux-cored
arc welding process is presented. Four responses were optimized by using
the global criterion method and three genetic algorithm parameters were
analyzed. The method proved to be efficient, allowing the detection of sig-
nificant interactions between the algorithm parameters and the weights
for the objective functions and also the analysis of the parameters effect
on the problem solution. The procedure also proved to be efficient for the
definition of the optimal weights and parameters for the optimization of
the welding process.

Keywords: Genetic algorithm tuning · Mixture design of experiments · Multi-objective optimization · Global criterion method

1 Introduction

Several multi-objective optimization techniques perform the scalarization of dif-


ferent responses by multiplying weighting factors to each response in order to
prioritize the most important ones, e.g., weighted sums, global criterion method,
normal boundary intersection. A common practice among researchers is to define
the multi-objective problem using one of these methods and then apply a meta-
heuristic as the search technique, in order to find the optimal solution.
When dealing with genetic algorithms (GA) as the search technique, there is
an obstacle that is the tuning of several parameters responsible for the genetic
operations and there is no consensus in the literature on how to tune them. An

Supported by PDSE-CAPES/Process No 88881.132477/2016-01.



inadequate setup of these parameters can affect the performance of the algo-
rithm, leading to unsatisfactory solutions. In order to bypass the configuration
issues of the GA, many studies have proposed methods for the optimization of
these parameters, which include the use of adaptive techniques, meta-heuristics
or yet the use of design of experiments.
This study addresses not only the optimization of GA parameters, but also
the optimization of the weights applied to the objective functions of a MOP and
the interactions which may exist between the weights and the parameters of the
algorithm used to solve it. Therefore, this work proposes an experimental pro-
cedure that applies the design of experiments methodology, through a mixture
design with process variables, in order to evaluate of the influence of the genetic
algorithm parameters on the results of a multi-objective optimization problem
using weighted responses. By using this procedure, it is also possible to deter-
mine both the optimal weights to be used in the GCM function and the optimal
parameters to be used for tuning the GA.
To demonstrate the applicability of the proposed method, a case study of a
flux-cored arc welding (FCAW) process is used. Four input parameters are used
to configure the FCAW process and its optimization includes four responses that
describe the weld bead geometry.

2 Theoretical Fundamentals
2.1 Global Criterion Method
Several methods for optimization of multiple objectives can be found in the
literature [1]. For scalarization methods, the strategy is to combine individual
objective functions into a single function, which becomes the global objective of
the problem. In the global criterion method (GCM), the optimum solution x∗
is found by minimizing a pre-selected global criterion, F (x) [2]. In this study,
the global criterion adopted is based on the normalization of the objectives, so
they will have the same magnitude. The GCM equation is then defined as in
Eq. 1, where fi (x∗ ) is the optimal value for the individual optimization (utopia
point) of each response and fi (xmax ) is the most distant value from fi (x∗ ) (nadir
point).
  Min F(x) = Σ_{i=1}^{p} w_i [ (f_i(x*) − f_i(x)) / (f_i(x_max) − f_i(x*)) ]²            (1)
  s.t.: g_j(x) ≤ 0,  j = 1, 2, ..., m
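
As an illustration of Eq. 1 (a minimal sketch, not code from the paper; the response functions, utopia/nadir values and weights below are placeholders), the scalarized global criterion can be assembled and handed to any single-objective solver:

```python
import numpy as np
from scipy.optimize import minimize

def global_criterion(fs, utopia, nadir, weights):
    """Builds the GCM objective of Eq. 1 from a list of response functions,
    their utopia/nadir values and the weights w_i."""
    def F(x):
        return sum(
            w * ((fu - f(x)) / (fn - fu)) ** 2
            for f, fu, fn, w in zip(fs, utopia, nadir, weights)
        )
    return F

# Hypothetical two-response example (placeholder models, not the FCAW models of Sect. 4).
f1 = lambda x: (x[0] - 1) ** 2 + x[1] ** 2        # response 1, to be minimized
f2 = lambda x: x[0] ** 2 + (x[1] - 2) ** 2        # response 2, to be minimized
F = global_criterion([f1, f2], utopia=[0.0, 0.0], nadir=[5.0, 5.0], weights=[0.5, 0.5])

res = minimize(F, x0=np.zeros(2))                  # any NLP or GA solver could be used here
print(res.x, res.fun)
```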

2.2 Genetic Algorithms


Inspired by the mechanism of evolution, the genetic algorithm is based on the
principles of selection and survival of the fittest and, its main premise is the idea
that by combining different pieces of important information to the problem,
new and better solutions can be found [3,4]. From a population of solutions
rather than a single solution, it is then capable of finding global optimum for

constrained and unconstrained optimization problems as well as one or multiple


objective functions [5].
The evolution process in the GA is performed by using a set of stochastic
genetic operators that manipulate the genetic code [6]. The three main GA
operators are the selection, recombination (crossover) and mutation, which are
controlled by different parameters (mainly functions and rates) that will affect
the proper functioning of the algorithm. The most common parameters cited in
the literature are:
a. Population size - this parameter is a determining factor in the quality of the
solution and the algorithm efficiency, since it specifies how many chromosomes
form a generation [7]. According to [8], the greater the population size, the
greater the chance of obtaining satisfactory solutions, but a very large population
may increase the algorithm's search time. At the same time, if the population size
is too small, it may lead the algorithm to a local optimum rather than a global
one [9].
b. Selection type - it defines which chromosomes will be selected for the
next generation. Different methods are mentioned in the literature, such as uni-
form, stochastic uniform, ranking, tournament and the roulette selection, among
others.
c. Crossover function - it determines how the exchange of genetic information
happens between two chromosomes. The most commonly used functions are
the single-point, the double-point and the scattered crossover. When the string
length is short, the single-point crossover can be very effective; but if the string
length is too large, it may be necessary to use a function capable of promoting
the crossover at more points of the string.
d. Crossover rate - it refers to the percent of the parent population that will
undergo a crossover operation. Values for this parameter are always between 0
and 1, since it is a probability. According to [10], while a high crossover rate
can cause the disposal of good solutions, a very low rate can give too much
attention to the parents and then stagnate the search. Yet, according to [11],
the higher the crossover rate, the more quickly new structures are introduced
into the population.
e. Mutation function - as in the operations of selection and recombination,
many mutation methods can be found in the literature. The most common are
the uniform mutation, the Gaussian mutation and the adaptive feasible (AF).
The choice of the mutation type depends largely on the constraints of the prob-
lem. The adaptive feasible mutation is indicated for solving restricted problems,
while the Gaussian mutation is contraindicated for solving problems with linear
constraints.
f. Mutation rate - it refers to the percentage of the population of parents
who will suffer the mutation operation. As the crossover rate, this parameter is
a probability and corresponds to a value between 0 and 1. According to [10], a
low level of mutation can prevent bits to converge to a single value in the entire
population, while a very high rate makes the search essentially random.

g. Number of generations - it is used as an algorithm stopping criterion. If the
algorithm reaches the specified maximum number of generations without arriving
at an optimal point, it terminates the search.
To avoid the problems regarding parameter tuning, some adaptive techniques
have been developed, in which the parameters are adjusted during the
algorithm's evolution. However, using such techniques, the problem ceases to be
the configuration of the parameters and becomes the control of the parameters
while solving the problem [12].
For those who prefer to use conventional techniques, the choice of appropri-
ate parameters is still a challenge. Some research has been done using different
methods to identify the best parameter settings for each problem. Some studies
used simple methods for comparing different combinations of parameters, like
hypotheses tests [13] while others appealed to more advanced techniques such
as the Meta-GA approach [14,15].
Design of experiments has also been used by some researchers to optimize
GA parameters and also to analyze the effects of the interactions between them
[8,10,16]. Significant interactions were found in all studies, but none of these
studies evaluated the possible interaction between the algorithm parameters and
the weights for the objective functions, since none of the optimization problems
involved scalarization optimization methods.

2.3 Mixture Design of Experiments

A mixture design is a special type of response surface experiment in which the


factors are the ingredients or components of a mixture, and the response is
a function of the proportions of each ingredient [17]. The most common mix-
ture designs are: simplex-lattice, simplex-centroid and extreme-vertices. In the
simplex-lattice design the experiments are evenly distributed throughout the
region covered by the simplex region. A simplex-lattice for q components is
associated to a polynomial model of degree m and can be referred to as a {q, m}
simplex-lattice. The proportions assumed by each component take the m + 1
equally spaced values from 0 to 1, that is, x_i = 0, 1/m, 2/m, ..., 1 [18].
Including process variables in a mixture experiment can greatly increase the
scope of the experiment and it entails setting up a design consisting of the dif-
ferent settings of the process variables in combination with the original mixture
design [17,18]. The combination of the process variables can be done by setting
up a mixture design at each point of a factorial design.
The most complete form of the mixture design coupled with process variables
model can be expressed as the product of the terms in the mixture component
model and the terms in the process variable model. Equation 2 exemplifies a
quadratic model for a mixture design for three components (xi ) combined to a
full factorial design for two process variables (zi ) [17].
  E(y) = Σ_{i=1}^{3} β_i x_i + ΣΣ_{i<j} β_ij x_i x_j
         + Σ_{k=1}^{2} [ Σ_{i=1}^{3} α_ik x_i + ΣΣ_{i<j} α_ijk x_i x_j ] z_k                    (2)
         + [ Σ_{i=1}^{3} δ_i12 x_i + ΣΣ_{i<j} δ_ij12 x_i x_j ] z_1 z_2
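
To make the design construction concrete, a minimal sketch (illustrative only, not the authors' design-generation code) of a {q, m} simplex-lattice crossed with a two-level factorial in the process variables could look as follows; the function names are assumptions for this illustration.

```python
from itertools import product

def simplex_lattice(q, m):
    """All q-component mixtures whose proportions are multiples of 1/m and sum to 1."""
    points = []
    for combo in product(range(m + 1), repeat=q):
        if sum(combo) == m:
            points.append(tuple(c / m for c in combo))
    return points

def crossed_design(q, m, process_levels):
    """Cross every mixture point with every combination of process-variable levels."""
    mixtures = simplex_lattice(q, m)
    factorial = list(product(*process_levels))
    return [mix + z for mix in mixtures for z in factorial]

# Example: a {3, 2} simplex-lattice crossed with a 2^2 factorial in (z1, z2) -> 6 x 4 = 24 runs.
design = crossed_design(q=3, m=2, process_levels=[(-1, 1), (-1, 1)])
for run in design:
    print(run)
```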

3 Experimental Method
To evaluate the influence of the GA parameters in solving a multi-objective problem
and to determine which of the tested parameter settings gives the best results for this
particular problem, an experimental procedure was developed. In this case,
the algorithm parameters are considered as the process variables and they
are evaluated in a mixture design combined with process variables, where the
mixture proportions are the weights of the objective functions in the global criterion
method. To this end, the two-level optimization strategy used in this work was
performed according to the following five-step procedure:
Step 1: Problem Definition
The optimization problem is defined by determining the responses, control vari-
ables and process variables. The responses must be individually optimized in
order to find the utopia and nadir points for each one of them. With these val-
ues, it is possible to define the GCM function, according to Eq. 1, which along
with the restriction functions will characterize the MOP.
Step 2: Experimental Design
In this step, the mixture design is defined by establishing the experimental matrix
in which the weights for the objective functions and the algorithm parameters
will be tested.
Step 3: Optimization Using Genetic Algorithm
The MOP defined in step 1 is solved using GA and the weights and parameters
are set up as specified in the experimental design defined in step 2. The results
found in each experiment are used for the calculation of the Global Percent-
age Error, according to Eq. 3, where: yi∗ are the values of the Pareto-optimal
responses, Ti are the targets (individual optimization solutions) and m is the
number of objectives.
  GPE = Σ_{i=1}^{m} | y_i* / T_i − 1 |                                                   (3)
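
A minimal sketch of this metric follows (not part of the paper's code); the example call uses the optimal responses and targets reported later in Table 3, so small rounding differences from the reported GPE = 0.236 are expected.

```python
import numpy as np

def gpe(pareto_responses, targets):
    """Global Percentage Error of Eq. 3: sum of |y_i* / T_i - 1| over all objectives."""
    y = np.asarray(pareto_responses, dtype=float)
    t = np.asarray(targets, dtype=float)
    return float(np.sum(np.abs(y / t - 1.0)))

# Responses (W, P, R, D) and targets taken from Table 3 of this paper.
print(gpe([13.648, 0.864, 3.283, 17.094], [15.58, 0.83, 3.34, 16.27]))  # ~0.233
```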

Step 4: Modeling of GPE Function


The DOE is analyzed and the GPE function is modeled using the OLS algorithm.
Statistical analyses are applied in order to determine the fit of the obtained
model.
Step 5: Parameters and Weights Optimization
The GPE function is optimized and the optimal configuration of weights and
parameters is determined. The interaction between weights and parameters is
analyzed, as well as the effect of GA parameters on the GPE.

4 Case Study—The FCAW Process Optimization


To demonstrate the applicability of the proposed method, a case study of a
cladding process of depositing AISI 316L stainless steel onto AISI 1020 carbon
steel plates through flux cored arc welding (FCAW) was used. The experiments
and the responses modeling for this process were carried out by [19]. The four
input variables are wire feed rate (Wf ), voltage (V ), welding speed (S), and
distance of contact tip from work piece (N ) and the four welding outputs that
describe the weld bead geometry (bead width (W ), penetration (P ), reinforce-
ment (R) and dilution (D)), presented in Table 1.

Table 1. Responses and individual optimization results

W = 10.640 + 0.797Wf + 0.656V − 1.451S − 0.629N + 0.270S² + 0.266Wf V − 0.114Wf S − 0.102V S + 0.067SN
    (Maximization; utopia 15.576, nadir 12.451)
P = 1.639 + 0.122Wf + 0.122V + 0.093S − 0.241N + 0.025Wf² − 0.032V² − 0.118S² + 0.034Wf V + 0.076Wf S − 0.100Wf N
    (Minimization; utopia 0.828, nadir 1.411)
R = 2.597 + 0.191Wf − 0.104V − 0.223S + 0.115N + 0.034V² + 0.019S² + 0.036N² − 0.30Wf V − 0.023Wf N
    (Maximization; utopia 3.342, nadir 3.055)
D = 310 − 0.003Wf + 0.025V + 0.037S − 0.043N − 0.007V² − 0.012S² + 0.008Wf V + 0.005Wf S − 0.004Wf N − 0.008SN
    (Minimization; utopia 16.275, nadir 24.071)

The target values for the responses were established using the individual
constrained minimization for P and D, and the individual constrained maxi-
mization for W and R. The values found in the individual optimizations are
shown in Table 1. The MOP can be then stated as in Eq. 4, where wi is the
weight applied for each response and the problem is subjected to the experimen-
tal space constraint X T X ≤ 22 .

  Min G = w1 [(15.576 − W)/(−3.125)]² + w2 [(0.828 − P)/0.584]²
        + w3 [(3.342 − R)/(−0.287)]² + w4 [(16.275 − D)/7.796]²                          (4)
  s.t.: X^T X ≤ 2²

The GA parameters chosen for analysis as process variables in this study
were population size (Tp), crossover rate (Tc) and mutation type (Tm). The levels
tested (−1 and +1) were: 20 and 100 for Tp, 0.15 and 0.85 for Tc, and the
Gaussian (Gau) and Adaptive Feasible (AF) functions for Tm. The experiments
matrix was based on a fourth-degree simplex-lattice design created for four
components {4, 4} with 0.05 ≤ wi ≤ 0.85, coupled with a full factorial design

Table 2. Simplex lattice design fragment

Run   w1     w2     w3     w4     Tp    Tc     Tm
1     0.85   0.05   0.05   0.05   20    0.15   Gau
2     0.65   0.25   0.05   0.05   20    0.15   Gau
...   ...    ...    ...    ...    ...   ...    ...
98    0.05   0.25   0.45   0.25   20    0.85   Gau
99    0.05   0.25   0.25   0.45   20    0.85   Gau
...   ...    ...    ...    ...    ...   ...    ...
226   0.25   0.25   0.05   0.45   20    0.85   AF
227   0.25   0.05   0.65   0.05   20    0.85   AF
...   ...    ...    ...    ...    ...   ...    ...
279   0.05   0.05   0.25   0.65   100   0.85   AF
280   0.05   0.05   0.05   0.85   100   0.85   AF

(2³) for three process variables (parameters), which resulted in 280 experiments.
Table 2 presents a fragment of the created mixture design.
The optimization problem was programmed and optimized in Matlab®,
according to the 280 weight and parameter settings determined by the mixture
design, in 30 replicates. The means of the results found in each replicate were
used to calculate the global percentage error according to Eq. 3, and the GPE
function was modeled with the DOE analysis in Minitab®, considering a
significance level of 5%. The final model for the GPE function (Eq. 5) presented an
adjusted R² of 95.95%, the residuals were normally distributed and no lack of fit was
found.
  GPE = 0.744w1 + 0.232w2 + 0.374w3 + 0.224w4 + 0.364w1w3
        − 0.219w2w3 + 0.231w2w3(w2 − w3) − 4.240w1²w2w3
        − 3.436w1²w2w4 − 4.725w1²w3w4 + 0.907w1w3(w1 − w3)²
        + 0.284w1w3(w1 − w3)Tc − 0.179w2w3(w2 − w3)Tc                                    (5)
        − 0.048w3w4TpTc + 0.150w1w2(w1 − w2)TpTm
        − 0.050w1w2TcTm + 0.562w1w2(w1 − w2)²TcTm
        + 0.014w3TpTcTm + 0.195w1w2(w1 − w2)TpTcTm
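
For readers who want to experiment with the fitted surface, a direct transcription of Eq. 5 is sketched below. It assumes the process variables enter the model in coded (−1, +1) units, with Gau taken as the −1 level of Tm; this coding is an assumption of the sketch, not something stated explicitly in the text.

```python
def gpe_model(w, Tp, Tc, Tm):
    """Fitted GPE surface of Eq. 5. w = (w1, w2, w3, w4); Tp, Tc, Tm are
    assumed to be in coded (-1, +1) units, with Tm = -1 for Gaussian mutation."""
    w1, w2, w3, w4 = w
    return (0.744*w1 + 0.232*w2 + 0.374*w3 + 0.224*w4 + 0.364*w1*w3
            - 0.219*w2*w3 + 0.231*w2*w3*(w2 - w3) - 4.240*w1**2*w2*w3
            - 3.436*w1**2*w2*w4 - 4.725*w1**2*w3*w4 + 0.907*w1*w3*(w1 - w3)**2
            + 0.284*w1*w3*(w1 - w3)*Tc - 0.179*w2*w3*(w2 - w3)*Tc
            - 0.048*w3*w4*Tp*Tc + 0.150*w1*w2*(w1 - w2)*Tp*Tm
            - 0.050*w1*w2*Tc*Tm + 0.562*w1*w2*(w1 - w2)**2*Tc*Tm
            + 0.014*w3*Tp*Tc*Tm + 0.195*w1*w2*(w1 - w2)*Tp*Tc*Tm)

# Example evaluation at one design point (run 1 of Table 2, in coded levels).
print(gpe_model((0.85, 0.05, 0.05, 0.05), Tp=-1, Tc=-1, Tm=-1))
```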

It is possible to notice in Eq. (5) that the three GA parameters tested pre-
sented significant interactions with the mixture components, which means that
changes in the algorithm configuration will impact the final results of such met-
ric. Additionally, it is possible to evaluate the influence of the GA parameters
by analyzing the main effects plots of these parameters, presented in Fig. 1,
which shows how changing the levels of the process variables impact the average
results obtained for GPE. It can be observed that the process variables levels
that provide the smallest GPE values are Tp = 100, Tc = 0, 15 and Tm = Gau.
The contour plots for different combinations of weights and parameters shown
in Fig. 2 indicate which regions contain minimum and maximum values for the
GPE function, for some specific configurations tested in the arrangement. These
plots are distributed in the cube vertices according to the GA settings. By ana-
lyzing Fig. 2, it is evident that the algorithm parameters affect the results for
the response, since the contour plots are distinct for different combinations of
parameters, despite having the same combinations of weights. It can also be
noticed the great influence of the weights assigned to the objective functions in
the results for GPE.

Fig. 1. Main effects plot for GPE

Fig. 2. Contour plots for GPE

With the GPE function properly modeled, the optimal parameters and
weights were obtained by solving the optimization problem described in Eq.
(6). The problem was solved through the desirability method in Minitab's
optimizer. The optimization results were: w1 = 0.05, w2 = 0.85, w3 = 0.05, w4 =
0.05, Tp = 20, Tc = 0.85 and Tm = Gau, which resulted in GPE = 0.232 with
a composite desirability D = 0.9687.

  Min GPE
  s.t.: w1 + w2 + w3 + w4 = 1
        0.05 ≤ wi ≤ 0.85
        20 ≤ Tp ≤ 100                                                                    (6)
        0.15 ≤ Tc ≤ 0.85
        −1 ≤ Tm ≤ 1
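
As a rough illustration of searching the feasible region of Eq. (6) (a sketch only, not the desirability procedure actually used here), a random search can reuse the gpe_model transcription above; it works entirely in coded units, so the natural bounds on Tp and Tc map to the box [−1, 1], and the Gau/AF coding of Tm is the same assumption as before.

```python
import numpy as np

rng = np.random.default_rng(0)
best = None
for _ in range(200_000):
    w = rng.dirichlet(np.ones(4))                    # weights sum to 1
    if np.any(w < 0.05) or np.any(w > 0.85):         # enforce 0.05 <= wi <= 0.85
        continue
    Tp, Tc = rng.uniform(-1, 1, size=2)              # coded process variables
    Tm = rng.choice([-1.0, 1.0])                     # Gau = -1, AF = +1 (assumed coding)
    val = gpe_model(tuple(w), Tp, Tc, Tm)
    if best is None or val < best[0]:
        best = (val, w.round(3), round(Tp, 2), round(Tc, 2), Tm)

print(best)   # a candidate weight/parameter setting with a small predicted GPE
```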

The optimal configuration of weights and parameters was used for the solu-
tion of the initial optimization problem, the FCAW process defined in Eq. (4),
in 30 replicates, to confirm that the optimal combination of weights and param-
eters will lead to the best optimization results for this process, specifically. With
the mean results, it was possible to calculate the welding process input param-
eters and to determine the behavior of the FCAW responses (Table 3). It can
be noted from Table 3 that all four responses were established relatively close to
their targets, which suggests that the proposed method is very suitable for the
optimization of the FCAW process.

Table 3. FCAW optimization results

           Wf      V    S        N     W       P      R      D       GPE
Optimal    9.4     28.7 22.8     23.7  13.648  0.864  3.283  17.094  0.236
Target     –       –    –        –     15.58   0.83   3.34   16.27   –
Objective  –       –    –        –     Max     Min    Max    Min     –
Unit       m/min   V    cm/min   mm    mm      mm     mm     %       –

5 Conclusion

In the context of the search for efficient multiobjective optimization methods,


this study proposed an experimental procedure capable of optimizing, at the
same time, the weights assigned to the functions of a multiobjective problem
and the parameters of the genetic algorithm used for solving it, through the use
of a mixture design combined with process variables.
One of the main objectives of this study was to evaluate the existence of
interactions between the weights assigned to the responses and the parameters
used in the algorithm. Through the GPE function modeling, it was possible to
determine the existence of significant interactions between weights and param-
eters. Although it is not possible to ensure the existence of such interactions
in other scenarios, with different optimization methods and processes, it can be
concluded that the use of different combinations of weights in the optimization
may lead to the need to adjust the GA parameters for each such combination.

It is worth emphasizing that the findings of this study cannot be generalized,


but it is expected that the procedure proposed in this paper can be applied to
other processes, with other optimization methods and also in the evaluation of
different GA parameters, so that it is possible to get a correct algorithm tuning
and to be able to find more appropriate results for each type of application.

References
1. Marler, R.T., Arora, J.S.: Survey of multi-objective optimization methods for engi-
neering. Struct. Multidiscip. Optim. 26, 369–395 (2004). https://doi.org/10.1007/
s00158-003-0368-6
2. Rao, S.S.: Engineering Optimization: Theory and Practice, 4th edn. Wiley, New
Jersey (2009)
3. Heredia-Langner, A., Montgomery, D.C., Carlyle, W.M.: Solving a multistage par-
tial inspection problem using genetic algorithms. Int. J. Product. Res. 40(8), 1923–
1940 (2002). https://doi.org/10.1080/00207540210123337
4. Holland, J.H.: Adaptation in Natural and Artificial Systems. Ph.D. thesis, Univer-
sity of Michigan Press (1975). https://doi.org/10.1086/418447
5. Zain, A.M., Haron, H., Sharif, S.: Application of GA to optimize cutting conditions
for minimizing surface roughness in end milling machining process. Expert Syst.
Appl. 37(6), 4650–4659 (2010). https://doi.org/10.1016/j.eswa.2009.12.043
6. Fleming, P., Purshouse, R.: Evolutionary algorithms in control systems engineer-
ing: a survey. Control Eng. Pract. 10(11), 1223–1241 (2002). https://doi.org/10.
1016/S0967-0661(02)00081-3
7. Maaranen, H., Miettinen, K., Penttinen, A.: On initial populations of a genetic
algorithm for continuous optimization problems 37 (2007). https://doi.org/10.
1007/s10898-006-9056-6
8. Candan, G., Yazgan, H.R.: Genetic algorithm parameter optimisation using
Taguchi method for a flexible manufacturing system scheduling problem. Int.
J. Product. Res. 53(3), 897–915 (2014). https://doi.org/10.1080/00207543.2014.
939244
9. Weise, T., Wu, Y., Chiong, R., Tang, K., Lässig, J.: Global versus local search:
the impact of population sizes on evolutionary algorithm performance. J. Global
Optim. 1–24 (2016). https://doi.org/10.1007/s10898-016-0417-5
10. Ortiz, F., Simpson, J.R., Pignatiello, J.J., Heredia-langner, A.: A genetic algo-
rithm approach to multiple-response optimization. J. Qual. Technol. 36(4), 432–
450 (2004)
11. Grefenstette, J.: Optimization of control parameters for genetic algorithms. IEEE
Trans. Syst. Man Cybern. 16(February), 122–128 (1986)
12. Eiben, A.E., Smit, S.K.: Evolutionary algorithm parameters and methods to tune
them. In: Autonomus Search, Chap. 2, pp. 15–36. Springer, Heidelberg (2012).
https://doi.org/10.1007/978-3-642-21434-9
13. Alajmi, A., Wright, J.: Selecting the most efficient genetic algorithm sets in solving
unconstrained building optimization problem. Int. J. Sustain. Built Environ. 3(1),
18–26 (2014). https://doi.org/10.1016/j.ijsbe.2014.07.003
14. Fernandez-Prieto, J.a., Canada-Bago, J., Gadeo-Martos, M.a., Velasco, J.R.: Opti-
misation of control parameters for genetic algorithms to test computer networks
under realistic traffic loads. Appl. Soft Comput. J. 12(4), 1875–1883 (2012).
https://doi.org/10.1016/j.asoc.2012.04.018

15. Núñez-Letamendia, L.: Fitting the control parameters of a genetic algorithm: an


application to technical trading systems design. Eur. J. Oper. Res. 179, 847–868
(2007). https://doi.org/10.1016/j.ejor.2005.03.067
16. Costa, C.B.B., Rivera, E.A.C., Rezende, M.C.A.F., Maciel, M.R.W., Filho, R.M.:
Prior detection of genetic algorithm significant parameters: coupling factorial
design technique to genetic algorithm. Chem. Eng. Sci. 62, 4780–4801 (2007).
https://doi.org/10.1016/j.ces.2007.03.042
17. Myers, R., Montgomery, D., Anderson-Cook, C.: Response Surface Methodology,
3 edn. (2009)
18. Cornell, J.A.: A Primer on Experiments with Mixtures, 3rd edn. Wiley, New Jersey
(2011)
19. Gomes, J.H.F., Paiva, A.P., Costa, S.C., Balestrassi, P.P., Paiva, E.J.: Weighted
multivariate mean square error for processes optimization: a case study on flux-
cored arc welding for stainless steel claddings. Eur. J. Oper. Res. 226(3), 522–535
(2013). https://doi.org/10.1016/j.ejor.2012.11.042
A Numerical Study on MIP Approaches
over the Efficient Set

Kuan Lu1(B), Shinji Mizuno1, and Jianming Shi2

1 Tokyo Institute of Technology, Tokyo, Japan
{lu.k.aa,mizuno.s.ab}@m.titech.ac.jp
2 Tokyo University of Science, Tokyo, Japan
shi@rs.tus.ac.jp

Abstract. This paper concerns an optimization problem over the efficient set of a
multiobjective linear programming problem. We propose an equivalent mixed integer
programming (MIP) problem and compute an optimal solution by solving the MIP
problem. Compared with the previous MIP approach by Sun, the proposed approach
relaxes an assumption, which allows a more general class of problems to be solved,
and reduces the size of the MIP problem. By conducting experiments on a well-known
application of the OE problem, the minimum maximal flow problem, we find that the
proposed approach is more accurate and faster. The MIP problem can be efficiently
solved by current state-of-the-art MIP solvers when the objective function is convex
or linear.

Keywords: Global optimization · Multiobjective programming · Efficient set · Linear complementarity conditions · Mixed integer programming

1 Introduction
This paper concerns optimization over the efficient set of a multiobjective linear
programming problem, which is formulated as

  minimize  Cx
  subject to  Ax ≤ b,  x ≥ 0,                                                            (1)

where x ∈ Rn is a vector of variables, C ∈ Rp×n , A ∈ Rm×n , and b ∈ Rm are


data, 0 ∈ Rn is a vector of zeros, and p, m, and n are positive integers. Let

X := {x ∈ R^n | Ax ≤ b, x ≥ 0}

be the set of feasible solutions. An efficient solution of the multiobjective linear


programming problem (1) is defined as follows.

Definition 1 (Efficient Solution). An x ∈ X is called an efficient solution


of (1) if there is no x′ ∈ X such that Cx′ ≤ Cx and Cx′ ≠ Cx.
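
Definition 1 can be checked numerically with a single LP: a feasible x0 is efficient if and only if max{1ᵀz : Cy + z = Cx0, Ay ≤ b, y ≥ 0, z ≥ 0} has optimal value 0. This classical test is not part of this paper's MIP approach; the sketch below is included only to make the definition operational, and the small example data are made up.

```python
import numpy as np
from scipy.optimize import linprog

def is_efficient(x0, C, A, b, tol=1e-8):
    """Classical LP test for efficiency of x0 in {x : Ax <= b, x >= 0}:
    x0 is efficient iff max{1'z : Cy + z = Cx0, Ay <= b, y >= 0, z >= 0} equals 0."""
    p, n = C.shape
    m = A.shape[0]
    c = np.concatenate([np.zeros(n), -np.ones(p)])   # maximize 1'z  <=>  minimize -1'z
    A_eq = np.hstack([C, np.eye(p)])                 # C y + z = C x0
    b_eq = C @ x0
    A_ub = np.hstack([A, np.zeros((m, p))])          # A y <= b
    res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return bool(res.status == 0 and -res.fun <= tol)

# Tiny illustration: two objectives over the polytope {x >= 0 : x1 + x2 >= 1}.
C = np.array([[1.0, 0.0], [0.0, 1.0]])
A = np.array([[-1.0, -1.0]])
b = np.array([-1.0])
print(is_efficient(np.array([0.5, 0.5]), C, A, b))    # True: lies on the efficient facet
print(is_efficient(np.array([1.0, 1.0]), C, A, b))    # False: dominated by (0.5, 0.5)
```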

An efficient solution is also called a Pareto optimal solution. Let E denote the set
of all the efficient solutions of the problem (1). Then the optimization problem
over the efficient set (OE problem) is defined as

  minimize  φ(x)
  subject to  x ∈ E,                                                                     (2)

where φ : Rn → R is an objective function.


The OE problem (2) can be applied to real problems such as the min-
imum maximal matching problem [15], the minimum maximal flow problem
[10,12,15,16], and the least distance problem in data envelopment analysis [11].
Since the efficient set is generally nonconvex, the OE problem also plays an
important role in theoretical research of multiobjective programming and global
optimization. Since Philip [13] firstly considered the OE problem in 1972, there
are many works on the problem. There are two kinds of traditional algorithms,
namely the income space algorithms and the outcome space algorithms. The
income space algorithms were surveyed by Yamamoto [19], and they were clas-
sified into adjacent vertex search algorithms [6,13], nonadjacent vertex search
algorithms [3], branch-and-bound methods [2], Lagrangian relaxation methods
[7], dual approaches [18], and bisection algorithms [14]. Because the dimension of
the outcome space is smaller than that of the income space, the outcome space
algorithms have been proposed to accelerate the computation speed in [4,5,9].
The OE problem can be transformed into a kind of difference-of-convex (DC)
functions programming problem [1]. In fact, all traditional algorithms are based
on DC optimization or its extensions as shown in Tuy [8].
Recently, Sun [17] proposed a mixed integer programming (MIP) approach
to the OE problem under some assumptions on the problem. Lu et al. [10]
transformed the minimum maximal flow problem (an application of the OE
problem) into a MIP problem, and showed computational advantages over vertex
search methods and DC algorithms which are also popular algorithms to solve
the minimum maximal flow problem and the OE problem.
In this paper, the approach of Lu et al. [10] is generalized to the OE
problem by transforming the problem into a MIP problem, and we propose to
compute an optimal solution by solving the MIP problem. Compared to Sun
[17], the proposed approach relaxes the assumptions and reduces the size of the MIP
problem. The MIP problem can be efficiently solved by current state-of-the-art
MIP solvers when the objective function is convex or linear.
In Sect. 2, we introduce Sun's main result and ours, and then compare them.
In Sect. 3, we report a short experiment on one of the most classical
applications of the OE problem to compare the two MIP approaches.

2 Mixed Integer Programming Approaches


In this section, after introducing Sun’s result [17] on the problem (2), we will
show our main result and make a comparison between the results. Sun’s result
is obtained under the following two assumptions on the problems (1) and (2):

Assumption 1. For the problem (1), there is an i ∈ {1, 2, ..., p} such that c_i^T x
is one-to-one on X, where c_i is the ith row of the matrix C.

Assumption 2 The problem (2) has an optimal solution x∗ .

We note that Assumption 1 is very strong, because it is not satisfied if the dimen-
sion of the feasible region X is larger than 1. Hence, Sun’s result is valid only
when the dimension of the feasible region X is at most 1. We will give examples
in Sect. 4 to show what could happen when this assumption fails. Assumption
2 is rather standard and used in the most analysis of the problem (2). Please
note that the problem (1) is feasible and the efficient set E is nonempty under
Assumption 2.
We state Sun’s main result [17] in the next proposition.

Proposition 1. (Sun [17]) Under Assumptions 1 and 2, an optimal solution of


the OE problem (2) is computed by solving the following mixed integer program-
ming (MIP) problem:

  minimize  φ(x)
  subject to  s = [b; β] − [A; C_(i)] x,
              r = c_i + [A; C_(i)]^T u,                                                  (3)
              β = C_(i) y,  A y ≤ b,
              0 ≤ u ≤ θ α₁,  0 ≤ s ≤ θ(1 − α₁),
              0 ≤ x ≤ θ α₂,  0 ≤ r ≤ θ(1 − α₂),
              y ≥ 0,  α₁ ∈ {0, 1}^{m+p−1},  α₂ ∈ {0, 1}^n,

where c_i^T x is one-to-one on X, C_(i) = (c_1, ..., c_{i−1}, c_{i+1}, ..., c_p)^T, and θ > 0 is a
sufficiently large number.

We state our main result as the next theorem.


Theorem 1. Under Assumption 2, an optimal solution of the OE problem (2)
is computed by solving the following mixed integer programming (MIP) problem:

  minimize  φ(x)
  subject to  Ax + z = b,
              A^T u + C^T v − w = −C^T 1,
              z ≤ θα,  u ≤ θ(1 − α),                                                     (4)
              x ≤ θβ,  w ≤ θ(1 − β),
              α ∈ {0, 1}^m,  β ∈ {0, 1}^n,
              x, z, u, v, w ≥ 0,

where θ is a sufficiently large number.


Proof. A sketch of the proof of Theorem 1 can be found in [11]. □

The MIP problem (4) can be efficiently solved by current state-of-the-art
MIP solvers when the objective function is convex or linear.
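
For a linear objective φ(x) = dᵀx, a minimal sketch of assembling problem (4) with an off-the-shelf MIP modeler follows (PuLP is used here for illustration; the vector d and the big-M value θ are illustrative choices, not part of the paper).

```python
import numpy as np
import pulp

def solve_oe_mip(C, A, b, d, theta=1e6):
    """Sketch of the MIP reformulation (4) for min d'x over the efficient set of
    min Cx s.t. Ax <= b, x >= 0 (linear objective case). C, A, b, d are numpy arrays."""
    p, n = C.shape
    m = A.shape[0]
    prob = pulp.LpProblem("OE_MIP", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x{j}", lowBound=0) for j in range(n)]
    z = [pulp.LpVariable(f"z{i}", lowBound=0) for i in range(m)]
    u = [pulp.LpVariable(f"u{i}", lowBound=0) for i in range(m)]
    v = [pulp.LpVariable(f"v{k}", lowBound=0) for k in range(p)]
    w = [pulp.LpVariable(f"w{j}", lowBound=0) for j in range(n)]
    alpha = [pulp.LpVariable(f"alpha{i}", cat="Binary") for i in range(m)]
    beta = [pulp.LpVariable(f"beta{j}", cat="Binary") for j in range(n)]
    prob += pulp.lpSum(float(d[j]) * x[j] for j in range(n))          # objective d'x
    for i in range(m):                                                # Ax + z = b
        prob += pulp.lpSum(float(A[i, j]) * x[j] for j in range(n)) + z[i] == float(b[i])
    for j in range(n):                                                # A'u + C'v - w = -C'1
        prob += (pulp.lpSum(float(A[i, j]) * u[i] for i in range(m))
                 + pulp.lpSum(float(C[k, j]) * v[k] for k in range(p))
                 - w[j] == -float(C[:, j].sum()))
    for i in range(m):                                                # z ⟂ u via big-M
        prob += z[i] <= theta * alpha[i]
        prob += u[i] <= theta * (1 - alpha[i])
    for j in range(n):                                                # x ⟂ w via big-M
        prob += x[j] <= theta * beta[j]
        prob += w[j] <= theta * (1 - beta[j])
    prob.solve()
    return np.array([xj.value() for xj in x])
```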
Please note that we do not make Assumption 1 in this theorem. Moreover,
compared with problem (3), the size of problem (4) is smaller; that is,
the numbers of constraints and binary variables in (4) are smaller than those in (3).
So problem (4) can be solved much faster than (3) in practical computation.

3 Preliminary Computational Experiments


In this section, we report a simple experiment to compare the properties of
the two approaches. We use INTLINPROG with Gurobi 8.00 and Matlab 2017a to
solve instances of the problem. The OS is Windows 10, the CPU is a 3.20 GHz Intel
Core i5-6500, and the memory is 32 GB.
The experiment is based on the minimum maximal flow problem, which is
one of the most important applications; it was proposed by [15] and introduced
in the well-known survey [19]. The test instances are randomly generated by the
method in Sect. 4.1 of [10], and θ is 10^6 in both approaches. Also, we
conduct experiments with the DC approach proposed by [12].
The results of the computational experiments are shown in Table 1. The first and
second columns show the sizes of m and n, respectively. The third, fifth
and seventh columns show the mean running time, in seconds, over 10 different instances
for each of the approaches. The fourth, sixth and eighth
columns show the number of instances for which the exact optimal solution was
obtained.

Table 1. Simple experiments ("# opt." is the number of instances, out of 10, for which the exact optimal solution was found).

m    n    Our (s)   # opt.   Sun [17] (s)   # opt.   DC [12] (s)   # opt.
10   20   0.2467    10       0.3038         2        0.0663        10
20   40   0.3099    10       0.3363         0        0.0774        9
40   80   0.3874    10       0.4037         0        0.1994        8
80   160  0.3964    10       0.3786         0        0.4601        8
100  200  0.3844    10       0.3808         0        0.9351        8
200  400  0.3846    10       0.4556         0        4.7539        8
400  800  0.4808    10       0.5833         0        31.9423       7

To implement Sun’s approach, we assume that there is some i which can let
the problem solved by Sun’s approach. So we test a random i ∈ {1, . . . , n} to
A Numerical Study on MIP Approaches over the Efficient Set 615

set up C(i) and ci . In fact, obviously, the problem does not always satisfy the
Assumption 1 for the system

Ax = 0, 0  x  c, xi = t

where A is an incidence matrix of a network, c is the capacity, and t is a given


value, may have several solutions instead of the only solution. Since the Assump-
tion 1 is too tight, and may not be satisfied in the most of the OE problems,
Sun’s approach almost failed to find an exact optimal solution. Also, our running
time are less than those of Sun’s.1 In fact, instead of the 106 , a more accurate θ
has been used in [10], and solved problems with 15,000 objective functions and
numbers of constraints with 100% exact optimal solution. Consider the prob-
lem size, it seems that these instances are hard to be solved by outcome space
algorithms, see the experiment part of [9].2

4 Conclusion
In this paper, we proposed computing an optimal solution of the optimization problem (2) over the efficient set of the multiobjective linear programming problem (1) by solving the MIP problem (4). Compared with the previous MIP approach by Sun [17], our approach does not make Assumption 1 on the problem (1) and reduces the size of the MIP problem. In the preliminary computational experiments, we observed that our approach can solve the linear cases of the OE problem more accurately and faster.

Acknowledgment. This research is supported in part by Grant-in-Aid for Scientific Research (A) 26242027 and Grant-in-Aid for Scientific Research (C) 17K01272 of Japan
Society for the Promotion of Science.

References
1. An, L.T.H., Tao, P.D., Muu, L.D.: Numerical solution for optimization over the
efficient set by dc optimization algorithms. Oper. Res. Lett. 19(3), 117–128 (1996)
2. Benson, H.P.: An algorithm for optimizing over the weakly-efficient set. Eur. J.
Oper. Res. 25(2), 192–199 (1986)
3. Benson, H.P.: A finite, nonadjacent extreme-point search algorithm for optimiza-
tion over the efficient set. J. Optim. Theory Appl. 73(1), 47–64 (1992)
4. Benson, H.P.: An outcome space algorithm for optimization over the weakly effi-
cient set of a multiple objective nonlinear programming problem. J. Global Optim.
52(3), 553–574 (2012)
1. In fact, every 0–1 linear program is a kind of DC program, but the reverse is not true, so specialized MIP tools may be able to achieve better performance.
2. More specialized experiments that satisfy Assumption 1 have been conducted, and they can be found on the authors' homepage. Even in those cases, the accuracy of both approaches is the same, while the running times of our approach are still less than those of the previous approach. However, the feasible region there is a one-dimensional space, which is very special, so we do not include these instances in this paper.

5. Benson, H.P., Lee, D.: Outcome-based algorithm for optimizing over the efficient
set of a bicriteria linear programming problem. J. Optim. Theory Appl. 88(1),
77–105 (1996)
6. Bolintineanu, S.: Minimization of a quasi-concave function over an efficient set.
Math. Program. 61(1–3), 89–110 (1993)
7. Dauer, J.P., Fosnaugh, T.A.: Optimization over the efficient set. J. Global Optim.
7(3), 261–277 (1995)
8. Hoang, T.: Convex Analysis and Global Optimization. Springer (2016)
9. Liu, Z., Ehrgott, M.: Primal and dual algorithms for optimization over the efficient
set. Optimization 67(10), 1–26 (2018)
10. Lu, K., Mizuno, S., Shi, J.: A mixed integer programming approach for the mini-
mum maximal flow. J. Oper. Res. Soc. Jpn 64(4), 261–271 (2018)
11. Lu, K., Mizuno, S., Shi, J.: Optimization over the efficient set of a linear multiob-
jective programming: Algorithm and applications (to appear). RIMS Kôkyûroku
(2018)
12. Muu, L.D., Thuy, L.Q.: On dc optimization algorithms for solving minmax flow
problems. Math. Methods Oper. Res. 80(1), 83–97 (2014)
13. Philip, J.: Algorithms for the vector maximization problem. Math. Program. 2(1),
207–229 (1972)
14. Phong, T.Q., Tuyen, J.: Bisection search algorithm for optimizing over the efficient
set. Vietnam J. Math. 28, 217–226 (2000)
15. Shi, J., Yamamoto, Y.: A global optimization method for minimum maximal flow
problem. Acta Math. Vietnam. 22(1), 271–287 (1997)
16. Shigeno, M., Takahashi, I., Yamamoto, Y.: Minimum maximal flow problem: an
optimization over the efficient set. J. Global Optim. 25(4), 425–443 (2003)
17. Sun, E.: On optimization over the efficient set of a multiple objective linear pro-
gramming problem. J. Optim. Theory Appl. 172(1), 236–246 (2017)
18. Thach, P., Konno, H., Yokota, D.: Dual approach to minimization on the set of
pareto-optimal solutions. J. Optim. Theory Appl. 88(3), 689–707 (1996)
19. Yamamoto, Y.: Optimization over the efficient set: overview. J. Global Optim.
22(1–4), 285–317 (2002)
Analytics-Based Decomposition of a Class
of Bilevel Problems

Adejuyigbe Fajemisin1 , Laura Climent2 , and Steven D. Prestwich3(B)


1 School of Computing, National College of Ireland, Dublin, Ireland
ade.fajemisin@ncirl.ie
2 Computer Science Department, Cork Institute of Technology, Cork, Ireland
laura.climent@cit.ie
3 Insight Centre for Data Analytics, University College Cork, Cork, Ireland
s.prestwich@cs.ucc.ie

Abstract. This paper proposes a new class of multi-follower bilevel


problems. In this class the followers may be nonlinear, do not share
constraints or variables, and are at most weakly constrained. This allows
the leader variables to be partitioned among the followers. The new class
is formalised and compared with existing problems in the literature. We
show that approaches currently in use for solving multi-follower problems
are unsuitable for this class. Evolutionary algorithms can be used, but
these are computationally intensive and do not scale up well. Instead we
propose an analytics-based decomposition approach. Two example prob-
lems are solved using our approach and two evolutionary algorithms, and
the decomposition approach produces much better and faster results as
the problem size increases.

Keywords: Bilevel · Analytics · Clustering · Decomposition

1 A New Class of Bilevel Problems

In bilevel optimisation problems, an inner (or follower) optimisation problem


constrains an outer (or leader) optimisation problem. As these component prob-
lems interact with and affect each other, the solution of bilevel problems is dif-
ficult in the general case [5]. In Bilevel Multi-Follower (BLMF) problems there
may be several followers, and multi-leader problems are also known.
For a BLMF problem with Q followers, let x represent the leader decision
vector, and y q the decision vector for follower q (q = 1 . . . Q). The leader chooses
a strategy x, following which each follower selects its own strategy in response
to x. The BLMF problem may be either cooperative, partially cooperative or
uncooperative, depending on the leader and follower objectives. Based on the
type of interaction between followers, nine classes of linear BLMF problems are
identified in [13]. Problems in which the followers do not share objectives or
constraints are known as “independent” and take the form:


    min_{x, y_1 ... y_Q}  F(x, y_1, . . . , y_Q)
    s.t.  G(x, y_1, . . . , y_Q) ≤ 0
    where each y_q (q = 1, . . . , Q) solves
        min_{y_q}  f(x, y_1, . . . , y_Q)
        s.t.  g(x, y_1, . . . , y_Q) ≤ 0

Several researchers have worked on bilevel optimisation with multiple indepen-


dent followers [13,27]. However, we strengthen this independence condition to
one we call strong independence.

Definition 1. A Bilevel Problem with Multiple Strongly-Independent Followers


(BPMSIF) is one in which:
(i) the followers do not share each others’ follower or leader variables, so that
x can be partitioned into q parts: xq (q = 1 . . . Q);
(ii) follower problems fq (xq , y q ) are allowed to be integer or non-linear;
(iii) variables from different follower problems are not tightly mutually con-
strained.

In (iii) weak constraints such as a single linear inequality are allowed. Thus the
BPMSIF has the form:
    min_{x_1 ... x_Q, y_1 ... y_Q}  F(x_1, . . . , x_Q, y_1, . . . , y_Q)
    s.t.  G(x_1, . . . , x_Q, y_1, . . . , y_Q) ≤ 0
    where each y_q (q = 1, . . . , Q) solves                                   (1)
        min_{y_q}  f_q(x_q, y_q)
        s.t.  g_q(x_q, y_q) ≤ 0
              x_q ∈ X_q,  y_q ∈ Y_q

where F, fq may be any (possibly non-linear) objective functions, G, gq may be


any set of (possibly non-linear) constraints, the G constraints are weak, and
Xq , Yq may be vectors of any variable domains (real, integer, binary, or richer
Constraint Programming domains such as set variables).
Problem (1) satisfies the features of a BPMSIF. Firstly, each follower problem here can be seen to be a function of only its variables y_q and a portion of the leader's variables x_q. Secondly, G(x_1, . . . , x_Q, y_1, . . . , y_Q) ≤ 0 is weak and may, for example, take the form of a simple weighted sum such as Σ_{q=1}^{Q} B_q y_q ≤ b, where the B_q and b are constants. The BPMSIF is different from multi-leader problems such as those of [7,12,21]. Its constraint region is:

Ω = {(x1 , . . . , xQ ,y 1 , . . . , y Q ) ∈ X1 . . . × XQ × Y1 × . . . × YQ :
G(x1 , . . . , xQ , y 1 , . . . , y Q ) ≤ 0, g(xq , y q ) ≤ 0, q = 1, . . . , Q}

The projection of Ω onto the leader’s decision space is:

Ω(X) = {xq ∈ Xq :∃y q ∈ Yq : G(x1 , . . . , xQ , y 1 , . . . , y Q ) ≤ 0,


g(xq , y q ) ≤ 0, q = 1, . . . , Q}

The feasible set for follower q is affected by a corresponding part xq of a given


leader decision vector so that:

Ωq (xq ) = {y q : (xq , y q ) ∈ Ω}

Each follower’s rational reaction set is given as:

Ψq (xq ) = {y q ∈ Yq : y q ∈ argminfq (xq , y q ) | y q ∈ Ωq (xq )}

Finally, the inducible region (IR) is:

IR = {(x_1, . . . , x_Q, y_1, . . . , y_Q) : (x_1, . . . , x_Q, y_1, . . . , y_Q) ∈ Ω, y_q ∈ Ψ_q(x_q), q = 1, . . . , Q}

As in standard bilevel programming min and argmin have been used without
loss of generality: each subproblem may involve maximisation. Note that the
follower problems need not be linear, or even optimisation problems: follower q
can be any algorithm that computes y q from xq .
Classical and evolutionary approaches have been applied to single- and multi-
follower problems. [13] presents a general framework and solutions for nine classes
of multi-follower problem, but none are applicable to the BPMSIF. Classical
methods for multi-follower problems include Kuhn-Tucker approaches [14,15,22]
and branch-and-bound algorithms [16]. The Kth-best approach (or some modifi-
cation) has also been applied to multi-follower problems [23,24,27,28]. [4] refor-
mulate a problem with multiple followers into one with one leader and one fol-
lower, by replacing the lower levels with an equivalent objective and constraint
region. This method also cannot be applied to the BPMSIF, as neither its objec-
tives nor its inducible region are equivalent to the problem class of [4]. Addition-
ally, the methods proposed in [4,13] assume that the followers are linear, which
is not the case with the BPMSIF.
Most classical methods for handling bilevel problems require assumptions of
smoothness, linearity or convexity, while the BPMSIF makes no such assump-
tions. Evolutionary and meta-heuristic techniques also do not make these
assumptions [2,9,11] but most are computationally intensive nested strategies.
They are efficient for smaller problems but do not scale up well to large-scale
problems. In contrast, our approach scales up well as the number of followers
increases (see Sect. 3).

2 An Analytics-Based Decomposition for the BPMSIF


For each follower q a large number S of feasible solutions for the leader vector xq
associated with that follower are generated, using Monte Carlo simulation. To
avoid bias the xq are generated using Hypersphere Point Picking [18,19], which
uniformly samples from a vector space. This results in a set X sq (s = 1 . . . S,
q = 1 . . . Q). The associated follower problems are then solved using the X sq
to obtain a corresponding set of follower vectors Y sq . We now have multiple

potential leader solutions, together with their corresponding follower solutions


for each follower problem fq (xq , y q ).
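As a concrete illustration of this sampling step, here is a minimal sketch of Hypersphere Point Picking in the spirit of [18,19]; the function name, the choice of sampling uniformly inside a ball around a centre point, and the clipping to the box bounds of the leader variables are assumptions made for this example rather than details stated in the paper.

# Minimal sketch (assumed details): draw S leader vectors uniformly from a ball
# around a centre point, then clip to the box bounds of the leader variables.
import numpy as np

def sample_leader_vectors(S, centre, radius, lower, upper, rng=None):
    rng = np.random.default_rng(rng)
    d = centre.size
    # Marsaglia/Muller trick: normalised Gaussian vectors are uniform on the sphere.
    g = rng.normal(size=(S, d))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Scale by U^(1/d) to obtain points uniformly inside the ball, not only on its surface.
    radii = radius * rng.uniform(size=(S, 1)) ** (1.0 / d)
    points = centre + radii * directions
    return np.clip(points, lower, upper)     # keep the samples within the box bounds

# Example: 1000 candidate assignments for a follower owning 2 leader variables.
X_sq = sample_leader_vectors(1000, centre=np.array([5.0, 2.5]),
                             radius=5.0, lower=np.array([0.0, 0.0]),
                             upper=np.array([10.0, 5.0]))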
In order to model and solve the BPMSIF as an Integer Linear Program (ILP),
the large number of potential solutions X sq must be reduced to a manageable
size, which we do via k-medoids clustering [1]. Unlike k-means where centroids
are means of data points, k-medoids selects actual data points. Once the large
set X sq has been clustered using K clusters, the medoids from each cluster are
selected to represent the large generated set. The corresponding Y kq are then
selected from Y sq so that we now have a smaller representative set of assign-
ments X kq and Y kq . The most common algorithm for k-medoid clustering is the
Partitioning Around Medoids (PAM) algorithm [1]. This is inefficient on very
large datasets, so instead we use the CLARA algorithm which is a combination
of PAM and random sampling [10,26]. The BPMSIF can now be transformed
into a standard optimisation problem:
minx 1 ...x Q F (x1 . . . xQ , y 1 . . . y Q )
s.t. G(x1 . . . xQ , y 1 . . . y Q ) ≤ 0
xq = X kq → y q = Y kq (q = 1 . . . Q, k = 1 . . . K)
xq ∈ {X kq | k = 1 . . . K} (q = 1 . . . Q)
y q ∈ {Y kq | k = 1 . . . K} (q = 1 . . . Q)
The constraint xq = X kq → y q = Y kq ensures that if xq is assigned a value
in X kq then y q is assigned the corresponding value in Y kq . This constraint can
either be linearised using the big-M approach, or implemented directly using
CPLEX indicator constraints [8].
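To make the linking constraint concrete, the following is a minimal sketch of the reduced model for a single follower using indicator constraints through the docplex Python API for CPLEX; the variable and function names, the linear leader objective and the data layout (K medoid assignments stored as NumPy arrays) are illustrative assumptions, not the authors' implementation.

# Hedged sketch: pick exactly one medoid assignment (X_k, Y_k) for one follower via
# CPLEX indicator constraints, assuming a linear leader objective for simplicity.
import numpy as np
from docplex.mp.model import Model

def build_single_follower_model(X_med, Y_med, a, b_coef):
    # X_med, Y_med: K x n arrays of clustered leader/follower assignments (medoids).
    K, n = X_med.shape
    mdl = Model(name="bpmsif-one-follower")
    x = mdl.continuous_var_list(n, lb=-mdl.infinity, name="x")
    y = mdl.continuous_var_list(n, lb=-mdl.infinity, name="y")
    u = mdl.binary_var_list(K, name="u")           # u[k] = 1 selects medoid k

    mdl.add_constraint(mdl.sum(u) == 1)            # exactly one representative assignment
    for k in range(K):
        for j in range(n):
            # If u[k] = 1 then x_j and y_j take the k-th medoid values.
            mdl.add_indicator(u[k], x[j] == float(X_med[k, j]))
            mdl.add_indicator(u[k], y[j] == float(Y_med[k, j]))

    # Illustrative linear leader objective a'x + b'y (the real F may be nonlinear).
    mdl.maximize(mdl.sum(a[j] * x[j] + b_coef[j] * y[j] for j in range(n)))
    return mdl

A big-M linearisation would replace each indicator pair with two inequalities of the form x_j − X_med[k, j] ≤ M(1 − u_k) and X_med[k, j] − x_j ≤ M(1 − u_k), as in the large-scale model of Sect. 3.2.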

3 Numerical Examples
To illustrate and evaluate our decomposition approach, we use two example
problems. Monte Carlo simulation and clustering were done in Java and R (using
the CLARA package [17]) respectively. The CPLEX 12.6 solver was also used on
a 3.0 GHz Intel Xeon Processor with 8 GB of RAM.

3.1 A Benchmark Problem


The first problem considered is Example 2 from [3], and is a two-follower problem:
    max  F(x, y_1, y_2) = (200 − y_11 − y_21)(y_11 + y_21) + (160 − y_12 − y_22)(y_12 + y_22)
    s.t.  x_1 + x_2 + x_3 + x_4 ≤ 40
          0 ≤ x_1 ≤ 10,  0 ≤ x_2 ≤ 5,  0 ≤ x_3 ≤ 15,  0 ≤ x_4 ≤ 20
                                                                               (2)
    min  f_1(y_1) = (y_11 − 4)^2 + (y_12 − 13)^2
    s.t.  0.4y_11 + 0.7y_12 ≤ x_1,  0.6y_11 + 0.3y_12 ≤ x_2,  0 ≤ y_11, y_12 ≤ 20

    min  f_2(y_2) = (y_21 − 35)^2 + (y_22 − 2)^2
    s.t.  0.4y_21 + 0.7y_22 ≤ x_3,  0.6y_21 + 0.3y_22 ≤ x_4,  0 ≤ y_21, y_22 ≤ 40

This is a BPMSIF as the followers are strongly independent: the followers do


not share each others’ follower or leader variables, and the follower problem
variables are not mutually constrained. The leader vector x = (x1 , x2 , x3 , x4 ) is
partitioned among the followers with variables (x1 , x2 ) occurring in follower 1
and (x3 , x4 ) in follower 2. The variables y1 = (y11 , y12 ) and y2 = (y21 , y22 ) are
also computed by followers 1 and 2 respectively.
To solve this problem using the analytics-based decomposition method,
denote (x1 , x2 ) by a vector λ1 and (x3 , x4 ) by a vector λ2 . A large number
S of assignments for λ1 and λ2 which satisfy the bounds of the x’s are gener-
ated ([18,19]), and denoted by Λs1 and Λs2 (s = 1 . . . S) respectively. For each
λ1 in Λs1 the corresponding follower problem f1 is solved as an ILP, obtaining
assignments Y s1 ; similarly for Y s2 . Next, the Y s1 vectors are clustered using
k-medoids to get the most diverse set of assignments Y k1 , (k = 1 . . . K). The
Λk1 vectors that correspond to the Y k1 are then selected. The same is done for
Y s2 to obtain Y k2 along with its corresponding Λk2 . Using this decomposition,
problem (2) can now be rewritten as a standard optimisation problem:

    max  F(x, y_1, y_2) = (200 − y_11 − y_21)(y_11 + y_21) + (160 − y_12 − y_22)(y_12 + y_22)
    s.t.  λ_11 + λ_12 + λ_21 + λ_22 ≤ 40
          u_k = 1 → λ_1 = Λ_k1,   u_k = 1 → y_1 = Y_k1     (k = 1 ... K)
          v_k = 1 → λ_2 = Λ_k2,   v_k = 1 → y_2 = Y_k2     (k = 1 ... K)       (3)
          Σ_{k=1}^{K} u_k = 1,   Σ_{k=1}^{K} v_k = 1

where λ11 = x1 , λ12 = x2 , λ21 = x3 and λ22 = x4 . This model can be linearised
using the big-M approach, but the ILP is solved faster when CPLEX indica-
tor constraints are used.1 The binary variables uk and vk ensure that only one
assignment each is selected from Λk1 and Y k1 , and from Λk2 and Y k2 respec-
tively. The λ11 + λ12 + λ21 + λ22 ≤ 40 constraint ensures that an (x1 , x2 ) and
an (x3 , x4 ) that satisfy the original constraints on the x are selected.
In experiments, better solutions were found as K increased, with the highest value of 6594.05 obtained when K = 160, giving x = (8.13, 3.80, 11.23, 16.82),
y 1 = (0.74, 11.20) and y 2 = (28.04, 0.00) (rounded to 2 decimal places). The
clustering time when K = 160 is 234.53 seconds. The solution is 0.09% less than
optimal, but the strength of our approach is in its ability to handle large-scale
problems, as demonstrated next.

3.2 A Large-Scale Problem

In this experiment, a problem with arbitrarily many followers is evaluated. The


problem is also evaluated for the optimistic case in which the followers’ solutions
lead to the best objective function value for the leader.

1
These are a way of expressing if-else relationships among variables [8].

    max  Σ_{q=1}^{Q} a_q x_q + Σ_{q=1}^{Q} b_q y_q
    s.t.  x_q ∈ R^N                                   (q = 1 ... Q)
          x_qn ≤ x_qn^max                             (q = 1 ... Q, n = 1 ... N)
          y_q ∈ argmin  c_q x_q + d_q y_q             (q = 1 ... Q)            (4)
                s.t.  Σ_{n=1}^{N} y_qn ≤ Σ_{n=1}^{N} x_qn        (q = 1 ... Q)
                      e_qn x_qn ≤ y_qn ≤ y_qn^max                (q = 1 ... Q, n = 1 ... N)

where Σ_q a_q x_q = Σ_q Σ_n a_qn x_qn, x and y are the variables controlled by the leader and followers respectively, and Q is the total number of followers. Both x and y are vectors of real numbers. The leader variables are partitioned among the followers such that each follower contains one x_q, and each x_q is of size N. Each component x_qn of the vector x_q is constrained to be at most a given upper bound x_qn^max. a_q, b_q, c_q, d_q and e_q are vectors of constants.
The decomposition approach outlined in Sect. 2 was used to decompose the
problem, which is then written as:
    max  Σ_{q=1}^{Q} Σ_{n=1}^{N} a_qn x_qn + Σ_{q=1}^{Q} Σ_{n=1}^{N} b_qn y_qn
    s.t.  x_qn − X_kqn ≤ M(1 − u_kq)      (k = 1 ... K, q = 1 ... Q, n = 1 ... N)
          X_kqn − x_qn ≤ M(1 − u_kq)      (k = 1 ... K, q = 1 ... Q, n = 1 ... N)
          y_qn − Y_kqn ≤ M(1 − u_kq)      (k = 1 ... K, q = 1 ... Q, n = 1 ... N)
          Y_kqn − y_qn ≤ M(1 − u_kq)      (k = 1 ... K, q = 1 ... Q, n = 1 ... N)
          Σ_{k=1}^{K} u_kq = 1            (q = 1 ... Q)
          u_kq ∈ {0, 1}                   (k = 1 ... K, q = 1 ... Q)

where M is a sufficiently large constant.


Evaluation. The values used for the problem are N = 6, x_qn^min = 0, x_qn^max = 10, y_qn^max = 10 (∀q, n). a_qn, b_qn, c_qn, and d_qn are Gaussian random real variables in [0.0, 15.0), [0.0, 20.0), [−10.0, 10.0) and [−12.0, 12.0) respectively. e_qn is a uniform random real variable in [0.0, 1.0). The number of followers Q was varied between
10–1000, and the problem was solved using both the decomposition approach
(using S = 1000, K = 30 for each follower) and two genetic algorithms, and
the results are shown in Figs. 1 and 2. The first genetic algorithm is the Nested
Bilevel Evolutionary Algorithm (N-BLEA) used in [25]. The second is the Multi-
Follower Genetic Algorithm (MFGA) described in Algorithm 1 and designed for
this problem.
N-BLEA Parameters In order to select the parameters to use, the problem
with 100 followers (Q = 100) was first solved while varying some algorithm
parameters. The number of parents μ and number of offspring λ (μ = λ) were
varied from 3–8. For each of these values, the number of generations (maxGens)
was also varied from 50–200 in steps of 50. This operation was run 10 times for
each value of μ, λ and maxGens, and the average objective value was recorded.
It was seen that the following settings produced the best solutions on average:
μ = λ = 8, number of generations maxGens = 150, tournamentSize = 5,
number of random individuals added to pool r = 2, crossoverP robability = 0.9
and mutationP robability = 0.1. The constraint handling method used by the

Algorithm 1. Genetic Algorithm for multi-follower bilevel problems


1: Generate initial population of size popSize of leader individuals xq ∀Q
2: for each follower q:
3: for each leader individual in population:
4: Solve follower problem to get a population of follower solutions
5: end for
6: end for
7: Calculate fitnessFunction for each member of the population
8: while g < maxGens:
9: Evolve Population:
10: Select elite leader individuals from population
11: Generate new leader individuals using selection, crossover, mutation
12: Add elite and newly generated individuals to create new population
13: for each follower q:
14: for each leader individual in new population:
15: Solve follower problem to get a population of follower solutions
16: end for
17: end for
18: Evaluate fitness of new population
19: g ←g+1
20: end while
21: Return xq ∀Q with best fitness

algorithm is given in [6], and the variance-based termination criterion was set to 0.000001.
MFGA Parameters These were also varied using 100 followers. The popula-
tion size popSize was varied from 30–90, and the maximum number of gener-
ations maxGens from 50–500. The MFGA parameters selected were therefore:
maxGens = 500, popSize = 50. This population size was selected because,
although there is little difference between its objective value and the best objec-
tive at popSize = 80, the difference in time taken is almost 50% less. Uni-
form crossover with a crossover rate of 0.5 (50%) was used. Other parameters
are elitePercentage = 0.20, tournamentSize = 5, mutationRate = 0.015 and fitnessFunction = Σ_{q=1}^{Q} Σ_{n=1}^{N} a_qn x_qn + Σ_{q=1}^{Q} Σ_{n=1}^{N} b_qn y_qn.

Comparing all 3 approaches For both N-BLEA and MFGA, each problem
size was solved 10 times, and the average objective values and solution times
were recorded. It should be noted that the poor performance of N-BLEA is
due to the operation of its crossover operator which is additive in nature, and
frequently violates the bounds of the vectors. This crossover operator results in
offspring which are frequently infeasible, and are thus heavily penalised by the
constraint handling scheme. MFGA was designed to avoid this problem: since
vector generation is done using Hypersphere Point Picking with the appropriate
boundaries, it always produces feasible offspring.

For 10–100 followers, the solution found by the MFGA was better in 7 out
of 10 of the cases, though the decomposition approach finds a close solution
in a fraction of the time (Fig. 1). However, as the problems get larger (Q from
100–1000) the decomposition approach is much better in terms of both solution
quality and runtime (Fig. 2). This demonstrates the scalability of our approach.
Reduction of a very large set of potential solutions to a much smaller (but highly
representative) set using medoids allows the ILP to choose the best solution from
a vast number of possibilities.

Fig. 1. Comparing approaches: objectives and timings for Q = 10 to 100

Fig. 2. Comparing approaches: objectives and timings for Q = 100 to 1000

4 Conclusions
In this paper a new class of multi-follower bilevel problems, the BPMSIF, was
proposed. In this problem no variables are shared between followers, so that
the leader variables can be partitioned among the followers. The follower problems are also allowed to be integer or non-linear, and variables from different
follower problems are only connected through weak constraints. To solve the
BPMSIF an analytics-based decomposition approach was developed, and tested
on two numerical examples. The first example showed that the decomposition
approach is competitive even for small bilevel problems. More importantly, the

second example showed that the decomposition approach is much more scalable
as the number of followers increases, in terms of both runtime and solution
quality. Furthermore, we have applied these techniques to a large-scale real cutting stock problem (forestry harvesting) in [20].

Acknowledgement. This publication has emanated from research conducted with


the financial support of Science Foundation Ireland (SFI) under Grant Number
SFI/12/RC/2289.

References
1. de Amorim, R., Fenner, T.: Weighting Features for Partition Around Medoids
Using the Minkowski Metric, pp. 35–44. Springer, Heidelberg (2012)
2. Angelo, J., Barbosa, H.: Differential evolution to find Stackelberg-Nash equilibrium
in bilevel problems with multiple followers. In: IEEE Congress on Evolutionary
Computation, CEC 2015, Sendai, Japan, May 25–28, 2015, pp. 1675–1682 (2015)
3. Bard, J.: Convex two-level optimization. Math. Program. 40(1), 15–27 (1988)
4. Calvete, H., Galé, C.: Linear bilevel multi-follower programming with independent
followers. J. Glob. Optim. 39(3), 409–417 (2007)
5. Colson, B., Marcotte, P., Savard, G.: An overview of bilevel optimization. Ann.
Oper. Res. 153(1), 235–256 (2007)
6. Deb, K.: An efficient constraint handling method for genetic algorithms. Comput.
Methods Appl. Mech. Eng. 186(2–4), 311–338 (2000)
7. DeMiguel, V., Xu, H.: A stochastic multiple-leader Stackelberg model: analysis,
computation, and application. Oper. Res. 57(5), 1220–1235 (2009)
8. IBM: User’s manual of IBM CPLEX optimizer for z/OS: what is an indicator
constraint? (2017). https://ibmco/2ErnDyn
9. Islam, M., Singh, H., Ray, T.: A memetic algorithm for solving bilevel optimization
problems with multiple followers. In: IEEE Congress on Evolutionary Computa-
tion, CEC 2016, Vancouver, BC, Canada, July 24–29, 2016, pp. 1901–1908 (2016)
10. Kaufman, L., Rousseeuw, P.: Finding Groups in Data: An Introduction to Cluster
Analysis, vol. 344. Wiley (2009)
11. Liu, B.: Stackelberg-Nash equilibrium for multilevel programming with multiple
followers using genetic algorithms. Comput. Math. Appl. 36(7), 79–89 (1998)
12. Lu, J., Han, J., Hu, Y., Zhang, G.: Multilevel decision-making: a survey. Inf. Sci.
346–347(Supplement C), 463 – 487 (2016). https://doi.org/10.1016/j.ins.2016.01.
084, http://www.sciencedirect.com/science/article/pii/S0020025516300202
13. Lu, J., Shi, C., Zhang, G.: On bilevel multi-follower decision making: general frame-
work and solutions. Inf. Sci. 176(11), 1607–1627 (2006)
14. Lu, J., Shi, C., Zhang, G., Dillon, T.: Model and extended Kuhn-Tucker approach
for bilevel multi-follower decision making in a referential-uncooperative situation.
J. Glob. Optim. 38(4), 597–608 (2007)
15. Lu, J., Shi, C., Zhang, G., Ruan, D.: Multi-follower linear bilevel programming:
model and Kuhn-Tucker approach. In: AC 2005, Proceedings of the IADIS Inter-
national Conference on Applied Computing, Algarve, Portugal, February 22–25,
2005, vol. 2, pp. 81–88 (2005)
16. Lu, J., Shi, C., Zhang, G., Ruan, D.: An extended branch and bound algorithm
for bilevel multi-follower decision making in a referential-uncooperative situation.
Int. J. Inf. Technol. Decis. Mak. 6(2), 371–388 (2007)

17. Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.: cluster: Cluster
Analysis Basics and Extensions (2017). R package version 2.0.6—for new features,
see the ‘Changelog’ file (in the package source)
18. Marsaglia, G.: Choosing a point from the surface of a sphere. Ann. Math. Statist.
43(2), 645–646 (1972). https://doi.org/10.1214/aoms/1177692644
19. Muller, M.: A note on a method for generating points uniformly on n-dimensional
spheres. Commun. ACM 2(4), 19–20 (1959)
20. Prestwich, S., Fajemisin, A., Climent, L., O’Sullivan, B.: Solving a Hard Cut-
ting Stock Problem by Machine Learning and Optimisation, pp. 335–347. Springer
International Publishing, Cham (2015)
21. Ramos, M., Boix, M., Aussel, D., Montastruc, L., Domenech, S.: Water
integration in eco-industrial parks using a multi-leader-follower approach.
Comput. Chem. Eng. 87(Supplement C), 190–207 (2016).https://doi.org/10.
1016/j.compchemeng.2016.01.005, http://www.sciencedirect.com/science/article/
pii/S0098135416000089
22. Shi, C., Lu, J., Zhang, G., Zhou, H.: An extended Kuhn-Tucker approach for linear
bilevel multifollower programming with partial shared variables among followers.
In: Proceedings of the IEEE International Conference on Systems, Man and Cyber-
netics, Waikoloa, Hawaii, USA, October 10–12, 2005, pp. 3350–3357 (2005)
23. Shi, C., Zhang, G., Lu, J.: The Kth-best approach for linear bilevel multi-follower
programming. J. Glob. Optim. 33(4), 563–578 (2005)
24. Shi, C., Zhou, H., Lu, J., Zhang, G., Zhang, Z.: The Kth-best approach for linear
bilevel multifollower programming with partial shared variables among followers.
Appl. Math. Comput. 188(2), 1686–1698 (2007)
25. Sinha, A., Malo, P., Frantsev, A., Deb, K.: Finding optimal strategies in a multi-
period multi-leader-follower Stackelberg game using an evolutionary algorithm.
Comput. Oper. Res. 41, 374–385 (2014)
26. Wei, C.P., Lee, Y.H., Hsu, C.M.: Empirical comparison of fast clustering algo-
rithms for large data sets. In: Proceedings of the 33rd Annual Hawaii International
Conference on System Sciences, pp. 10-pp. IEEE (2000)
27. Zhang, G., Lu, J.: Fuzzy bilevel programming with multiple objectives and coop-
erative multiple followers. J. Glob. Optim. 47(3), 403–419 (2010)
28. Zhang, G., Shi, C., Lu, J.: An extended Kth-best approach for referential-
uncooperative bilevel multi-follower decision making. Int. J. Comput. Intell. Syst.
1(3), 205–214 (2008)
KMCGO: Kriging-Assisted Multi-objective
Constrained Global Optimization

Yaohui Li1, Yizhong Wu2, Yuanmin Zhang1(✉), and Shuting Wang2

1 Xuchang University, 461000 Xuchang, China
zhangyuanmin2006@163.com
2 Huazhong University of Science and Technology, 430074 Wuhan, China

Abstract. Kriging-based single-objective optimization has limited the application of some engineering design problems. The main challenge is how to develop a method that can improve convergence accuracy and reduce time cost under the conditions of parallel simulation estimation. For
this purpose, a Kriging-assisted multi-objective constrained global optimization
(KMCGO) algorithm is proposed. In KMCGO, Kriging models of expensive
objective and constraint functions are firstly constructed or updated with the
sampled data. And then, the objective, root mean square error and feasibility
probability, which will be predicted by Kriging models, are used to construct
three optimization objectives. After optimizing the three objectives by the
NSGA-II solver, the new sampling points produced by the Pareto optimal
solutions will be further screened to obtain better design points. Finally, four
numerical tests and a design problem are checked to illustrate the feasibility,
stability and effectiveness of the proposed method.

Keywords: Constrained global optimization · Surrogate models · Kriging · Infill search criterion · Multi-objective optimization

1 Introduction

Beyond Kriging-based single-objective optimization methods, researchers are increasingly attracted to Kriging-assisted multi-objective optimization. The traditional MOGA [1] is based on a nondominated sorting genetic algorithm combined with an elitism strategy, but the responses of the initial population are calculated by high-fidelity models.
Luckily, the Kriging-based optimization can employ current predicted information
to construct or define useful optimization objectives, which can then be used to predict
new promising sampling points at smaller evaluation cost [2]. Li et al. [3] proposed a
new K-MOGA by embedding Kriging to enhance MOGAs. The key difference is that in K-MOGA solutions are adaptively evaluated using Kriging instead of the real functions. Koziel et al. [4, 5] respectively used Kriging and cheap co-Kriging to construct multi-objective evolutionary optimization methods applied to antenna structure design.
Most of the above methods use multi-objective evolutionary algorithms to optimize the Kriging models. However, few methods deeply explore the potential predictive functions and trends of the Kriging model itself and make full use of this information to reduce time cost. To this end, Kriging with EI and feasibility probability is


applied to a constrained multi-objective optimization problem to balance the local and


global search [6]. Durantin et al. [7], who used EI, probability of feasibility and predictive RMSE, proposed a new multi-objective constrained optimization approach.
To further improve optimization performance, a Kriging-assisted multi-objective
constrained global optimization (KMCGO) algorithm is proposed. It mainly focuses on
solving the following constrained optimization problem:

    min f(x),   a ≤ x ≤ b,  x ∈ R^n,   s.t.  g_i(x) ≤ 0,  i = 1, . . . , q    (1)

where f, g_1, . . . , g_q are deterministic, continuous and expensive black-box functions; the feasible region is D = {x ∈ [a, b] : g_i(x) ≤ 0}, and q is the number of constraints. Equality constraints in problem (1) can be specified as an upper-bounded or lower-bounded inequality with the same constraint functions and bounds. It is assumed that a relatively relaxed feasibility tolerance is acceptable, since expensive-simulation problems are being addressed. The objective function and the inequality constraints are all approximated by Kriging models.
In Sect. 2, background of Kriging is reviewed. Details of KMCGO are presented in
Sect. 3. Existing methods for constrained optimization are then compared on some
benchmarks and a design problem in Sect. 4. The conclusions are drawn in Sect. 5.

2 Kriging Model

For design data X = [x_1, . . . , x_m]^T, X ∈ R^{m×n}, and responses Y = [y_1, . . . , y_m]^T, Y ∈ R^{m×1}, the Kriging model [8] expressing an unknown function y(x) is

    y(x) = Fβ + Z(x).    (2)

The weighting coefficient β of the regression functions F is a p-dimensional vector. Combined with the correlation parameter θ and the process variance σ^2, the characteristics of the random process Z(x) can be described by

    E[Z(x)] = 0,    E[Z(x)Z(x′)] = σ^2 R(θ, x, x′).    (3)

As a spatial correlation function model, R(θ, x, x′) is shown in Eq. (4).

    R(θ, x, x′) = ∏_{i=1}^{n} R_i(θ_i, x_i − x_i′).    (4)

The matrix F (F ∈ R^{m×p}) and the correlation matrix R (R ∈ R^{m×m}) are composed of the regression function f(x) and the spatial correlation function R(θ, x, x′), respectively, evaluated at the sampled points. Based on unbiased estimator theory, the regression problem Fβ ≈ Y has the generalized least squares solution β̂ = (F^T R^{−1} F)^{−1} F^T R^{−1} Y and the maximum likelihood estimate of the variance

σ̂^2 = (Y − Fβ̂)^T R^{−1} (Y − Fβ̂)/m. Therefore, the Kriging predictor at any point can be expressed as

    ŷ(x) = Fβ̂ + r^T(x) ĉ,    (5)

where r^T(x) = [R(θ, x, x_1), . . . , R(θ, x, x_m)] and ĉ = R^{−1}(Y − Fβ̂). The predicted RMSE [9] of the Kriging model at an unobserved point can be calculated by
    ŝ(x*) = sqrt(MSE[Y(x*)]) = sqrt( σ̂^2 { 1 − [f(x*)^T, r(x*)^T] [0, F^T; F, R]^{−1} [f(x*); r(x*)] } ),    (6)

where [· ; ·] denotes vertical stacking and [0, F^T; F, R] is the corresponding 2×2 block matrix.
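For concreteness, the sketch below fits a surrogate of this kind and queries both the prediction and an RMSE-like uncertainty; it uses scikit-learn's GaussianProcessRegressor with a Gaussian (RBF) kernel as a stand-in for the Kriging model of Eqs. (2)-(6), which is an assumption made for illustration, since the paper does not state an implementation.

# Minimal sketch: a Gaussian-process surrogate (stand-in for Kriging) returning
# the prediction y_hat(x) and a standard-deviation estimate s_hat(x).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

def fit_surrogate(X, Y):
    # Gaussian (squared-exponential) correlation with a tunable length-scale per dimension.
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X.shape[1]))
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gpr.fit(X, Y)

def predict_with_rmse(gpr, X_new):
    y_hat, s_hat = gpr.predict(X_new, return_std=True)   # mean and uncertainty
    return y_hat, s_hat

# Example with toy data (5 samples of a 2-D function).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5, 2))
Y = np.sin(X[:, 0]) + X[:, 1] ** 2
model = fit_surrogate(X, Y)
y_hat, s_hat = predict_with_rmse(model, rng.uniform(0.0, 1.0, size=(3, 2)))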

3 KMCGO Method

To further enhance the Kriging-based multi-objective optimization performance, three issues are considered: (1) the Kriging objective prediction and the corresponding variance are both considered to achieve a better trade-off; (2) the construction of the Kriging model might fail when infill sampling points are very close to each other, due to the singularity of the covariance matrix R; (3) the new sampling points produced by the Pareto optimal front set should be further screened to obtain more potential sampling points. In view of the above analysis, we introduce a new infill sampling criterion with three optimization objectives in the proposed KMCGO method to improve the global optimization performance on black-box constrained problems.

3.1 The Construction of Three Optimization Objectives

The first optimization objective. Kriging is an interpolation model, which might result in highly oscillating behavior of the prediction. When any two sampling points are very close, an ill-conditioned or singular covariance matrix occurs, which might directly lead to construction failure of the Kriging model. The nugget effect [10] can improve this condition to a certain degree in such cases. To avoid this situation, the first optimization objective is described by Eq. (12).

    min f̂(x) − d^e,   s.t.  a ≤ x ≤ b    (12)

where f̂(x) at point x is predicted by Kriging, and the non-negative exponent e can be adaptively selected from {e_1, ..., 0.001, 0.01, 0.1, 1, 10, ..., e_k} according to the iteration number in each cycle. The parameter d is the distance factor, which is defined as follows:
    d = (d_max − d′_min)/‖b − a‖,         if f̂(x) > 0
    d = 1 − (d_max − d′_min)/‖b − a‖,     if f̂(x) ≤ 0    (13)

In Eq. (13), the parameter d_max = max{d : x ∈ X} is the maximum distance among the sampled design data X = [x_1, . . . , x_m]. The d′_min = min(‖x − x_1‖, . . . , ‖x − x_m‖) is the shortest distance between the untried point x and the set X = [x_1, . . . , x_m]^T.
The second optimization objective. In the actual optimization, sampling points which lie precisely on the approximate constraint boundaries may be infeasible. But points which lie in the approximate feasible area and have a tiny deviation from the constraint boundary are likely to be feasible in most instances. We define g_max = max[g_1(x), . . . , g_q(x)] as the maximum constraint violation. If g_max is less than or equal to zero, we consider that all constraints have been met. In KMCGO, we use Kriging to approximate all constraint functions, and the probability P [11] of meeting all constraints can be written as

    P(g_max ≤ 0) = 1 − Φ(ĝ_max/ŝ_max),    (14)

where Φ(·) is the normal cumulative distribution function, ĝ_max is the largest of all Kriging constraint predictions, and ŝ_max can be calculated by Eq. (6). To ensure the feasibility of the sampling point, we should maximize the feasibility probability P, i.e.,

    max P,   s.t.  a ≤ x ≤ b    (15)

The third optimization objective. A large estimated RMSE of the objective is


favorable to explore more promising regions and further increase the probability of
obtaining a global optimum. In addition, we should also pay attention to the estimated RMSE of the constraint functions. If the global optimal point lies in a certain area
and the predicted variance is high in this area, it may prevent the optimization process
from finding the optimal solution with sufficient confidence. Therefore, reducing the
estimated RMSE of the active constraint functions can also help us explore an
approximate global optimal value which is closer to the actual constraint boundaries.
Therefore, we define the third optimization objective by Eq. (16) based on the above analysis.

    max [ ŝ_f(x) − Σ_{i=1}^{q} ŝ_{g_i}(x) ]    (16)

Finally, the three optimization objectives can be produced by Eq. (17).

    min { f̂(x) − d^e },   min { −P },   min { Σ_{i=1}^{q} ŝ_{g_i}(x) − ŝ_f(x) },   a ≤ x ≤ b    (17)
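Purely as an illustration, the sketch below evaluates the three scalar objectives of Eq. (17) for one candidate point given already-fitted surrogates; predict_f and predict_g (returning the Kriging mean and RMSE of the objective and constraints) and distance_factor (implementing Eq. (13)) are assumed helper functions, not code from the paper.

# Hedged sketch of the three scalar objectives in Eq. (17) for one candidate x.
import numpy as np
from scipy.stats import norm

def three_objectives(x, predict_f, predict_g, distance_factor, e=1.0):
    f_hat, s_f = predict_f(x)
    g_hat, s_g = predict_g(x)                 # arrays of length q

    obj1 = f_hat - distance_factor(x) ** e    # objective prediction adjusted by the distance factor

    # Probability that the largest constraint prediction is non-positive (Eq. (14)).
    i_max = int(np.argmax(g_hat))
    p_feasible = 1.0 - norm.cdf(g_hat[i_max] / max(s_g[i_max], 1e-12))
    obj2 = -p_feasible                        # maximise P  ->  minimise -P

    obj3 = np.sum(s_g) - s_f                  # Eq. (16) with the sign flipped for minimisation
    return np.array([obj1, obj2, obj3])       # all three are minimised, e.g. by NSGA-II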

3.2 Selection of New Expensive-Evaluation Points


The three objectives in Eq. (17) are optimized by the NSGA-II solver to generate the Pareto front. We assume the data set X′ = {x_1, . . . , x_j} is composed of the Pareto front. Then, the data X and X′ are standardized. The screening process is as follows.
Step 1: For a candidate point x_i in the Pareto optimal-frontier set, if P(g_max(x_i) ≤ 0) is greater than or equal to 99%, we regard the sampling point as feasible and accept it. If the number v of such sampling points is greater than or equal to 4n, jump to Step 3. The reason is explained as follows.
Firstly, feasibility is the premise and key to constrained optimization. Although the expression P(g_max(x_i) ≤ 0) is one of the optimization objectives, multi-objective optimization pays more attention to the reciprocity of these objectives. Therefore, it is necessary to require P(g_max(x_i) ≤ 0) ≥ 99% here.
Step 2: If there is no such candidate, we will choose the w sampling points which are closest to 99% as the further screening points. It is noted that the condition w + v = 4n needs to be ensured.
Step 3: To further ensure the successful construction of the Kriging models, the distance between the newly acquired expensive sampling points and the sampled data should not be too small. The minimum distance d_min = ‖x_m − x_n‖_2, m ≠ n, is calculated between any two sampled points. We also calculate the minimum distance d′_min = ‖x_i − x_k‖_2, x_k ∈ X, between x_i (i ∈ 1, . . . , j) and all sampling points. According to d_min and d′_min, the objective-distance-improvement indices d = |y_m − y_n|/d_min and d′ = |ŷ_i − y_k|/d′_min can be obtained. If d′ > d, x_i will be accepted; otherwise, x_i is abandoned.
Step 4: Calculate the value {f̂(x) − ŝ_f(x)} for every selected point. To balance exploration and exploitation, we hope that a sampling point not only has a smaller objective function value but also has a larger RMSE. Then we sort these values from small to large and select the first n points as the final candidates (a sketch of Steps 3 and 4 is given below).
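The following is a minimal sketch of the distance-improvement screening and the final ranking of Steps 3 and 4, under the simplifying assumption that the Pareto candidates, their Kriging predictions f̂ and ŝ_f, and the sampled data (X, Y) are available as NumPy arrays; all names are illustrative.

# Hedged sketch of Steps 3-4: distance-improvement screening, then ranking by
# f_hat - s_f and keeping the first n candidates.
import numpy as np
from scipy.spatial.distance import cdist, pdist

def screen_candidates(cand_X, cand_f_hat, cand_s_f, X, Y, n_keep):
    # Reference quantities from the sampled data (closest pair of sampled points).
    pair_d = pdist(X)                        # pairwise distances between sampled points
    pair_dy = pdist(Y.reshape(-1, 1))        # pairwise objective differences
    i_ref = int(np.argmin(pair_d))
    d_min = pair_d[i_ref]
    d_index = pair_dy[i_ref] / d_min         # objective-distance-improvement index d

    kept = []
    for i, x_i in enumerate(cand_X):
        dists = cdist(x_i[None, :], X).ravel()
        k = int(np.argmin(dists))
        d_min_prime = dists[k]
        d_prime = abs(cand_f_hat[i] - Y[k]) / max(d_min_prime, 1e-12)
        if d_prime > d_index:                # accept only if the index improves
            kept.append(i)

    # Step 4: sort accepted candidates by f_hat - s_f (small value and large RMSE first).
    kept = sorted(kept, key=lambda i: cand_f_hat[i] - cand_s_f[i])
    return cand_X[kept[:n_keep]]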

3.3 Exploration on Promising Area


How to jump out of a local optimal region should be addressed when a feasible sampling point cannot be found after 2*n expensive evaluations. If the number k of current feasible sampling points {x_1, . . . , x_k} is greater than 2*n, we consider that the optimization objective has a large feasible space, and jumping out of the current local optimal region may be a wise choice. To this end, we use the point x* = (x_1 + · · · + x_k)/2 = (x*(1), x*(2)) as a center, and increase or decrease the value d_mean = (d_max − d_min)/2 along each coordinate axis to generate 2n + 1 new points. For a 2D problem, the new points are respectively (x*(1), x*(2)), (x*(1) ± d_mean, x*(2)) and (x*(1), x*(2) ± d_mean). If x*(1) − d_mean is lower than a(1), we will use a(1) instead of x*(1) − d_mean; the same applies to the other three cases. Then a final sampling point x_final with good feasibility and a small objective estimation value is chosen from the new sampling data. Meanwhile, x_final is expensively evaluated and added to the sample set {S, Y}, which will be used to reconstruct the Kriging models and perform the next iterative optimization.
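A minimal sketch of this perturbation scheme is given below; the center point x_star and the step d_mean are taken as inputs, and the final choice of x_final from the 2n + 1 candidates (feasibility probability of at least 99% first, then the smallest predicted objective) is one possible reading of "good feasibility and small objective estimation value", stated here as an assumption.

# Hedged sketch: generate the 2n+1 axis-wise perturbations of a center point,
# clip them to the box [a, b], and pick the most promising one as x_final.
import numpy as np

def explore_promising_area(x_star, d_mean, a, b, predict_f, feas_prob):
    n = x_star.size
    candidates = [x_star.copy()]
    for i in range(n):
        for step in (+d_mean, -d_mean):
            x_new = x_star.copy()
            x_new[i] += step
            candidates.append(np.clip(x_new, a, b))   # replace out-of-bound coordinates
    candidates = np.array(candidates)                 # 2n + 1 points in total

    # One possible selection rule: feasible candidates (P >= 0.99) first, then smallest f_hat.
    probs = np.array([feas_prob(x) for x in candidates])
    f_hat = np.array([predict_f(x)[0] for x in candidates])
    order = np.lexsort((f_hat, -(probs >= 0.99)))
    return candidates[order[0]]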

3.4 The Specific Implementation Flows


Input and output of the KMCGO method for problem (1) are shown in Table 1. The flowchart of the KMCGO algorithm is shown in Fig. 1. The specific implementation steps of the KMCGO algorithm are as follows.

Fig. 1. Flowchart of the KMCGO algorithm: parameter initialization and initial design of experiments, construction or update of the Kriging models from the sampled design points, construction of the three optimization objectives from the predicted objective, mean square error and probability of feasibility, NSGA-II optimization and selection of new expensive-evaluation points, exploration on a promising area when no feasible point is found after 2*n expensive evaluations, and termination with the global approximate optimal solution once the convergence conditions are satisfied.

Step 1. Parameter initialization.


Step 2. Initial design of experiments. Latin Hypercube Design [12] is used to get 2*(n + 3) initial sampling points for an n-dimensional problem (a sketch is given after this list). Then, the expensive function f(x) is evaluated at each sampling point x to obtain the sample data {S, Y}. We divide {S, Y} into a feasible data set {Sfea, Yfea} and an infeasible data set {Sinfea, Yinfea}, and initialize y_best = f(x_best) from {Sfea, Yfea}.
Step 3. Update of Kriging model. The latest sample data {S, Y} is used to update or
construct approximate Kriging models of objective and constraints.
Step 4. Build three optimization objectives. Three optimization objectives are built according to Eq. (17).
Step 5. Multi-objective optimization. The three objectives are optimized by the NSGA-II solver to generate Pareto optimal solutions.
Step 6. Selection of new expensive-evaluation points. The specific process is shown
in Sect. 3.2.
Step 7. Determine whether a feasible point is found. If one or more feasible points are found after 2*n + 1 expensive evaluations, jump to Step 9; otherwise, perform Step 8.
Step 8. Exploration on promising area. The specific process is shown in Sect. 3.3.

Table 1. Input and output of the KMCGO method.

Input:  Deterministic black-box function f(x) and constraints g_1(x), . . . , g_q(x); parameters a ≤ x ≤ b, x ∈ R^n.
        Initial sample X = [x_1, . . . , x_m]^T ∈ [a, b]. A Kriging model with Gaussian correlation function.
        e = {0.0001, 0.001, 0.01, 0.1, 1, 10, 100}. Maximum expensive evaluation number N_max = 10 · (2n + 1).
Output: The approximate optimal point (x_best, y_best) obtained by KMCGO.

Step 9. Stop criterion. Judge whether the convergence conditions are met; if they are met, jump to Step 11, otherwise continue with Step 10.
Step 10. Expensive function evaluation of new data. Perform expensive function
evaluations for the selected new sampling points.
Step 11. End. End the process and output global approximate solution (xbest, ybest).
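As an illustration of the initial design of experiments in Step 2, here is a minimal sketch using SciPy's quasi-Monte Carlo module; the use of scipy.stats.qmc and the helper function shown are assumptions for this example, since the paper only specifies that a Latin Hypercube Design with 2*(n + 3) points is used.

# Hedged sketch of Step 2: Latin Hypercube sampling of 2*(n+3) initial points in [a, b],
# followed by expensive evaluations of the objective and constraints.
import numpy as np
from scipy.stats import qmc

def initial_design(f, constraints, a, b, seed=0):
    n = len(a)
    sampler = qmc.LatinHypercube(d=n, seed=seed)
    unit = sampler.random(n=2 * (n + 3))            # 2*(n+3) points in [0, 1]^n
    S = qmc.scale(unit, a, b)                       # rescale to the design domain [a, b]

    Y = np.array([f(x) for x in S])                 # expensive objective evaluations
    G = np.array([[g(x) for g in constraints] for x in S])
    feasible = np.all(G <= 0.0, axis=1)             # split into feasible / infeasible sets
    return S, Y, G, feasible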

4 Test

To verify the performance of the KMCGO method, a series of tests, including four benchmark numerical functions [13] (G4, G7, G8 and G9) and a speed reducer design problem [14], have been executed. This section gives the test results of KMCGO and makes comparisons with the results of SCGOSR [15], TOKCGO [7] and KCGO [11].

4.1 Numerical Test


Table 2 gives the basic information of the benchmarks used in these tests.
To illustrate the characteristics of the proposed method, the final sampling results and the Pareto front of the G8 function are shown in Fig. 2. From this figure, we can draw the following conclusions: (1) In the iterative optimization process after initial sampling, the second of the three optimization objectives makes the feasible

Table 2. Description of functions. The abbreviations for test problem, number of design
variables, number of constraints, the known optimum, the given relative error (GRE) and
maximum expensive evaluation number are TP, NODV, NOC, TKO, GRE and MEEN
respectively.
TP   NODV  NOC  Bound constraint                         TKO          GRE    MEEN
G4   5     6    [78, 102] × [33, 45] × [27, 45]^3        −30665.539   1e−5   50
G7   10    8    [−10, 10]^10                             24.30621     0.01   150
G8   2     2    [0, 10]^2                                −0.095825    1e−4   50
G9   7     4    [−10, 10]^7                              680.630057   0.5    150

sampling points or infeasible sampling points gather near the constraint boundary, which will greatly improve the feasibility of new sampling points; (2) The remaining two of the three optimization objectives and the screening method can make new sampling points closer to the real optimal solution in the feasible domain.
Furthermore, Figs. 3, 4, 5, 6 and 7 provide the iterative results of KMCGO. For low dimensional problems (G4 and G8), LHD can find a feasible point in most cases, but for high dimensional problems a feasible point is usually found only during the iterative optimization. The optimization result of G4 converges slightly less well, while G7, G8,

Fig. 2. Infill Sampling of KMCGO on G8. Fig. 3. Iteration process of G4 function.

Fig. 5. Iteration process of G8 function.


Fig. 4. Iteration process of G7 function.

Fig. 6. Iteration process of G9 function. Fig. 7. Iteration process of SRD problem.

and G9 have better convergence. This is closely related to the expression of the constraints and target problems. Therefore, KMCGO is suitable for constrained optimization.

4.2 Speed Reducer Design Problem


For a speed reducer structure [14], the iteration process is shown in Fig. 7. For the initial design, we did not directly obtain feasible points through LHD; they had to be obtained during the iterative optimization process of the KMCGO method. A better global optimal solution that meets the given requirements can be found after a certain number of evaluations. The test result of the SRD problem shows good convergence, stability and effectiveness. To sum up, the optimization performance of KMCGO is satisfactory in most cases.

4.3 Comparison
The comparisons are shown in Table 3. Three conclusions may be drawn: (1) LHD usually fails to obtain feasible points in most cases; (2) For the test functions G7, G8 and G9 with larger feasible exploration regions, it is necessary to perform more function evaluations (especially for G7 and G9) in many cases; even so, the approximate optimal solution found in some cases is not very good; (3) The comparison results show that KMCGO has better convergence behavior in contrast with the other three methods.

Table 3. Comparison results. MEEN, AOA, DTM and RRE stand for mean expensive evaluation number, approximate optimum area, distance to minimizer and real relative error, respectively.

TF   Dim  Method    MEEN   AOA                   DTM                  RRE
G4   5    KMCGO     44.6   −30665.472 ± 0.052    [0.015, 0.119]       [4.9e−7, 3.9e−6]
          SCRGOSR   53.9   −30665.463 ± 0.064    [0.012, 0.140]       [3.9e−7, 4.6e−6]
          TOKCGO    46.5   −30665.475 ± 0.043    [0.021, 0.127]       [6.8e−7, 4.1e−6]
          KCGO      32.7   −30665.480 ± 0.035    [0.024, 0.094]       [7.8e−7, 3.1e−6]
G7   10   KMCGO     124.8  24.5046 ± 0.192       [0.00639, 0.1984]    [2.63e−4, 8.16e−3]
          SCRGOSR   178.2  24.6559 ± 0.314       [0.00869, 0.69069]   [3.58e−4, 0.02842]
          TOKCGO    130.1  24.5878 ± 0.266       [0.01559, 0.54759]   [6.41e−4, 0.02253]
          KCGO      136.5  24.3139 ± 0.046       [0.00309, 0.01239]   [1.27e−4, 5.10e−4]
G8   2    KMCGO     46.2   −0.0958 ± 0.000024    [1e−6, 0.000049]     [1.0e−5, 5.1e−4]
          SCRGOSR   51.8   −0.0958 ± 0.000013    [12e−6, 0.000038]    [1.2e−5, 4.0e−4]
          TOKCGO    45.9   −0.0958 ± 0.000021    [4e−6, 0.000046]     [4.2e−5, 4.8e−4]
          KCGO      47.4   −0.0958 ± 0.000020    [5e−6, 0.000045]     [5.2e−5, 4.7e−4]
G9   7    KMCGO     112.8  827.678 ± 85.57       [61.48, 255.15]      [0.0903, 0.3418]
          SCRGOSR   115.6  904.08 ± 77.78        [140.42, 294.98]     [0.2044, 0.4294]
          TOKCGO    124.4  839.195 ± 96.59       [61.98, 255.15]      [0.0916, 0.3749]
          KCGO      165.9  910.49 ± 67.96        [155.64, 291.56]     [0.2266, 0.4245]
SRD  7    KMCGO     77.3   2995.36 ± 0.95        [0.01, 1.89]         [3.34e−6, 6.31e−4]
          SCRGOSR   88.1   2996.15 ± 1.65        [0.08, 3.38]         [2.68e−5, 1.13e−3]
          TOKCGO    79.6   2996.25 ± 1.07        [0.76, 2.90]         [2.54e−4, 9.68e−4]
          KCGO      136.5  24.3139 ± 0.046       [0.00309, 0.01239]   [1.27e−4, 5.10e−4]

5 Conclusions

In this work, the KMCGO method is proposed to solve black-box constrained optimization problems. We mainly use the estimated objective information, the predictive mean square error and the probability of feasibility calculated by the Kriging models to create three optimization objectives, which are optimized by the NSGA-II solver to produce the Pareto optimal front set. Further, sampling points of the Pareto optimal front set are screened to obtain better design points. Four numerical tests and a design problem are checked to demonstrate the features of KMCGO. Future work will address how to apply it to high-dimensional design optimization problems with transformations of the Kriging model.

Acknowledgements. This work is supported by the National Natural Science Foundation of


China (No. 51775472, No. 51675197, No. 51575205).

References
1. Deb, K.: Multi-objective Optimization Using Evolutionary Algorithms. Wiley, New Jersey
(2001)
2. Knowles, J.: ParEGO: a hybrid algorithm with on-line landscape approximation for
expensive multiobjective optimization problems. IEEE Trans. Evol. Comput. 10(1), 50–66
(2006)
3. Li, M., Li, G., Azarm, S.: A Kriging metamodel assisted multi-objective genetic algorithm
for design optimization. J. Mech. Des. 130(3), 031401-031401-10 (2008)
4. Koziel, S., Ogurtsov, S.: Multi-objective design of antennas using variable-fidelity
simulations and surrogate models. IEEE Trans. Antennas Propag. 61(12), 5931–5939 (2013)
5. Koziel, S., Bekasiewicz, A., Couckuyt, I., Dhaene, T.: Efficient multi-objective simulation-
driven antenna design using co-Kriging. IEEE Trans. Antennas Propag. 62(11), 5900–5905
(2014)
6. Jeong, S., Yamamoto, K., Obayashi, S.: Kriging-based probabilistic method for constrained
multi-objective optimization problem. In: AIAA 1st Intelligent Systems Technical Confer-
ence, pp. 1–12 (2004)
7. Durantin, C., Marzat, J., Balesdent, M.: Analysis of multi-objective Kriging-based methods
for constrained global optimization. Comput. Optim. Appl. 63(3), 903–926 (2016)
8. Martin, J.D.: Computational improvements to estimating Kriging metamodel parameters.
J. Mech. Des. 131(8), 084501-084501-7 (2009)
9. Kleijnen, J.P.: Regression and Kriging metamodels with their experimental designs in
simulation: a review. Eur. J. Oper. Res. 256(1), 1–16 (2017)
10. Ranjan, P., Haynes, R., Karsten, R.: A computationally stable approach to gaussian process
interpolation of deterministic computer simulation data. Technometrics 53(4), 366–378
(2011)
11. Li, Y., Wu, Y., Zhao, J., Chen, L.: A Kriging-based constrained global optimization
algorithm for expensive black-box functions with infeasible initial points. J. Global Optim.
67(1–2), 343–366 (2017)
12. Tang, B.: Latin Hypercube Designs. Encyclopedia of Statistics in Quality and Reliability.
Wiley, New Jersey (2008)
13. Mezura-Montes, E., Cetina-Domínguez, O.: Empirical analysis of a modified artificial bee
colony for constrained numerical optimization. Appl. Math. Comput. 218(22), 10943–10973
(2012)
14. Azarm, S., Li, W.C.: Multi-level design optimization using global monotonicity analysis.
J. Mech. Transm. Autom. Des. 111(2), 259–263 (1989)
15. Dong, H., Song, B., Dong, Z., Wang, P.: SCGOSR: surrogate-based constrained global
optimization using space reduction. Appl. Soft Comput. 65, 462–477 (2018)
Multistage Global Search Using Various
Scalarization Schemes in Multicriteria
Optimization Problems

Victor Gergel(B) and Evgeniy Kozinov

Lobachevsky State University of Nizhni Novgorod, Nizhni Novgorod, Russia


gergel@unn.ru, evgeny.kozinov@itmm.unn.ru

Abstract. In this paper, an approach in which decision making problems are reduced to solving time-consuming multicriteria global optimization problems is proposed. The developed approach includes var-
ious methods of scalarization of the vector criteria, the dimensionality
reduction with the use of the Peano space-filling curves and the efficient
global search algorithms. In the course of computations, the optimization
problem statements and the applied methods of the criteria scalarization
can be altered in order to achieve more complete compliance to available
requirements to the optimality. The overcoming of the computational
complexity of the developed approach is provided by means of the reuse
of the whole search information obtained in the course of computations.
The performed numerical experiments have confirmed the reuse of the
search information to allow reducing essentially the amount of computa-
tions for solving the global optimization problems.

Keywords: Decision making · Multicriteria optimization · Criteria


scalarization · Global optimization with nonlinear constraints ·
Numerical experiment

1 Introduction
The multicriteria optimization (MCO) problems, which are used as the statements of decision making problems, are often the subject of extensive research
– see, for example, the monographs [1–3] and the reviews of the scientific and
applied results in this area [4,5].
Usually, the solution of an MCO problem is understood as finding efficient
(non-dominated) decisions, i.e., decisions in which the value of any criterion
cannot be improved without worsening the value of some other criterion. In the
most general case, solving an MCO problem may require obtaining the complete
set of efficient decisions (the Pareto set). However, finding all efficient
decisions may require a considerable amount of computation, and the set of
obtained decisions may be quite large. As a result, approaches that obtain a
more limited set of efficient decisions are applied more widely. Among such
approaches are various kinds of criteria convolutions, lexicographic
optimization methods, algorithms searching for the best approximation to given
prototypes, etc. All of these methods allow accounting for the specific
features of the MCO problem being solved and for the optimality requirements
of the decision maker.

(This research was supported by the Russian Science Foundation, project
No. 16-11-10150 "Novel efficient methods and software tools for time-consuming
decision making problems using supercomputers of superior performance.")
The present work is devoted to solving MCO problems that describe complex
decision making problems in which the efficiency criteria may be multiextremal
and computing the values of the criteria and constraints may require a large
amount of computations. It is also assumed that in the course of computations
it is possible to change the problem statement, the methods, and the parameters
of solving the MCO problem, which makes it necessary to solve the global
optimization problems repeatedly.
The practical use of this approach implies overcoming the considerable
computational complexity of the decision making problems, which can be achieved
by means of highly efficient global optimization methods and the complete
utilization of the search information obtained in the course of computations.
In the present paper, the results of investigations on the generalization of
the decision making problem statements [6,7] and on the development of the
highly efficient global optimization methods utilizing the whole search informa-
tion obtained in the course of computations [8–10] are presented.
Further structure of the paper is as follows. In Sect. 2, the statement of the
decision making problems based on multistage multicriteria global optimiza-
tion are presented. In Sect. 3, a general scheme of the criteria scalarization
is proposed. In Sect. 4, the search information obtained in the course of com-
putations is considered. In Sect. 5, an efficient algorithm for solving the time-
consuming global optimization problems with the nonlinear constraints is pre-
sented. Section 6 presents the results of numerical experiments confirming the
developed approach to be promising. In Conclusion, the obtained results are
discussed and possible main directions of further research are outlined.

2 Multistage Multicriteria Optimization Problem


Statement
For the formal description of the process of solving the complex decision making
problems, the following generalized two-phase scheme is proposed.
1. In the most general form, a decision making problem is defined by means of
a vector function of characteristics

w(y) = (w1 (y), w2 (y), . . . , wM (y)), y ∈ D (1)



where y = (y1 , y2 , . . . , yN ) is the vector of varied parameters and D ⊂ RN is


the search domain, which is usually an N -dimensional hypercube

D = {y ∈ RN : ai ≤ yi ≤ bi , 1 ≤ i ≤ N } (2)

for given vectors a and b.

It is supposed, that the values of characteristics w(y) are non-negative, and


the decreasing of these ones corresponds to the increasing of the efficiency
of the chosen decisions. Also, it is supposed that the characteristics wj (y),
1 ≤ j ≤ M , may be multiextremal, and the determining of their values may
require large enough amount of computations. Besides, the characteristics
wj (y), 1 ≤ j ≤ M , are supposed to satisfy the Lipschitz condition

$|w_j(y_1) - w_j(y_2)| \le L_j \|y_1 - y_2\|, \quad 1 \le j \le M \qquad (3)$

where $L_j$ is the Lipschitz constant of the characteristic $w_j(y)$, $1 \le j \le M$,
and $\|\cdot\|$ denotes the Euclidean norm in $R^N$.
2. Then, a MCO problem is formulated on the basis of the statement considered
above. For this purpose, a vector criterion of efficiency is selected among the
characteristics wj (y), 1 ≤ j ≤ M , from (1)
f (y) = (f1 (y), f2 (y), . . . , fs (y)), fj (y) = wij (y), 1 ≤ j ≤ s, 1 ≤ ij ≤ M (4)

and the vector function of constraints

g(y) = (g1 (y), g2 (y), . . . , gm (y)), gl (y) = wil (y) − ql , 1 ≤ l ≤ m, 1 ≤ il ≤ M (5)

where ql > 0, 1 ≤ l ≤ m, are the allowances on the feasible values of


characteristics wj (y), 1 ≤ j ≤ M .
The formulated criteria of efficiency and constraints allow to define a multi-
criteria optimization problem

P : f (y) → min, y ∈ Q (6)

where Q is the feasible search domain

Q = {y ∈ D : g(y) ≤ 0}. (7)

In development of this scheme of the MCO problem statement, the opportunity of
simultaneous formulation of several MCO problems

$\mathbb{P}_t = \{P_1, P_2, \ldots, P_t\}, \qquad (8)$

will further be allowed; this set can be varied in the course of computations
by adding new optimization problems or by removing already existing ones.
In general, the proposed scheme of the optimal decision search process (1)–(8)
defines a new class of the optimization problems – the multistage multicriteria
global optimization (MMGO) problems.
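To make the construction (1)–(7) concrete, the following minimal Python sketch
(illustrative only: the characteristics, the selected indices, and the allowance
value are hypothetical and not taken from the paper) assembles a vector
criterion f and a constraint vector g from a common vector of characteristics w(y):

import math

# Hypothetical characteristics w(y) = (w_1(y), ..., w_M(y)), all non-negative; M = 3.
def w(y):
    w1 = (y[0] - 0.3) ** 2 + (y[1] - 0.7) ** 2               # a deviation measure
    w2 = 1.0 + math.sin(3.0 * y[0]) * math.sin(3.0 * y[1])   # a multiextremal characteristic
    w3 = y[0] + y[1]                                         # a cost characteristic
    return [w1, w2, w3]

# Indices of characteristics chosen as criteria (4) and as constraints (5),
# with allowances q_l for the constraints; these choices are illustrative.
criteria_idx = [0, 1]
constraint_idx = [2]
q = [1.5]

def f(y):
    wy = w(y)
    return [wy[i] for i in criteria_idx]                     # vector criterion f(y), Eq. (4)

def g(y):
    wy = w(y)
    return [wy[i] - ql for i, ql in zip(constraint_idx, q)]  # constraints g(y) <= 0, Eq. (5)

def is_feasible(y):
    # y belongs to Q from (7) if all constraints are non-positive.
    return all(gl <= 0.0 for gl in g(y))

print(f([0.4, 0.6]), g([0.4, 0.6]), is_feasible([0.4, 0.6]))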

3 Reduction of the Multistage Multicriteria Search to the


Scalar One-Dimensional Global Optimization Problems
One of the widely used approaches to solving the MCO problems is to transform
the vector criterion into some general scalar criterion of efficiency that allows
using a large set of already existing optimization methods for solving the global
optimization problems. Among the possible scalarization methods, there are, for
example, the weighted sum method, the compromise programming method, the
weighted minimax method, and many other methods – see, for example, [2,3].
In the general form, the statement of the global optimization problems
generated as a result of the MCO problem criteria scalarization can be
represented as follows:
min ϕ(y) = F (α, y), g(y) ≤ 0, y ∈ D, (9)
where F is the scalar objective function, α is the vector of parameters of the
applied criteria scalarization method, g(y) are the constraints of the MCO prob-
lem from (6), and D is the search domain from (2).
Particular forms of the function F (α, y) are defined by the criteria scalar-
ization methods applied. For example, the following scalarization methods are
possible.
1. In the case of equal importance of the criteria fi , 1 ≤ i ≤ s, the minimax
convolution scheme (MMC) [3] can be applied:

$F^{1}(\lambda, y) = \max\,\{\lambda_i f_i(y),\ 1 \le i \le s\},$
$\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_s) \in \Lambda \subset R^{s}:\ \sum_{i=1}^{s} \lambda_i = 1,\ \lambda_i \ge 0,\ 1 \le i \le s. \qquad (10)$

2. In the case of arrangement of the criteria in importance order, the method


of successive concessions (MSC) [2,3] can be used. In this case, the scalar
objective function F can be set as follows
$\min F^{2}(\lambda, y) = f_s(y), \quad f_i(y) \le f_i^{\min} + \delta_i (f_i^{\max} - f_i^{\min}),\ 1 \le i < s, \quad g(y) \le 0,\ y \in D, \qquad (11)$

where $f_i^{\min}$ and $f_i^{\max}$ are the minimum and maximum values of the criteria
$f_i(y)$, $1 \le i < s$, in the domain D, respectively, and $0 \le \delta_i \le 1$,
$1 \le i < s$, are the concessions with respect to each criterion. As before, the
values of the concessions can be varied in the course of computations. The
quantities $f_i^{\min}$ and $f_i^{\max}$, $1 \le i < s$, the values of which may be
unknown a priori, can be replaced by the minimum and maximum estimates of the
criteria values computed using the available search information.
3. In the case of availability of any estimates of the criteria values of the
required decision (for example, based on an ideal decision or on any existing
prototype), the MCO problem solution may consist in finding an efficient
decision most completely matching the given values of optimality (the reference
point method, RPM). Such a problem can be formulated in the form of a scalar
optimization problem

$\min F^{3}(\lambda, y) = \frac{1}{s}\sum_{i=1}^{s} \theta_i \big(f_i(y) - f_i^{*}\big)^{2}, \quad g(y) \le 0,\ y \in D, \qquad (12)$

where the objective function $F^{3}(\lambda, y)$ is the standard deviation of the
decision $y \in D$ from the sought ideal decision, and the quantities
$0 \le \theta_i \le 1$, $1 \le i < s$, are the magnitudes of importance of the
approximation with respect to each particular criterion. An illustrative sketch
of the three convolutions (10)–(12) is given below.
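A minimal Python sketch of the three convolutions (10)–(12) (the criterion
values, weights, concessions, and reference values below are hypothetical; the
MSC scheme (11) is represented by its objective together with the concession
constraints it induces):

def mmc(f_vals, lam):
    """Minimax convolution (10): F1 = max_i lambda_i * f_i(y)."""
    return max(l * f for l, f in zip(lam, f_vals))

def msc(f_vals, f_min, f_max, delta):
    """Method of successive concessions (11): objective is f_s(y); the remaining
    criteria become constraints f_i(y) <= f_i_min + delta_i*(f_i_max - f_i_min)."""
    objective = f_vals[-1]
    concession_constraints = [
        f_vals[i] - (f_min[i] + delta[i] * (f_max[i] - f_min[i]))
        for i in range(len(f_vals) - 1)
    ]  # feasible when all entries are <= 0
    return objective, concession_constraints

def rpm(f_vals, f_ref, theta):
    """Reference point method (12): mean weighted squared deviation from the reference values."""
    s = len(f_vals)
    return sum(t * (f - fr) ** 2 for t, f, fr in zip(theta, f_vals, f_ref)) / s

# Hypothetical bi-criteria values at some point y.
f_vals = [-6.2, -3.1]
print(mmc(f_vals, lam=[0.5, 0.5]))
print(msc(f_vals, f_min=[-8.0, -7.0], f_max=[-1.0, -0.5], delta=[0.5]))
print(rpm(f_vals, f_ref=[-7.0, -4.0], theta=[0.5, 0.5]))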
Within the framework of the developed approach, it is possible to change the
scalarization methods (10)–(12) being used and/or to alter the convolution
parameters λ, δ and θ. Such variations expand the set of the MCO problems P
from (8) necessary for solving the initial decision making problem into a wider
set of scalar global optimization problems (9)

FT = {Fi (αi , y) : 1 ≤ i ≤ T }, (13)

in which each problem P ∈ P from (8) can correspond to several global opti-
mization problems (9).
In the developed approach, one more step of converting the problems being
solved F (λ, y) from (9) is applied, namely the dimensionality reduction is per-
formed with the use of the Peano space-filling curves (evolvents) y(x) providing
an unambiguous mapping of the interval [0, 1] onto an N -dimensional hypercube
D [11,12]. As a result of such reduction, the multidimensional global optimiza-
tion problem (9) is reduced to a one-dimensional problem

F (α, y(x∗ )) = min {F (α, y(x)) : g(y(x)) ≤ 0, x ∈ [0, 1]}. (14)

The dimensionality reduction allows applying many well known highly effi-
cient one-dimensional global optimization algorithms for solving the problems
(9) (after performing necessary generalization) – see, for example, [11–15].
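The paper relies on Peano-type evolvents y(x); as a rough stand-in for the
idea, the following Python sketch maps a scalar x in [0, 1] to a point of a
two-dimensional box D using a discrete Hilbert-curve index-to-coordinate
conversion (an assumption for illustration only, not the evolvent construction
of [11,12]):

def hilbert_d2xy(order, d):
    """Map an index d in [0, 4**order) to integer coordinates on a 2**order x 2**order Hilbert curve."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def evolvent(t, order, a, b):
    """Rough evolvent stand-in: map t in [0, 1] to a point of the box D = [a1, b1] x [a2, b2]."""
    n_cells = 1 << (2 * order)
    d = min(int(t * n_cells), n_cells - 1)
    ix, iy = hilbert_d2xy(order, d)
    side = 1 << order
    y1 = a[0] + (ix + 0.5) / side * (b[0] - a[0])
    y2 = a[1] + (iy + 0.5) / side * (b[1] - a[1])
    return y1, y2

print(evolvent(0.37, order=6, a=(0.0, 0.0), b=(1.0, 1.0)))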

4 Computational Complexity Reduction of the


Multistage Multicriteria Search on the Basis of the
Reuse of the Search Information
The numerical solving of the global optimization problems (9) is usually imple-
mented by the successive computing the values of characteristics w(y) at the
points y i , 0 ≤ i ≤ k, of the search domain D [11,14]. The data obtained as a
result of computations can be represented in the form of the matrix of the search
information
Ωk = {(y i , wi = w(y i ))T : 1 ≤ i ≤ k}. (15)

As a result of scalarization (9) and the use of the dimensionality reduction


(14), the set Ωk from (15) can be transformed into the form of the matrix of the
search state
$A_k = \{(x_i, z_i, g_i, l_i)^{T} : 1 \le i \le k\}, \qquad (16)$
where xi , 1 ≤ i ≤ k, are the reduced points of the executed global search
iterations, zi , gi , 1 ≤ i ≤ k, are the values of the scalar criterion and constraints
of current optimization problem F (α, y(x)) at these points, and li , 1 ≤ i ≤ k,
are the indices of global search iterations, in which the points xi , 1 ≤ i ≤ k, were
computed.
The availability of the set Ωk from (15) allows reducing the results of all
preceding computations zi , 1 ≤ i ≤ k from the matrix Ak to the values of
the next optimization problem F (α, y(x)) from (9) without any repeated time-
consuming computations of the values of w(y) from (1), i. e.
$w_i \xrightarrow{\ \alpha,\,P\ } (z_i, g_i), \quad 1 \le i \le k, \quad \forall \alpha,\ P \in \mathbb{P} \qquad (17)$
The reuse of the search information can provide a gradual decreasing of the
amount of computations when solving every next optimization problem down to
the execution of few iterations only to find the next efficient decision.
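A minimal sketch of the reuse mechanism (15)–(17) (Python; the stored data
layout, the placeholder characteristics, and the two scalarizations are
illustrative assumptions, not the authors' implementation):

search_info = []   # the set Omega_k: pairs (y_i, w_i = w(y_i)) accumulated so far

def expensive_w(y):
    # Placeholder for the time-consuming characteristics w(y): two criteria and one constraint value.
    return [(y - 0.2) ** 2, (y - 0.8) ** 2, y - 0.9]

def evaluate_and_store(y):
    w_val = expensive_w(y)          # the only time-consuming step
    search_info.append((y, w_val))
    return w_val

def rebuild_state(scalarize, constraints_of):
    """Recompute the search-state matrix A_k (16) for a new scalarization
    without re-evaluating w(y): w_i -> (z_i, g_i), Eq. (17)."""
    return [(y, scalarize(w_val), constraints_of(w_val)) for y, w_val in search_info]

for y in [0.1, 0.4, 0.55, 0.9]:
    evaluate_and_store(y)

# Switch from one convolution to another essentially for free, reusing stored w-values.
state_a = rebuild_state(lambda w: max(0.5 * w[0], 0.5 * w[1]), lambda w: [w[2]])
state_b = rebuild_state(lambda w: 0.5 * ((w[0] + 0.3) ** 2 + (w[1] + 0.1) ** 2), lambda w: [w[2]])
print(state_a[0], state_b[0])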

5 Efficient Solving the Multistage Multicriteria


Optimization Problems with Nonlinear Constraints
Within the framework of developed approach, the efficient method was applied
for solving the global optimization problems with the nonlinear constraints [11].
The essence of the approach is the constructing of a problem with some aggre-
gated unconstrained objective function, the solving of which leads to the solution
of the initial problem (9) – more detailed description of the approach will be given
below.
Let us introduce a simpler notation for the reduced problems (14) as
min {gm+1 (x) : gi (y(x)) ≤ 0, 1 ≤ i ≤ m, x ∈ [0, 1]}, gm+1 (x) = F (λ, y(x)). (18)
The problem (18) can be considered in the statement of partial computability
when each function gj (y), 1 ≤ j ≤ m + 1 is defined and computable in certain
subinterval Δj ⊂ [0, 1] only where
Δ1 = [0, 1], Δj+1 = {x ∈ Δj : gj (y(x)) ≤ 0}, 1 ≤ j ≤ m. (19)
Taking into account the conditions (19), the objective function of the problem
(18) can be represented in the form
ϕ(x∗ ) = min {gm+1 (y(x)) : x ∈ Δm+1 }, (20)
on the basis of which the problem (18) can be transformed into the problem of
minimizing the aggregated function
min {Φ(y(x)) = gv (y(x)), v = v(x), x ∈ [0, 1]},
(21)
1 ≤ v = v(x) ≤ m + 1, gv (y(x)) > 0, gj (y(x)) ≤ 0, 1 ≤ j ≤ v − 1.

The index v = v(x), 1 ≤ v ≤ m + 1 defines the first violated constraint at


the point x in successive calculation of constraints.
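The index scheme (18)–(21) can be sketched as follows (Python; the constraint
and objective functions below are placeholders, not the AGCO implementation):

def index_scheme_value(x, constraints, objective):
    """Return (v, Phi) for the aggregated problem (21): v is the index of the first
    violated constraint in successive evaluation, or m+1 if all constraints hold,
    in which case Phi equals the objective g_{m+1}(x)."""
    for v, g in enumerate(constraints, start=1):
        g_val = g(x)
        if g_val > 0.0:          # first violated constraint: stop, later functions are not evaluated
            return v, g_val
    return len(constraints) + 1, objective(x)

# Illustrative one-dimensional reduced problem with two constraints.
constraints = [lambda x: 0.3 - x, lambda x: x - 0.9]
objective = lambda x: (x - 0.5) ** 2

print(index_scheme_value(0.1, constraints, objective))   # first constraint violated
print(index_scheme_value(0.6, constraints, objective))   # feasible: returns m+1 and the objective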
Within the framework of the developed approach, the algorithm of global
constrained optimization (AGCO), which is considered in detail in [11], is applied
for solving the global unconstrained problems (21).

6 Results of Numerical Experiments


The numerical experiments have been carried out using the computational nodes
of Lobachevsky supercomputer at Nizhni Novgorod State University. The peak
performance of the supercomputer was 573 Tflops, each computational node was
equipped with Intel Sandy Bridge E5-2660 processor 2.2 GHz, 64 Gb RAM.
Within the framework of the present paper, the construction of a numerical
approximation (PDA) of the Pareto domain (PD) was understood as a solution
of a MCO problem. In order to evaluate the efficiency of constructing the PDA,
two main indicators applied widely were used: the completeness of coverage of the
Pareto domain (hypervolume index, HV) and the uniformity of distribution of
the numerical estimates of the efficient decisions (distribution uniformity index,
DU) [10]. The higher values of the index HV and the lower values of the index
DU correspond to a better approximation.
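In the bi-criteria case the HV indicator can be computed as the area dominated
by the current Pareto estimate with respect to the reference point; a minimal
sketch (Python, assuming minimization and that every point dominates the
reference point; the point set below is hypothetical):

def hypervolume_2d(points, ref):
    """Area dominated by a set of non-dominated bi-criteria points (minimization)
    with respect to the reference point ref; points must dominate ref componentwise."""
    pts = sorted(points)                      # ascending in f1, hence descending in f2
    hv = 0.0
    for (f1, f2), nxt in zip(pts, pts[1:] + [ref]):
        hv += (nxt[0] - f1) * (ref[1] - f2)   # strip between consecutive f1 values
    return hv

pareto_estimate = [(-13.5, -4.5), (-11.0, -6.0), (-8.0, -7.5), (-5.5, -8.2)]
print(hypervolume_2d(pareto_estimate, ref=(-4.0, -4.0)))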
Within the executed experiments, the two-dimensional bi-criterial MCO
problems were used, the criteria of which were defined with the use of the fam-
ily of multiextremal functions [11]. The functions of this family are defined as
follows
$f(y_1, y_2) = -(AB + CD)^{1/2},$
$AB = \Big(\sum_{i=1}^{7}\sum_{j=1}^{7}\big[A_{ij}\, a_{ij}(y_1, y_2) + B_{ij}\, b_{ij}(y_1, y_2)\big]\Big)^{2}, \qquad (22)$
$CD = \Big(\sum_{i=1}^{7}\sum_{j=1}^{7}\big[C_{ij}\, a_{ij}(y_1, y_2) - D_{ij}\, b_{ij}(y_1, y_2)\big]\Big)^{2},$

where the expressions

aij (y1 , y2 ) = sin(πiy1 )sin(πjy2 ), bij (y1 , y2 ) = cos(πiy1 )cos(πjy2 )

are defined in the area 0 ≤ y1 , y2 ≤ 1, and the parameters −1 ≤


Aij , Bij , Cij , Dij ≤ 1 are the independent random numbers distributed uniformly
in the same interval.
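A minimal sketch for generating and evaluating one random instance of the
family (22) (Python; the seeds and the use of two instances as the two criteria
are illustrative assumptions):

import math
import random

def make_test_function(seed=0):
    """Generate one random instance of the two-dimensional multiextremal family (22)."""
    rng = random.Random(seed)
    A = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(7)]
    B = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(7)]
    C = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(7)]
    D = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(7)]

    def f(y1, y2):
        ab = cd = 0.0
        for i in range(1, 8):
            for j in range(1, 8):
                a_ij = math.sin(math.pi * i * y1) * math.sin(math.pi * j * y2)
                b_ij = math.cos(math.pi * i * y1) * math.cos(math.pi * j * y2)
                ab += A[i - 1][j - 1] * a_ij + B[i - 1][j - 1] * b_ij
                cd += C[i - 1][j - 1] * a_ij - D[i - 1][j - 1] * b_ij
        return -math.sqrt(ab ** 2 + cd ** 2)   # -(AB + CD)^(1/2) with AB, CD the squared sums

    return f

f1 = make_test_function(seed=1)   # two such instances can serve as the two criteria
f2 = make_test_function(seed=2)
print(f1(0.3, 0.7), f2(0.3, 0.7))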
The initial series of the numerical experiments was performed to evaluate
the positive effect (the reducing of the number of the executed iterations) due to
the reuse of the search information obtained in the course of computations (see
Sect. 4). When conducting the experiments, the AGCO algorithm was applied
sequentially with all scalarization methods considered in Sect. 3: the minimax
convolution of the criteria (10), the method of successive concessions (11), and
the reference point method (12). For each method mentioned above, 50 sub-
problems with various values of the parameters λ, δ and θ correspondingly have
been solved. In order to evaluate the indicator HV of the completeness of the


PDA approximation, the reference point (−4, −4) was used. For the AGCO algo-
rithms, the reliability parameter r = 2.3 was used; the accuracy in the stopping
condition was set as ε = 0.01.
The results of performed experiments are presented in Table 1. The column
“Iters” contains the total number of iterations (the number of computations of
the criteria values) required for solving the problem (22). The column “PDA”
shows the number of the obtained efficient decisions in the PDA. In the first part
of the table (columns 2–5), the results of solving all subproblems without the use
of the search information are shown. In the second part (columns 6–9) the results
of solving the same subproblems but with the use of the whole search information
obtained in the course of computations (i. e. when solving every next subproblem,
the search information obtained when solving all preceding subproblems was
utilized) are presented. In the column “S”, the speedup obtained as a result of
reducing the number of the global search iterations (the number of the computed
criteria values) required to solve all 50 subproblems due to the reuse of the search
information is shown.
The results presented in Table 1 demonstrate that the reuse of the search
information reduces the number of executed iterations of the AGCO algorithm by
more than 9 times. At the same time, the indicator HV takes almost the same
values regardless of whether the search information is used. However, the
indicator DU was notably better (except for the method of successive
concessions) in the computations without the use of the search information (a
similar remark applies to the number of efficient decisions in the PDA as well)
– most likely, this effect is caused by the essentially smaller number of
iterations executed when the search information is reused. One can also note
that the reference point method has some advantage in the efficiency indicators
(the number of points in the PDA, and the indicators HV and DU). However, this
method is essentially inferior to the minimax convolution in the number of
iterations required to solve all 50 subproblems.

Table 1. Efficiency of the reuse of the search information in solving the problem (22)
using various criteria scalarization methods

Scalarization method | Search information not used | Search information used | S
                     | Iters   PDA   HV     DU     | Iters   PDA   HV    DU  |
MMC (10)             | 9124    152   53.4   0.64   | 922     44    52.7  0.91| 9.9
MSC (11)             | 12802   88    52.8   1.16   | 1402    40    52.4  1.09| 9.1
RPM (12)             | 12881   240   53.6   0.51   | 1323    56    53.0  0.83| 9.7

In the last experiment, solving a test MCO problem has been conducted with
a use of the search information but the applied criteria scalarization methods
varied in the course of computations. In this experiment, at the first stage of
computations, the minimax criteria convolution was used in solving three sub-
problems (10) with the convolution coefficients (1, 0), (0.5, 0.5), and (0, 1),
correspondingly. At the second stage, the scalarization method was changed to
the reference point method (12), where the estimate (−14, −7) was used as the
reference point with the weighting coefficients (0.5, 0.5). And finally, at the third
stage, the method of successive concessions (11) was applied with the concession
with respect to the first criterion δ = 0.5.

Table 2. Results of solving the problem (22) with altering the criteria scalarization
methods in the course of computations

Stage of computations, criteria scalarization method Total iters New iters PDA
1. MMC (10), three subproblems 304 304 14
2. RPM (12) 402 98 19
3. MSC (11) 437 35 24

The results of performed experiments are given in Table 2. The column “Total
iters” contains the total number of executed global search iterations whereas the
column “New iters” shows the points of iterations executed at particular stage
of solving the problem only. As follows from the results presented in Table 2,
the number of the global search iterations executed at separate stages of solving
the MCO is continuously problem reduced (from 304 down to 35), and the reuse
of the search information provides the computationally efficient opportunity for
dynamic variations of the criteria scalarization methods applied in the course of
the MCO problem solution.
The PDA approximations of Pareto set obtained at the sequentially executed
stages are shown in Fig. 1.

Fig. 1. Approximations of the Pareto set PDA obtained at the sequentially executed
stages of computations

7 Conclusion
An approach is proposed that transforms decision making problems into
multistage multicriteria time-consuming global optimization problems. A key
property of the proposed approach is that the optimization problem statements
and the applied criteria scalarization methods can be changed in the course of
computations. The computational complexity is reduced by means of the reuse of
the computed search information. The performed numerical experiments have
confirmed that the developed approach is promising.
In further investigations it is intended to execute the numerical experiments
on solving the problems for a larger number of efficiency criteria and for larger
dimensionality. Parallel computations can be considered as well.

References
1. Parnell, G.S., Driscoll, P.J., Henderson, D.L. (eds.): Decision Making in Systems
Engineering and Management. Wiley, New Jersey (2008)
2. Collette, Y., Siarry, P.: Multiobjective Optimization: Principles and Case Studies.
Decision Engineering. Springer (2011)
3. Pardalos, P.M., Žilinskas, A., Žilinskas, J.: Non-Convex Multi-Objective Optimiza-
tion. Springer (2017)
4. Hillermeier, C., Jahn, J.: Multiobjective optimization: survey of methods and
industrial applications. Surv. Math. Ind. 11, 1–42 (2005)
5. Modorskii, V.Y., Gaynutdinova, D.F., Gergel, V.P., Barkalov, K.A.: Optimization
in design of scientific products for purposes of cavitation problems. In: AIP Confer-
ence Proceedings, vol. 1738, p. 400013 (2016). https://doi.org/10.1063/1.4952201
6. Strongin, R.G., Gergel, V.P.: Parallel Computing for Globally Optimal Decision
Making. Lecture Notes in Computer Science, vol. 2763, pp. 76–88 (2003)
7. Gergel, V.P., Kozinov, E.A.: Accelerating multicriterial optimization by the inten-
sive exploitation of accumulated search data. In: AIP Conference Proceedings, vol.
1776, p. 090003 (2016). https://doi.org/10.1063/1.4965367
8. Gergel, V.: An unified approach to use of coprocessors of various types for solving
global optimization problems. In: 2nd International Conference on Mathematics
and Computers in Sciences and in Industry (2015). https://doi.org/10.1109/MCSI.
2015.18
9. Barkalov, K., Gergel, V., Lebedev, I.: Solving global optimization problems on GPU
cluster. In: AIP Conference Proceedings, vol. 1738, p. 400006 (2016). https://doi.
org/10.1063/1.4952194
10. Gergel, V.P., Kozinov, E.A.: Efficient multicriterial optimization based on intensive
reuse of search information. J. Glob. Optim. 71(1), 73–90 (2018). https://doi.org/
10.1007/s10898-018-0624-3
11. Strongin, R., Sergeyev, Y.: Global Optimization with Non-Convex Constraints.
Sequential and Parallel Algorithms. Kluwer Academic Publishers, Dordrecht (2nd
edn 2013, 3rd edn 2014)
12. Sergeyev, Y.D., Strongin, R.G., Lera, D.: Introduction to Global Optimization
Exploiting Space-Filling Curves. Springer (2013)

13. Zhigljavsky, A., Žilinskas, A.: Stochastic Global Optimization. Springer, Berlin
(2008)
14. Locatelli, M., Schoen, F.: Global Optimization: Theory, Algorithms, and Applica-
tions. SIAM (2013)
15. Floudas, C.A., Pardalos, M.P.: Recent Advances in Global Optimization. Princeton
University Press (2016)
Necessary Optimality Condition
for Nonlinear Interval Vector
Programming Problem Under B-Arcwise
Connected Functions

Mohan Bir Subba(B) and Vinay Singh

National Institute of Technology Mizoram, Chaltlang, Aizawl 796012, Mizoram, India


mohanb.subba@gmail.com, vinaybhu1981@gmail.com

Abstract. We consider the nonlinear interval vector programming problem
(NIVP) for solving uncertain programming problems and introduce the classes
of B-arcwise connected interval-valued functions (BCIFs) and strictly BCIFs
(SBCIFs) by generalizing the notion of arcwise connected interval-valued
functions. An arcwise connected function is a generalization of a convex
function defined on an arcwise connected set [2]. The differentiability of
such functions is studied by introducing the right generalized Hukuhara
derivative (gH-derivative). Extremum conditions for these functions under the
right gH-derivative are derived. This is a new type of NIVP with right
gH-differentiable BCIFs in both the multiple objective and the constraints.
Fritz-John kind and Karush-Kuhn-Tucker kind necessary weakly LU-efficiency
conditions for NIVP are obtained with right gH-differentiable BCIFs in both
the multiple objective functions and the constraint functions.

Keywords: Interval-valued function · Arcwise connected · Optimality


condition · Interval optimization · Hukuhara difference

1 Introduction

In optimization, convexity plays an important role in finding global solutions
[4], since every local solution of a convex programming problem is a global
solution. However, convex programming problems suffer from limited
applicability to practical problems. The generalization of convex optimization
leads to nonconvex optimization problems, which are the main objective of the
current work.
In particular, the concept of arcwise connectivity, which is a special case of
generalized convexity, makes it possible to derive global optimality results for
some nonconvex optimization. Arcwise connected set and function was intro-
duced by Avriel and Zang [2] with the investigation of the extremum properties
of the programming problems under such functions. (The research of Mohan Bir
Subba is supported by the Council of Scientific and Industrial Research (CSIR),
New Delhi, India, through File No. 09/1191(0001)/2017-EMR-I.) While Singh [13]
considered both differentiable and non-differentiable functions for
investigating the basic properties
of arcwise connected set and functions. Bhatia and Mehra [3] derived the opti-
mality conditions of nonlinear programming problems under arcwise connected
functions with the help of directional derivatives while Davar and Mehra [6] con-
sidered the fractional optimization problems and establish optimality conditions
and duality results. B-arcwise connected function was developed and studied by
Zhang [20] giving optimality conditions and the duality results of the nonlinear
semi-infinite programming problems. Optimization problems also have to deal
with uncertainty problems. In order to solve this, different methods viz. interval
numbers, fuzzy numbers and stochastic process have been developed. In recent
years, many researchers have contributed to the area of interval optimization
([5,8]). Hukuhara derivatives was introduced by Hukuhara [7] for solving interval
optimization which was further generalized by Stefanini and Bede [14] to gen-
eralized Hukuhara derivative (gH-derivative). Gómez et al. [10] considered the
generalized differentiable interval-valued functions and obtained its optimality
conditions. Interval optimization problems involving single and multiobjective
functions were rigorously studied by Wu ([17–19]) resulting in the derivation of
the KKT-optimality conditions and Wolfe duality. Wang and Zhang [16] derived
the same by introducing arcwise connected interval-valued function. Interval-
valued invex function was introduced by Li et al. [9] to derive sufficient opti-
mality condition using gH-differentiability. Recently, a new type of efficiency
conditions were introduced by Osuna-Gómez et al. [11] for interval optimiza-
tion problems considering generalized smooth multiobjective convex functions
with gH-derivative. Antczak [1] obtained the Mond-Weir duality results and the
conditions which are necessary and sufficient for weakly LU -efficient solutions
to be optimal in nondifferentiable multiobjective programming problems with
multiple interval-valued convex functions.
This paper deals with a new type of nonlinear interval vector programming
problem (NIVP) where both objective and constraints are B-arcwise connected
interval-valued functions (BCIFs). Current work is designed in the following
fashion. Section 2 is dedicated to the introduction of BCIFs, right gH-derivatives
of BCIFs (gHBCIFs) and its properties for deriving a global optimality criterion.
While in Sect. 3, Fritz-John kind and KKT-kind necessary weakly LU -efficiency
conditions for NIVP with gHBCIF in both multiple objective and constraints
have been presented. The conclusion and further outlook of the presented paper
are discussed in Sect. 4.

2 Preliminaries
Throughout the paper, $I_c$ denotes the set of closed intervals in IR; any
$C \in I_c$ is written as $C = [\underline{c}, \overline{c}]$ with
$\underline{c} \le \overline{c}$, where $\underline{c}$ is the lower bound and
$\overline{c}$ is the upper bound of C. For any two closed intervals
$C, D \in I_c$ and a real number $\mu \in$ IR, we have by definition
$C + D = [\underline{c} + \underline{d},\ \overline{c} + \overline{d}] \in I_c \quad\text{and}\quad \mu C = \mu[\underline{c}, \overline{c}] = \begin{cases} [\mu\underline{c}, \mu\overline{c}], & \text{if } \mu \ge 0,\\ [\mu\overline{c}, \mu\underline{c}], & \text{if } \mu < 0. \end{cases} \qquad (1)$

From these two equations, we define the difference between $C, D \in I_c$,
called the Minkowski difference, as follows:

$C - D = C + (-D) = [\underline{c}, \overline{c}] + [-\overline{d}, -\underline{d}] = [\underline{c} - \overline{d},\ \overline{c} - \underline{d}]. \qquad (2)$

The set $I_c$ does not form a linear space under the defined binary operations
since the inverse element does not exist ([5,14]). To address this problem,
Hukuhara [7] introduced the Hukuhara difference, but it does not solve the
problem completely. Thus Stefanini and Bede [14] introduced the generalized
Hukuhara difference (gH-difference) between any two closed intervals
$C, D \in I_c$, denoted by $\ominus_{gH}$ and defined by

$C \ominus_{gH} D = A \iff \begin{cases} C = D + A, & \text{or}\\ D = C + (-A), \end{cases} \qquad (3)$

where $A \in I_c$. Moreover, $C \ominus_{gH} D = \big[\min\{\underline{c}-\underline{d},\ \overline{c}-\overline{d}\},\ \max\{\underline{c}-\underline{d},\ \overline{c}-\overline{d}\}\big]$ always exists.
Let $F : X \to I_c$ be a function defined on a non-empty set $X \subseteq$ IR$^n$
such that $F(x) = [\underline{f}(x), \overline{f}(x)]$; then F is called an
interval-valued function, where $\underline{f}, \overline{f} : X \to$ IR are
called the lower and upper real valued functions of F, respectively, such that
$\underline{f}(x) \le \overline{f}(x)$, $\forall x \in X$.
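A minimal Python sketch of the interval operations (1)–(3) and of an
interval-valued function (the class design and the example function F are
illustrative assumptions):

class Interval:
    """Closed interval [lo, hi] with the operations (1)-(3)."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, mu):
        return Interval(mu * self.lo, mu * self.hi) if mu >= 0 else Interval(mu * self.hi, mu * self.lo)

    def minkowski_sub(self, other):                       # Eq. (2)
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def gh_sub(self, other):                              # Eq. (3), always exists
        d1, d2 = self.lo - other.lo, self.hi - other.hi
        return Interval(min(d1, d2), max(d1, d2))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# Interval-valued function F(x) = [f_lower(x), f_upper(x)]; a hypothetical example.
def F(x):
    return Interval(x * x, x * x + 1.0)

C, D = Interval(1.0, 3.0), Interval(0.5, 2.0)
print(C + D, C.scale(-2.0), C.minkowski_sub(D), C.gh_sub(D), F(1.5))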
Definition 1. [2] The set X ⊂ IRn is called an arcwise connected (AC) set if,
for every x, y ∈ X there exists a continuous vector valued function Hx,y ⊂ X
called an arc connecting x and y, that is defined on the closed interval [0, 1] ⊂ IR
such that, Hx,y (0) = x and Hx,y (1) = y.
In [2] it is shown that every convex set in IRn is arcwise connected set in IRn . Thus
AC set is a generalized convex set. Example of B-arcwise connected functions is
discussed by Zhang [20].
Definition 2. [2] Let f : X → IR be a real valued function defined on AC
set X ⊂ IRn . Then f is called an arcwise connected function (strictly arcwise
connected function) if,
f (Hx,y (ψ)) ≤ (<)(1 − ψ)f (x) + ψf (y), ∀ x, y ∈ X (4)
where Hx,y is arc in X and 0 ≤ ψ ≤ 1(0 < ψ < 1).
Definition 3. [20] Let X ⊂ IRn be an AC set and f : X → IR is a real valued
function defined on X. Then f is called B-arcwise connected function (strictly
B-arcwise connected function) if for all x, y ∈ X(x = y), there exists a real
valued function b(x, y, ψ) : X × X × [0, 1] → IR and an arc Hx,y such that
∀ 0 ≤ ψ ≤ 1, 0 ≤ b ≤ 1(∀ 0 < ψ < 1, 0 < b < 1) satisfying,
f (Hx ,y (ψ)) ≤ (<)(1 − ψb(x, y, ψ))f (x) + ψb(x, y, ψ)f (y) (5)

Definition 4. [16] Let $F : X \to I_c$ be an interval-valued function defined on
an AC set $X \subset$ IR$^n$. Then F is called an AC interval-valued function if

$F(H_{x,y}(\psi)) \le^{U}_{L} (1 - \psi)F(x) + \psi F(y), \quad \forall x, y \in X \text{ and } 0 \le \psi \le 1, \qquad (6)$

where $H_{x,y}$ is an arc in X.

Definition 5. Let $X \subset$ IR$^n$ be a non-empty open AC set and $F : X \to I_c$
be an interval-valued function defined on X. Then F is said to be a B-arcwise
connected interval-valued function (BCIF) (strictly B-arcwise connected
interval-valued function (SBCIF)) if, for all $x, y \in X$, there exist a real
valued function $b(x, y, \psi) : X \times X \times [0, 1] \to$ IR and an arc
$H_{x,y}$ such that, for all $0 \le \psi \le 1$, $0 \le b \le 1$
(for all $0 < \psi < 1$, $0 < b < 1$),

$F(H_{x,y}(\psi)) \le^{U}_{L} (<^{U}_{L})\ (1 - \psi b(x, y, \psi))F(x) + \psi b(x, y, \psi)F(y). \qquad (7)$
Definition 6. Consider an interval-valued function $F : X \to I_c$ defined on a
non-empty open AC set $X \subset$ IR$^n$. Let $H_{x,y} \subset X$ be an arc for
any $x, y \in X$; then F is said to have a right generalized Hukuhara derivative
(gH-derivative) with respect to $H_{x,y}$ at $\psi = 0$ if the limit

$F^{+}(H_{x,y}(0)) = \lim_{\psi \to 0^{+}} \frac{F(H_{x,y}(\psi)) \ominus_{gH} F(x)}{\psi} \qquad (8)$

exists and $F^{+}(H_{x,y}(0)) \in I_c$. If $H^{+}_{x,y}(0) = \lim_{\psi \to 0^{+}} \frac{H_{x,y}(\psi) - x}{\psi}$
exists, then the vector $H^{+}_{x,y}(0)$ is called the directional derivative of
$H_{x,y}$ at $\psi = 0$.
If F is right gH-differentiable with respect to $H_{x,y}$ at $\psi = 0$, then we get

$F^{+}(H_{x,y}(0)) = H^{+}_{x,y}(0)\,\nabla F(x)^{T} \qquad (9)$

whenever $F^{+}(H_{x,y}(0))$ and $H^{+}_{x,y}(0)$ exist at $\psi = 0$.
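The right gH-derivative (8) can also be estimated numerically; a rough sketch
(Python; the interval representation as (lo, hi) tuples, the function F, the
straight-line arc, and the step size are illustrative assumptions):

def gh_sub(c, d):
    """gH-difference of intervals represented as (lo, hi) tuples."""
    d1, d2 = c[0] - d[0], c[1] - d[1]
    return (min(d1, d2), max(d1, d2))

def right_gh_derivative(F, arc, psi=1e-6):
    """Finite-difference estimate of F^+(H_{x,y}(0)) = lim_{psi->0+} (F(H(psi)) gH-minus F(H(0))) / psi."""
    num = gh_sub(F(arc(psi)), F(arc(0.0)))
    return (num[0] / psi, num[1] / psi)

# Hypothetical interval-valued function and a straight-line arc from x = 0.2 to y = 0.8.
F = lambda x: (x * x, x * x + x)
arc = lambda psi: 0.2 + psi * (0.8 - 0.2)

print(right_gh_derivative(F, arc))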
Definition 7. Let $F : X \to I_c$ be an interval-valued function defined on a
non-empty open AC set $X \subset$ IR$^n$. Then F is called a right gH-derivative
B-arcwise connected interval-valued function (gHDBCIF) (strictly gHDBCIF
(SgHDBCIF)) if, for all $x, y \in X$ ($x \ne y$), there exist a real valued
function $b(x, y, \psi)$ and an arc $H_{x,y} \subset X$ such that F possesses the
right gH-derivative with respect to the arc $H_{x,y}$ at $\psi = 0$ and
$b^{+}(x, y) = \lim_{\psi \to 0^{+}} b(x, y, \psi)$ exists.
Theorem 1. If $F : X \to I_c$ is gHDBCIF defined on a non-empty open AC set
$X \subset$ IR$^n$, then for all $0 < \psi < 1$ and
$b^{+}(x, y) = \lim_{\psi \to 0^{+}} b(x, y, \psi)$ it follows that

$F^{+}(H_{x,y}(0)) \le^{U}_{L} b^{+}(x, y)\,[F(y) \ominus_{gH} F(x)]. \qquad (10)$

Proof. Since F is gHDBCIF, for all $x, y \in X$ there exist a real valued
function $b(x, y, \psi)$ and an arc $H_{x,y}$ such that

$F(H_{x,y}(\psi)) \le^{U}_{L} (1 - \psi b(x, y, \psi))F(x) + \psi b(x, y, \psi)F(y), \quad \forall\, 0 < \psi < 1$
$\Rightarrow \ \frac{F(H_{x,y}(\psi)) \ominus_{gH} F(x)}{\psi} \le^{U}_{L} b(x, y, \psi)\,[F(y) \ominus_{gH} F(x)]$
$\Rightarrow \ F^{+}(H_{x,y}(0)) \le^{U}_{L} b^{+}(x, y)\,[F(y) \ominus_{gH} F(x)]$

by letting $\psi \to 0^{+}$.


Theorem 2 (Global Optimality Criterion). Consider an interval-valued function
$F : X \to I_c$ which is gHBCIF on a non-empty open AC set $X \subset$ IR$^n$ and
$b^{+}(\hat{x}, y) = \lim_{\psi \to 0^{+}} b(\hat{x}, y, \psi) > 0$ for
$\hat{x}, y \in X$. If $\hat{x}$ is a point such that $\nabla F(\hat{x}) = [0, 0]$,
then F has a global minimum point at $\hat{x} \in X$.

Proof. Since F is gHBCIF and $\nabla F(\hat{x}) = [0, 0]$, there exist a real
valued function $b(\hat{x}, y, \psi)$ and an arc $H_{\hat{x},y}$ for every
$y \in X$ such that

$b^{+}(\hat{x}, y)\,[F(y) \ominus_{gH} F(\hat{x})] \ge^{U}_{L} F^{+}(H_{\hat{x},y}(0)) = H^{+}_{\hat{x},y}(0)\,\nabla F(\hat{x})^{T} = [0, 0]$
$\Rightarrow \ b^{+}(\hat{x}, y)\,F(y) \ge^{U}_{L} b^{+}(\hat{x}, y)\,F(\hat{x})$
$\Rightarrow \ F(y) \ge^{U}_{L} F(\hat{x}).$

Since $b^{+}(\hat{x}, y) > 0$, F thus has a global minimum at $\hat{x} \in X$.

Theorem 3. Consider an interval-valued function $F : X \to I_c$ which is SgHBCIF
on a non-empty open AC set $X \subset$ IR$^n$ and
$b^{+}(\hat{x}, y) = \lim_{\psi \to 0^{+}} b(\hat{x}, y, \psi) > 0$ for
$\hat{x}, y \in X$. If $\hat{x}$ is a point such that $\nabla F(\hat{x}) = [0, 0]$,
then F has a unique global minimum point at $\hat{x} \in X$.

Proof. Following the proof of Theorem 2, we can conclude that F has a global
minimum point at $\hat{x} \in X$.
For the proof of uniqueness, we argue by contradiction. If possible, let F have
a global minimum point at $x^{1} \in X$ such that $F(\hat{x}) = F(x^{1})$ for
$\hat{x} \ne x^{1}$. Since F is SBCIF and $\hat{x}, x^{1} \in X$, there exist a
real valued function $b(\hat{x}, x^{1}, \psi)$ and an arc $H_{\hat{x},x^{1}}$
such that

$F(H_{\hat{x},x^{1}}(\psi)) <^{U}_{L} (1 - \psi b(\hat{x}, x^{1}, \psi))F(\hat{x}) + \psi b(\hat{x}, x^{1}, \psi)F(x^{1}) = F(\hat{x})$
$\Rightarrow \ F(H_{\hat{x},x^{1}}(\psi)) <^{U}_{L} F(\hat{x})$

for $0 < \psi < 1$, which contradicts the fact that F has a global minimum point
at $\hat{x} \in X$. Thus $\hat{x}$ is the unique global minimum point of F.

3 Optimality Conditions
In this paper, we consider the following nonlinear interval vector programming
problem (NIVP) with a multiobjective interval-valued objective function:

(NIVP)  $\min_{x \in X} F(x) = (F_1(x), F_2(x), \ldots, F_m(x))$
        subject to $G_j(x) \le^{U}_{L} [0, 0], \quad j = 1, 2, \ldots, p,$

where each $F_i, G_j : X \to I_c$ ($i \in M = \{1, 2, \ldots, m\}$,
$j \in P = \{1, 2, \ldots, p\}$) is an interval-valued function defined on a
non-empty open AC subset $X \subset$ IR$^n$. Moreover, $F_i(x), G_j(x)$
($i \in M$, $j \in P$) possess the right gH-derivative with respect to the same
arc $H_{x,y}$ at $\psi = 0$ for every $x, y \in X$. The set

$S = \{x \in X : G_j(x) \le^{U}_{L} [0, 0],\ j \in P\}$

denotes the set of feasible solutions of NIVP, and for a feasible point
$\hat{x} \in S$ we define the set of active constraint indices
$I(\hat{x}) = \{j \in P : G_j(\hat{x}) = [0, 0]\}$ and
$J(\hat{x}) = \{j \in P : G_j(\hat{x}) <^{U}_{L} [0, 0]\}$.

Definition 8. [19] Let $F(x) = (F_1(x), \ldots, F_m(x))$ be a multiobjective
interval-valued function defined on a non-empty open AC set $X \subset$ IR$^n$
such that $F_i : X \to I_c$. Then $\hat{x} \in S$ is called
(i) a weakly LU-efficient solution of NIVP if and only if there exists no
feasible point $x \in S$ such that $F_i(x) <^{U}_{L} F_i(\hat{x})$ for every $i \in M$;
(ii) an LU-efficient solution of NIVP if and only if there exists no feasible
point $x \in S$ such that $F_i(x) \le^{U}_{L} F_i(\hat{x})$ for all $i \in M$ and
$F_i(x) <^{U}_{L} F_i(\hat{x})$ for at least one $i \in M$.
L Fi (

Lemma 1. Let $X \subset$ IR$^n$ be a non-empty open AC set and let
$G_j : X \to I_c$, $j \in P$, be BCIF with respect to $x \in X$. Then exactly one
of the following systems is solvable:
(i) there exists $x \in X$ such that $G_j(x) <^{U}_{L} [0, 0]$, $\forall j \in P$;
(ii) there exist $\mu_j = (\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ with
$(\underline{\mu}_j, \overline{\mu}_j) \ge (0, 0)$, $\forall j \in P$, such that

$\sum_{j=1}^{p} \mu_j\, G_j(x) \ge^{U}_{L} [0, 0], \quad\text{i.e.,}\quad \Big(\sum_{j=1}^{p} \underline{\mu}_j\, \underline{g}_j(x),\ \sum_{j=1}^{p} \overline{\mu}_j\, \overline{g}_j(x)\Big) \ge^{U}_{L} [0, 0], \quad \forall x \in X.$

Proof. Let us consider that (i) has a solution. Since $G_j : X \to I_c$,
$j \in P$, is BCIF, for every $x, y \in X$ there exist a real valued function
$b(x, y, \psi)$ and an arc $H_{x,y}(\psi)$ such that

$G_j(H_{x,y}(\psi)) \le^{U}_{L} (1 - \psi b(x, y, \psi))\,G_j(x) + \psi b(x, y, \psi)\,G_j(y) \qquad (11)$

for $0 \le \psi \le 1$ and $H_{x,y}(\psi) \subset X$. Now, for any
$(0, 0) \le \mu_j \in$ IR$^2$, Eq. (11) gives

$\sum_{j=1}^{p} \mu_j\, G_j(H_{x,y}(\psi)) <^{U}_{L} [0, 0], \qquad (12)$

since $G_j(x) <^{U}_{L} [0, 0]$, $\forall x \in X$. Now, if possible, suppose that
(ii) also has a solution; then there exist $(0, 0) \le \mu_j \in$ IR$^2$ such that
$\sum_{j=1}^{p} \mu_j\, G_j(H_{x,y}(\psi)) \ge^{U}_{L} [0, 0]$ for an arc
$H_{x,y}(\psi) \subset X$, which contradicts Eq. (12). Hence (ii) has no solution.
Similarly, if we consider that (ii) has a solution, then following the same
procedure we arrive at the conclusion that (i) cannot have a solution. Thus
there exists no $x \in X$ which solves both (i) and (ii).

Theorem 4 (Fritz-John Kind Necessary Optimality Condition). Let X be a
non-empty AC set and let $\hat{x} \in S$ be a weakly LU-efficient solution of
NIVP. Suppose that $G_j(x)$ ($j \in J(\hat{x})$) are continuous at
$\hat{x} \in X$ and that $F_i^{+}(H_{x,\hat{x}}(0))$ ($i \in M$),
$G_j^{+}(H_{x,\hat{x}}(0))$ ($j \in I(\hat{x})$) are gHDBCIF of $x \in X$ with
respect to the same arc $H_{x,\hat{x}}$ at $\psi = 0$. Then there exist
$\mu_i^{*} = (\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \in$ IR$^2$ and
$\mu_j = (\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ such that

$\sum_{i=1}^{m} \mu_i^{*}\, F_i^{+}(H_{x,\hat{x}}(0)) + \sum_{j=1}^{p} \mu_j\, G_j^{+}(H_{x,\hat{x}}(0)) \ge^{U}_{L} [0, 0], \quad \forall x \in X, \qquad (13)$

$\sum_{j=1}^{p} \mu_j\, G_j(\hat{x}) = [0, 0], \qquad (14)$

$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}, \underline{\mu}_j, \overline{\mu}_j) \ge (0, 0, 0, 0). \qquad (15)$

Proof. Since $\hat{x} \in S$ is a weakly LU-efficient solution of NIVP, there
exists no $x \in S = \{x \in X : G_j(x) \le^{U}_{L} [0, 0],\ j \in P\}$ such that
$F_i(x) <^{U}_{L} F_i(\hat{x})$, $\forall i \in M$, i.e., there exists no
$x \in X$ which satisfies the following system:

$F_i^{+}(H_{x,\hat{x}}(0)) <^{U}_{L} [0, 0],\ i \in M, \quad\text{and}\quad G_j^{+}(H_{x,\hat{x}}(0)) <^{U}_{L} [0, 0],\ j \in P. \qquad (16)$

Let us consider that there exists a solution $x \in X$ satisfying (16). Given
that $F_i(x)$ ($i \in M$) and $G_j(x)$ ($j \in I(\hat{x})$) are gHDBCIF at
$\hat{x}$ with respect to the same arc $H_{x,\hat{x}}$ at $\psi = 0$, from
Definition 6 we have

$F_i(H_{x,\hat{x}}(\psi)) \ominus_{gH} F_i(\hat{x}) = \psi F_i^{+}(H_{x,\hat{x}}(0)) + \psi\,[\underline{\beta}_i(\psi), \overline{\beta}_i(\psi)],\ i \in M, \qquad (17)$

$G_j(H_{x,\hat{x}}(\psi)) \ominus_{gH} G_j(\hat{x}) = \psi G_j^{+}(H_{x,\hat{x}}(0)) + \psi\,[\underline{\beta}_j(\psi), \overline{\beta}_j(\psi)],\ j \in I(\hat{x}), \qquad (18)$

where

$\lim_{\psi \to 0^{+}} [\underline{\beta}_i(\psi), \overline{\beta}_i(\psi)] = [0, 0], \quad \lim_{\psi \to 0^{+}} [\underline{\beta}_j(\psi), \overline{\beta}_j(\psi)] = [0, 0]. \qquad (19)$

For sufficiently small $\psi$, $0 < \psi < \tilde{\psi} < 1$, from Eqs. (16)–(19)
we have

$F_i(H_{x,\hat{x}}(\psi)) <^{U}_{L} F_i(\hat{x}),\ i \in M, \quad\text{and}\quad G_j(H_{x,\hat{x}}(\psi)) <^{U}_{L} G_j(\hat{x}),\ j \in I(\hat{x}). \qquad (20)$

Since the arc function $H_{x,\hat{x}}(\psi)$ is continuous in $\psi$ and
$G_j(H_{x,\hat{x}}(\psi))$, $j \in J(\hat{x})$, are continuous at $\hat{x}$,

$\lim_{\psi \to 0^{+}} G_j(H_{x,\hat{x}}(\psi)) = G_j(\hat{x}) <^{U}_{L} [0, 0],\ j \in J(\hat{x}).$

Therefore, there exist $\psi_i$, $0 < \psi_i < 1$, such that

$G_i(H_{x,\hat{x}}(\psi)) < 0, \quad \forall i \in J(\hat{x}) \text{ and } 0 < \psi < \psi_i. \qquad (21)$

Considering $\hat{\psi} = \min(\tilde{\psi}, \psi_i)$, from Eqs. (20) and (21)
we obtain $H_{x,\hat{x}}(\psi) \in S \subset X$ for $0 < \psi < \hat{\psi}$,
which implies $F_i(H_{x,\hat{x}}(\psi)) < F_i(\hat{x})$; this is a contradiction
since $\hat{x}$ is a weakly LU-efficient solution of NIVP. Thus the system (16)
has no solution $x \in X$.
Since $F_i^{+}(H_{x,\hat{x}}(0))$, $i \in M$, and $G_j^{+}(H_{x,\hat{x}}(0))$,
$j \in P$, are BCIFs of x, from Lemma 1 there exist
$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \in$ IR$^2$ and
$(\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ such that Eqs. (13)–(15) are
satisfied at $\hat{x}$; hence the proof of the theorem.


Theorem 5 (Fritz-John Kind Necessary Optimality Condition). Let X be a
non-empty open AC set and let $\hat{x} \in S$ be a weakly LU-efficient solution
of NIVP. If $F_i(x)$ ($i \in M$) and $G_j(x)$ ($j \in P$) are gHDBCIF of x with
respect to the same arc $H_{x,\hat{x}}$ and real valued function
$b : X \times X \times [0, 1] \to$ IR, then there exist
$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}), (\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$
such that Eqs. (13)–(15) are satisfied at $\hat{x}$.

Proof. Since $\hat{x} \in S$ is a weakly LU-efficient solution of NIVP, there
exists no $x \in S = \{x \in X : G_j(x) \le^{U}_{L} [0, 0],\ j \in P\}$ such that
$F_i(x) <^{U}_{L} F_i(\hat{x})$ ($i \in M$), i.e., there exists no $x \in X$ which
satisfies the following system:

$F_i(x) \ominus_{gH} F_i(\hat{x}) <^{U}_{L} [0, 0],\ i \in M, \quad\text{and}\quad G_j(x) <^{U}_{L} [0, 0],\ j \in P. \qquad (22)$

Consider $\tilde{F}_i(x) = F_i(x) \ominus_{gH} F_i(\hat{x})$; we claim that
$\tilde{F}_i(x)$ is a BCIF of x. Since each $F_i(x)$ is gHDBCIF of x, for every
$x^{1}, x^{2} \in X$ there exist a real valued function
$b(x^{1}, x^{2}, \psi) : X \times X \times [0, 1] \to$ IR and an arc
$H_{x^{1},x^{2}}(\psi)$ such that

$\tilde{F}_i(H_{x^{1},x^{2}}(\psi)) = F_i(H_{x^{1},x^{2}}(\psi)) \ominus_{gH} F_i(\hat{x})$
$\quad \le^{U}_{L} \big[(1 - \psi b(x^{1}, x^{2}, \psi))F_i(x^{1}) + \psi b(x^{1}, x^{2}, \psi)F_i(x^{2})\big] \ominus_{gH} F_i(\hat{x})$
$\quad = (1 - \psi b(x^{1}, x^{2}, \psi))\big[F_i(x^{1}) \ominus_{gH} F_i(\hat{x})\big] + \psi b(x^{1}, x^{2}, \psi)\big[F_i(x^{2}) \ominus_{gH} F_i(\hat{x})\big].$

Thus $\tilde{F}_i(x)$ is BCIF, and $G_j(x)$, $j \in P$, are also BCIFs. Hence,
from Lemma 1, there exist $(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \in$ IR$^2$
and $(\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ with
$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}, \underline{\mu}_j, \overline{\mu}_j) \ge (0, 0, 0, 0)$
such that

$\sum_{i=1}^{m} \mu_i^{*}\big(F_i(x) \ominus_{gH} F_i(\hat{x})\big) + \sum_{j=1}^{p} \mu_j\, G_j(x) \ge^{U}_{L} [0, 0], \quad \forall i \in M \text{ and } j \in P. \qquad (23)$

Putting $x = \hat{x}$ in Eq. (23), we get

$\sum_{j=1}^{p} \mu_j\, G_j(\hat{x}) \ge^{U}_{L} [0, 0], \quad \forall j \in P. \qquad (24)$

Moreover, $\hat{x} \in S$ is a feasible solution of NIVP; thus for all
$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \in$ IR$^2$ and
$(\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$, together with Eq. (24), we get

$\sum_{j=1}^{p} \mu_j\, G_j(\hat{x}) = [0, 0], \quad \forall j \in P. \qquad (25)$

Further, for every $x \in S$ and $0 < \psi < 1$ there exists an arc
$H_{x,\hat{x}}(\psi)$ in X. Now, from Eqs. (23) and (25), substituting
$x = H_{x,\hat{x}}(\psi)$, we obtain

$\sum_{i=1}^{m} \mu_i^{*}\big(F_i(H_{x,\hat{x}}(\psi)) \ominus_{gH} F_i(\hat{x})\big) + \sum_{j=1}^{p} \mu_j\big(G_j(H_{x,\hat{x}}(\psi)) \ominus_{gH} G_j(\hat{x})\big) \ge^{U}_{L} [0, 0]$

for all $i \in M$ and $j \in P$. Dividing the above inequality by $\psi > 0$ and
taking the limit $\psi \to 0^{+}$, we get Eq. (13). This completes the proof of
the theorem.

Theorem 6 (Karush-Kuhn-Tucker Kind Necessary Optimality Condition). Let X be a
non-empty open AC set and let $\hat{x} \in S$ be a weakly LU-efficient solution
of NIVP such that $F_i(x)$ ($i \in M$) and $G_j(x)$ ($j \in P$) are gHDBCIFs of
x with respect to the same arc $H_{x,\hat{x}}$ and real valued function
$b : X \times X \times [0, 1] \to$ IR with
$b^{+}(\hat{x}, x) = \lim_{\psi \to 0^{+}} b(\hat{x}, x, \psi) > 0$. If
$x^{*} \in S$ is such that $G_j(x^{*}) <^{U}_{L} [0, 0]$ ($j \in P$), then there
exist $(0, 0) < (\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \in$ IR$^2$ and
$(\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ such that Eqs. (13)–(15) are
satisfied at $\hat{x}$.

Proof. From Theorem 5, there exist
$(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) \ge (0, 0)$ for $i \in M$ and
$(\underline{\mu}_j, \overline{\mu}_j) \in$ IR$^2$ such that Eqs. (13)–(15) are
satisfied at $\hat{x}$.
If possible, let us consider $(\underline{\mu}_i^{*}, \overline{\mu}_i^{*}) = (0, 0)$.
Then from Eqs. (13) and (15) we get

$\sum_{j=1}^{p} \mu_j\, G_j^{+}(H_{x,\hat{x}}(0)) \ge 0, \quad \forall x \in X, \quad\text{and}\quad (\underline{\mu}_j, \overline{\mu}_j) > (0, 0), \quad \forall j \in P. \qquad (26)$

Since $G_j(x)$ are gHDBCIFs and $\mu_j = (\underline{\mu}_j, \overline{\mu}_j) > (0, 0)$,
we have

$\sum_{j=1}^{p} \mu_j\, G_j^{+}(H_{\hat{x},x^{*}}(0)) \le^{U}_{L} b^{+}(\hat{x}, x^{*}) \sum_{j=1}^{p} \mu_j\,\big[G_j(x^{*}) \ominus_{gH} G_j(\hat{x})\big]. \qquad (27)$

Therefore, from Eqs. (14), (26) and (27), we get

$b^{+}(\hat{x}, x^{*}) \sum_{j=1}^{p} \mu_j\, G_j(x^{*}) \ge^{U}_{L} [0, 0]. \qquad (28)$

Since $b^{+}(\hat{x}, x^{*}) > 0$, (28) contradicts

$\sum_{j=1}^{p} \mu_j\, G_j(x^{*}) <^{U}_{L} [0, 0] \quad\text{and}\quad (\underline{\mu}_j, \overline{\mu}_j) \ge (0, 0),$

which implies $(\underline{\mu}^{*}, \overline{\mu}^{*}) > (0, 0)$; hence the proof
of the theorem.

4 Conclusions
The classes of BCIFs and SBCIFs on AC sets have been studied. The right
gH-derivative with respect to the Hukuhara difference between two closed
intervals has been introduced for BCIFs, and its properties have been
investigated under the right gH-derivative for a globally optimal solution.
The Fritz-John kind and Karush-Kuhn-Tucker kind necessary weakly LU-efficiency
conditions are obtained for NIVP involving gHBCIFs. To the best of our
knowledge, these necessary weakly LU-efficiency conditions are new in the area
of NIVP under the class of differentiable BCIFs and SBCIFs. Many practical
problems can be treated by interval optimization, including the electric
energy market [12], the sinter cooling process [15], etc. The work can be
further extended to establish sufficient optimality conditions and duality
results.

References
1. Antczak, T.: Optimality conditions and duality results for nonsmooth vector opti-
mization problems with the multiple interval-valued objective function. Acta Math.
Sci. 37B(4), 1133–1150 (2017)
2. Avriel, M., Zang, I.: Generalized arcwise-connected functions and characterization
of local-global minimum properties. J. Optim. Theory Appl. 32(4), 407–425 (1980)
3. Bhatia, D., Mehra, A.: Optimality conditions and duality involving arcwise con-
nected and generalized arcwise connected functions. J. Optim. Theory Appl.
100(1), 181–194 (1999)
4. Cambini, A., Martein, L.: Generalized Convexity and Optimization. Lecture Notes
in Economics and Mathematical Systems, vol. 616. Springer, Berlin (2009)
5. Chalco-Cano, Y., Rufián-Lizana, A., Román-Flores, H., Jiménez-Gamero, M.D.:
Calculus for interval-valued functions using generalized Hukuhara derivative and
applications. Fuzzy Sets Syst. 219, 49–67 (2013)
6. Davar, S., Mehra, A.: Optimality and duality for fractional programming problems
involving arcwise connected functions and their generalizations. J. Math. Anal.
Appl. 263(2), 666–682 (2001)
7. Hukuhara, M.: Integration des applications mesurables dont la valeur est un com-
pact convexe. Funkc. Ekvacioj 10, 205–223 (1967)
8. Jana, M., Panda, G.: Solution of nonlinear interval vector optimization problem.
Oper. Res. Int. J. 14(1), 71–85 (2014)
9. Li, L., Liu, S., Zhang, J.: On interval-valued invex mappings and optimality condi-
tions for interval-valued optimization problems. J. Inequal. Appl. 179, 2–19 (2015)
10. Osuna-Gómez, R., Chalco-Cano, Y., Hernández-Jiménez, B., Ruiz-Garzón, G.:
Optimality conditions for generalized differentiable interval-valued functions. Inf.
Sci. 321, 136–146 (2015)
11. Osuna-Gómez, R., Hernández-Jiménez, B., Chalco-Cano, Y., Ruiz-Garzón, G.:
New efficiency conditions for multiobjective interval-valued programming prob-
lems. Inf. Sci. 420, 235–248 (2017)
12. Saric, A.T., Stankovic, A.M.: An application of interval analysis and optimization
to electric energy markets. IEEE Trans. Power Syst. 21(2), 515–523 (2006)
13. Singh, C.: Elementary properties of arcwise connected sets and functions. J. Optim.
Theory Appl. 41(2), 377–387 (1983)
14. Stefanini, L., Bede, B.: Generalized Hukuhara differentiability of interval-valued
functions and interval differential equations. Nonlinear Anal. 71(3–4), 1311–1328
(2009)
15. Tian, W., Ni, B., Jiang, C., Wu, Z.: Uncertainty analysis and optimization of sinter
cooling process for waste heat recovery. Appl. Therm. Eng. 150(1), 111–120 (2019)
16. Wang, H., Zhang, R.: Optimality conditions and duality for arcwise connected
interval optimization problems. Opsearch 52(4), 870–883 (2015)

17. Wu, H.C.: On interval-valued nonlinear programming problems. J. Math. Anal.


Appl. 338(1), 299–316 (2008)
18. Wu, H.C.: Wolfe duality for interval-valued optimization. J. Optim. Theory Appl.
138(3), 497–509 (2008)
19. Wu, H.C.: The Karush-Kuhn-Tucker optimality conditions in multiobjective pro-
gramming problems with interval-valued objective functions. Eur. J. Oper. Res.
196(1), 49–60 (2009)
20. Zhang, Q.: Optimality conditions and duality for semi-infinite programming involv-
ing B-arcwise connected functions. J. Glob. Optim. 45(4), 615–629 (2009)
On the Applications of Nonsmooth
Vector Optimization Problems to Solve
Generalized Vector Variational
Inequalities Using Convexificators

Balendu Bhooshan Upadhyay1(B) , Priyanka Mishra1 , Ram N. Mohapatra2 ,


and Shashi Kant Mishra3
1
Department of Mathematics, Indian Institute of Technology Patna,
Patna 801106, India
bhooshan@iitp.ac.in
2
University of Central Florida, Orlando, FL 32816, USA
ram.mohapatra@ucf.edu
3
Department of Mathematics, Institute of Science, Banaras Hindu University,
Varanasi 221005, India
bhu.skmishra@gmail.com

Abstract. In this paper, we employ the characterization of an approximate
convex function in terms of its convexificator to establish relationships
between the solutions of Stampacchia type vector variational inequality
problems, stated in terms of convexificators, and quasi efficient solutions of
nonsmooth vector optimization problems involving locally Lipschitz functions.
We identify the vector critical points, the weak quasi efficient
points and the solutions of the weak vector variational inequality prob-
lem under generalized approximate convexity assumptions. The results
of the paper extend, unify and sharpen corresponding results in the lit-
erature. In particular, this work extends and generalizes earlier works by
Giannessi [11], Upadhyay et al. [31], Osuna-Gomez et al. [30], to a wider
class of functions, namely the nonsmooth approximate convex functions
and its generalizations. Moreover, this work sharpens earlier work by
Daniilidis and Georgiev [5] and Mishra and Upadhyay [23], to a more
general class of subdifferentials known as convexificators.

Keywords: 49J15 · 58E17 · 58E35

1 Introduction
Nonsmooth phenomena occur naturally and frequently in optimization theory,
which led to the development of several notions of generalized directional deriva-
tives and subdifferentials. The notion of convexificator is a generalization of
some notions of known subdifferentials such as the subdifferentials of Clarke
[4], Michel-Penot [22], and Mordukhovich [27]. (The first author is supported
by the Science and Engineering Research Board (SERB), Department of Science
and Technology, India, under the Early Career Research (ECR) advancement
scheme through grant no. ECR/2016/001961.) Convexificators are in general

[4], Michel-Penot [22], and Mordukhovich [27]. Convexificators are in general
closed sets unlike the well-known subdifferentials, which are convex and compact.
The notion of convex compact convexificator was introduced by Demyanov [7].
Demyanov and Jeyakumar [8] studied convexificators for positively homogeneous
and locally Lipschitz functions. Jeyakumar and Luc [16] introduced noncompact
convexificators, which provide upper convex and lower concave approximations
for a continuous function. In [17], Jeyakumar and Luc generalized the idea of
convexificators from real valued functions to continuous vector valued functions.
They introduced the notion of approximate Jacobian for a continuous vector
valued function and established second order necessary and sufficient optimality
conditions for problems with single C 1 objective function. Dutta and Chandra [9]
derived the characterizations for pseudoconvex and quasiconvex functions and
established necessary optimality conditions for an inequality constrained opti-
mization problem. The inter-relation between several constraint qualifications
and necessary and sufficient optimality conditions for nonsmooth multiobjective
optimization problems in terms of convexificators has been studied by Golestani
and Nobakhtian [13], Li and Zhang [18], Lu [20] and Long and Huang [19].
In optimization theory, convexity is a very important hypothesis as a local
solution of a minimization problem becomes a global solution in its presence.
To provide a more accurate representation, modeling and solutions of real world
problems, several generalizations of convex functions have been introduced. For
details about generalized convex functions and their properties, we refer to
Mishra and Upadhyay [24], Mishra et al. [26] and the references cited therein.
Ngai et al. [28] introduced the concept of approximate convex functions, which
are stable under the finite sum and finite suprema and also most of the known
subdifferentials, such as Clarke, Ioffe and Mordukhovich, coincide for these func-
tions. The main feature of this class of functions is that it includes the classes
of convex functions, weakly convex functions, strongly convex functions of order
m, m ≥ 1 and strictly convex functions. Daniilidis and Georgiev [5] and Ngai
and Penot [29] have studied the monotonicity properties of the Clarke subdiffer-
ential of locally Lipschitz approximate convex function and lower semicontinuous
approximate convex function, respectively. Recently, several generalizations of
approximate convex functions have been introduced using Clarke subdifferential
by Bhatia et al. [3] and Gupta et al. [14].
It is well-known that in most of the real world problems, the optimization
algorithms terminate in a finite number of steps and give only approximate
solutions, hence, it is useful to study the approximate solutions analytically as
well as computationally. Further, the concept of approximate solutions can be
considered as a satisfactory compromise to the efficient values of the objective
of a vector optimization problem with prescribed error, for details, we refer to
[6,15] and the references cited therein. The concept of approximate efficiency
may be given more flexibility by making the error dependent on the decision
variables. This led to the development of the idea of quasi efficiency. Recently,
several researchers have shown interests in the study of quasi efficient solutions
of vector optimization problems in finite as well as infinite dimensional spaces,


see [3,14,15] and the references cited therein. Dutta and Vetrivel [10] defined the
notion of weak quasi efficiency and obtained necessary and sufficient optimality
conditions for nonsmooth multiobjective optimization problems.
In 1980, Giannessi [11] has introduced the vector valued version of Stam-
pacchia variational inequality in finite dimensional Euclidean spaces and gave
some applications to alternative theorems. One important generalization of the
Minty type variational inequality is the Minty vector variational inequality,
see [12]. Recently, Stampacchia and Minty vector variational inequality prob-
lems have been used as an efficient tool to study vector optimization problems.
In fact, some recent works in vector optimization have shown that optimality
conditions for vector optimization problems can be characterized by vector vari-
ational inequalities, for both the smooth and the nonsmooth cases, we refer
to (see, e.g. [1,2,12,25,30]) and references cited therein. Yang and Zheng [33]
obtained the necessary and sufficient optimality conditions for a feasible point to
be an approximate solution of a vector variational inequality problem. Recently,
Mishra and Upadhyay [23] and Upadhyay et al. [31] have established the relation-
ships between vector variational inequalities and nonsmooth vector optimization
problems using the notion of quasi efficiency and strong efficiency, respectively.

2 Definitions and Preliminaries


Let $R^n$ be the n-dimensional Euclidean space, $R^n_{+}$ the nonnegative orthant
of $R^n$ and int$(R^n_{+})$ the positive orthant of $R^n$. Let
$\bar{R} := R \cup \{\infty\}$ denote the extended real line and let
$\langle \cdot, \cdot \rangle$ denote the Euclidean inner product. Let $[x, y]$
and $]x, y[$ denote the closed and open line segments joining x and y,
respectively. Let $K \subseteq R^n$ be a nonempty set equipped with the Euclidean
norm $\|\cdot\|$ and let co$(K)$ denote the convex hull of K.
Throughout the paper, for vectors $x, y \in R^n$, we adopt the following
conventions for inequalities:

$x > y \iff x_i > y_i,\ \forall i = 1, 2, \ldots, n \iff x - y \in \text{int}(R^n_{+});$
$x \geqq y \iff x_i \ge y_i,\ \forall i = 1, 2, \ldots, n \iff x - y \in R^n_{+};$
$x \ge y \iff x_i \ge y_i,\ \forall i = 1, 2, \ldots, n,\ \text{but } x \ne y \iff x - y \in R^n_{+}\setminus\{0\}.$

We recall the following definitions from Jeyakumar and Luc [16]:


Definition 1. Let $f : K \to \bar{R}$ be an extended real valued function,
$x \in K$ and $f(x)$ be finite. The lower and upper Dini derivatives of f at x
in the direction $v \in R^n$ are defined, respectively, as follows:

$f^{-}(x; v) := \liminf_{t \downarrow 0} \frac{f(x + tv) - f(x)}{t}; \qquad f^{+}(x; v) := \limsup_{t \downarrow 0} \frac{f(x + tv) - f(x)}{t}.$

Definition 2. Let $f : K \to \bar{R}$ be an extended real valued function,
$x \in K$ and $f(x)$ be finite.
(i) The function f is said to have an upper convexificator $\partial^{*}f(x) \subseteq R^n$
at $x \in K$ if and only if $\partial^{*}f(x)$ is closed and, for each $v \in R^n$, one has

$f^{-}(x; v) \le \sup_{x^{*} \in \partial^{*}f(x)} \langle x^{*}, v \rangle.$

(ii) The function f is said to have a lower convexificator $\partial_{*}f(x) \subseteq R^n$
at $x \in K$ if and only if $\partial_{*}f(x)$ is closed and, for each $v \in R^n$, one has

$f^{+}(x; v) \ge \inf_{x^{*} \in \partial_{*}f(x)} \langle x^{*}, v \rangle.$

(iii) The function f is said to have a convexificator $\partial_{*}^{*}f(x) \subseteq R^n$
at $x \in K$ if and only if $\partial_{*}^{*}f(x)$ is both an upper and a lower
convexificator of f at x. This means that, for each $v \in R^n$, one has

$f^{-}(x; v) \le \sup_{x^{*} \in \partial_{*}^{*}f(x)} \langle x^{*}, v \rangle, \qquad f^{+}(x; v) \ge \inf_{x^{*} \in \partial_{*}^{*}f(x)} \langle x^{*}, v \rangle.$

These definitions and properties can be extended to a locally Lipschitz vector-valued
function f : K → R^m. Denote by fi, i ∈ M := {1, 2, . . . , m}, the components of f.
The convexificator of f at x̄ is the set

∂∗∗f(x̄) := ∂∗∗f1(x̄) × · · · × ∂∗∗fm(x̄).

From now on, we assume that the set K is a nonempty, closed and convex set
unless otherwise specified.
In terms of convexificator, the notion of ∂∗∗ -convex functions may be defined
as follows:
Definition 3. Let f : K → R be a real-valued function. Suppose that f is locally
Lipschitz at x̄ ∈ K and admits a bounded convexificator ∂∗∗f(x̄) at x̄. Then f is
said to be a ∂∗∗-convex function at x̄ if and only if, for any x ∈ K and x∗ ∈ ∂∗∗f(x̄),
one has

f(x) − f(x̄) ≥ ⟨x∗, x − x̄⟩.
In terms of convexificators, we define the ∂∗∗ -approximate convex functions
as follows:
Definition 4. Let f : K → R be a real-valued function. Suppose that f is
locally Lipschitz at x̄ ∈ K and admits a bounded convexificator ∂∗∗f(x̄) at x̄. The
function f is said to be ∂∗∗-approximate convex (strictly ∂∗∗-approximate convex)
at x̄ if, for every α > 0, there exists δ > 0 such that, for all x∗ ∈ ∂∗∗f(x̄), one has

f(x) − f(x̄) ≥ (>) ⟨x∗, x − x̄⟩ − α‖x − x̄‖,    (1)

for all x ∈ B(x̄, δ) ∩ K (x ≠ x̄),
where B(x̄, δ) denotes the open ball with centre x̄ and radius δ. Moreover, we
say that f is ∂∗∗-approximate convex on K if it is ∂∗∗-approximate convex at every
x̄ ∈ K.

The following proposition from [25] gives necessary and sufficient conditions
for a locally Lipschitz function to be a ∂∗∗-approximate convex function.
Proposition 1. Let f : K → R be a locally Lipschitz function on K that, for
any x ∈ K, admits a bounded convexificator ∂∗∗f(x). Then f is ∂∗∗-approximate
convex at x̄ ∈ K if and only if, for all α > 0, there exists δ > 0 such that, for
any x, y ∈ B(x̄, δ) ∩ K and λ ∈ [0, 1], one has

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) + αλ(1 − λ)‖x − y‖.    (2)

Following on the lines of Proposition 1, we can establish the following result:

Proposition 2. Let f : K → R be a locally Lipschitz function on K that, for any
x ∈ K, admits a bounded convexificator ∂∗∗f(x). Then f is strictly ∂∗∗-approximate
convex at x̄ ∈ K if and only if, for any α > 0, there exists δ > 0 such that, for
any x, y ∈ B(x̄, δ) ∩ K with x ≠ y, and λ ∈ ]0, 1[, one has

f(λx + (1 − λ)y) < λf(x) + (1 − λ)f(y) + αλ(1 − λ)‖x − y‖.

This result led to the following generalization of the ∂∗∗ -approximate convex
functions:
Definition 5. Let f : K → R be a locally Lipschitz function on K that admits a
bounded convexificator ∂∗∗f(x̄) at x̄. The function f is said to be a ∂∗∗-approximate
pseudoconvex function of type I at x̄ ∈ K if, for all α > 0, there exists δ > 0 such
that, for all x ∈ B(x̄, δ) ∩ K, one has

∃ x∗ ∈ ∂∗∗f(x̄) : ⟨x∗, x − x̄⟩ ≥ 0  ⇒  f(x) − f(x̄) ≥ −α‖x − x̄‖;

or, equivalently,

f(x) < f(x̄) − α‖x − x̄‖  ⇒  ⟨x∗, x − x̄⟩ < 0, ∀ x∗ ∈ ∂∗∗f(x̄).

We consider the following nonsmooth vector optimization problem:

(NVOP)   min f(x) := (f1(x), f2(x), . . . , fm(x))

subject to x ∈ K,
where fi : K → R, i ∈ M are non-differentiable, locally Lipschitz, and bounded
below functions on K.
The notions of local quasi efficient and local weak quasi efficient solutions to
the (NVOP) may be defined as follows, see [1,11].
Definition 6. A point x̄ ∈ K is said to be a local quasi efficient solution to the
(NVOP) if there exist α ∈ int(R^m_+) and a neighbourhood U of x̄ such that, for
any x ∈ K ∩ U, one has

f(x) − f(x̄) + α‖x − x̄‖ = (f1(x) − f1(x̄) + α1‖x − x̄‖, . . . ,
fm(x) − fm(x̄) + αm‖x − x̄‖) ∉ −R^m_+ \ {0}.

Definition 7. A point x̄ ∈ K is said to be a local weak quasi efficient solution
to the (NVOP) if there exist α ∈ int(R^m_+) and a neighbourhood U of x̄ such
that, for any x ∈ K ∩ U, one has

f(x) − f(x̄) + α‖x − x̄‖ = (f1(x) − f1(x̄) + α1‖x − x̄‖, . . . ,
fm(x) − fm(x̄) + αm‖x − x̄‖) ∉ −int(R^m_+).

Remark 1. It is obvious from the definitions that every local efficient solution (local
weak efficient solution) is a local quasi efficient solution (local weak quasi efficient
solution) to the (NVOP); the converse may not be true, see [3] and Mishra
and Upadhyay [23].

We consider the following Stampacchia type vector variational inequality
problems in terms of convexificators:

(∂∗∗-SVVIP) Find a point x̄ ∈ K such that, for any x∗i ∈ ∂∗∗fi(x̄), i ∈ M, there
exists no x ∈ K with

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∈ −R^m_+ \ {0};

(∂∗∗-WSVVIP) Find a point x̄ ∈ K such that, for any x∗i ∈ ∂∗∗fi(x̄), i ∈ M, there
exists no x ∈ K with

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∈ −int(R^m_+).

3 Relationships Between Vector Variational Inequality


Problems and Nonsmooth Vector Optimization
Problem

In this section, employing the tools of convexificators and the characterizations
of ∂∗∗-approximate convex functions and their generalizations, we establish
relationships between the vector variational inequality problems (∂∗∗-SVVIP) and
(∂∗∗-WSVVIP) and the nonsmooth vector optimization problem (NVOP) using the
notions of local quasi efficiency and local weak quasi efficiency.
Theorem 1. Let K be a nonempty convex subset of R^n and let f : R^n → R^m
be a vector-valued function such that each fi : R^n → R, i ∈ M, is locally Lipschitz
on K and, for any x ∈ K, admits a bounded convexificator ∂∗∗fi(x).
Let each fi, i ∈ M, be ∂∗∗-approximate convex at x̄ ∈ K. If x̄ solves the
(∂∗∗-SVVIP), then x̄ is a local quasi efficient solution to the (NVOP).
Proof. Suppose that x̄ is not a local quasi efficient solution to the (NVOP). Then,
for any α ∈ int(R^m_+) and any δ > 0, there exists x ∈ B(x̄, δ) ∩ K such that

f(x) − f(x̄) + α‖x − x̄‖ ∈ −R^m_+ \ {0},

that is, for all i ∈ M,

fi(x) ≤ fi(x̄) − αi‖x − x̄‖,

with strict inequality for some k ∈ M.
Invoking the ∂∗∗-approximate convexity of the fi at x̄, in particular for αi > 0,
i ∈ M, there exist δ > 0 and x ∈ B(x̄, δ) ∩ K such that

⟨x∗i, x − x̄⟩ ≤ 0, ∀ x∗i ∈ ∂∗∗fi(x̄),

with strict inequality for some k ∈ M.
Therefore, for all x∗i ∈ ∂∗∗fi(x̄), i ∈ M, there exists x ∈ K such that

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∈ −R^m_+ \ {0}.

Hence, x̄ cannot be a solution to the (∂∗∗-SVVIP), which is a contradiction.


Remark 2. We note that, (∂∗∗ -SVVIP) is not a necessary optimality condition
for a local quasi efficient solution to the (NVOP). This fact may be illustrated
with the help of the following example.
We consider the following nonsmooth vector optimization problem (P):

(P)   min f(x) := (f1(x), f2(x)),  subject to x ∈ K := [−π, π],

where f1, f2 : K → R are given by

f1(x) = −x if −π ≤ x ≤ 0,    f1(x) = x^3 − x^2 if 0 ≤ x ≤ π,
f2(x) = −2x if −π ≤ x ≤ 0,   f2(x) = −sin x if 0 ≤ x ≤ π.

Evidently, the function f = (f1, f2) is ∂∗∗-approximate convex at x̄ = 0: for every
α = (α1, α2) ∈ int(R²_+), one may take
δ = min( π/3, (−1 + √(1 + 1.5α1))/3, (−1 + √(1 + 1.5α2))/3 ) > 0.
Further, x̄ = 0 is a local quasi efficient solution of (P), since for every
α = (α1, α2) ∈ int(R²_+) there exists
δ = min( π/3, (−1 + √(1 + 1.5α1))/3, (−1 + √(1 + 1.5α2))/3 ) > 0 such that the condition

f(x) ≤ f(x̄) − α‖x − x̄‖

cannot hold. Let x̄ = 0; then, for x = 1/3, one has

(⟨x∗1, x − x̄⟩, ⟨x∗2, x − x̄⟩) ≤ (0, 0), ∀ x∗1 ∈ ∂∗∗f1(x̄) = {−2, 0}, x∗2 ∈ ∂∗∗f2(x̄) = {−2, −1}.

Hence, x̄ is not a solution of the (∂∗∗-SVVIP).


However, for the weak vector variational inequality problem (∂∗∗ -WSVVIP),
the following necessary and sufficient optimality condition for local weak quasi
efficiency holds.
Theorem 2. Let each fi : K → R, i ∈ M, be locally Lipschitz on K and, for any
x ∈ K, admit a bounded convexificator ∂∗∗fi(x). If x̄ ∈ K is a local
weak quasi efficient solution to the (NVOP), then x̄ solves the (∂∗∗-WSVVIP).
Conversely, if each fi, i ∈ M, is ∂∗∗-approximate convex at x̄ ∈ K
and x̄ solves the (∂∗∗-WSVVIP), then x̄ is a local weak quasi efficient solution to
the (NVOP).

Proof. Let x̄ ∈ K be a local weak quasi efficient solution to the (NVOP). Hence,
there exist α ∈ int(R^m_+) and δ > 0 such that, for any x ∈ B(x̄, δ) ∩ K, one has

f(x) − f(x̄) + α‖x − x̄‖ ∉ −int(R^m_+).

Since K is a convex set, for any t ∈ [0, 1] and x ∈ B(x̄, δ) ∩ K, we have
x̄ + t(x − x̄) ∈ B(x̄, δ) ∩ K. Therefore, for any α ∈ int(R^m_+), 0 < t < 1 and
x ∈ B(x̄, δ) ∩ K, it follows that

[f(x̄ + t(x − x̄)) − f(x̄) + αt‖x − x̄‖] / t ∉ −int(R^m_+).

Taking the limit inferior as t ↓ 0, for any x ∈ B(x̄, δ) ∩ K, we get

f^−(x̄; x − x̄) := (f1^−(x̄; x − x̄), . . . , fm^−(x̄; x − x̄)) ∉ −int(R^m_+).

Since each fi admits a bounded convexificator ∂∗∗fi(x̄), i ∈ M, for any x ∈
B(x̄, δ) ∩ K we infer that

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∉ −int(R^m_+), ∀ x∗i ∈ ∂∗∗fi(x̄), i ∈ M.

Hence, x̄ is a solution of the (∂∗∗-WSVVIP).

Conversely, suppose to the contrary that x̄ is not a local weak quasi efficient
solution to the (NVOP). Then, for any α ∈ int(R^m_+) and any δ > 0, there exists
x ∈ B(x̄, δ) ∩ K such that

f(x) − f(x̄) + α‖x − x̄‖ ∈ −int(R^m_+),

that is, for each i ∈ M,

fi(x) − fi(x̄) + αi‖x − x̄‖ < 0.

Since each fi, i ∈ M, is ∂∗∗-approximate convex at x̄, in particular for αi > 0
there exist δ > 0 and x ∈ B(x̄, δ) ∩ K such that

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∈ −int(R^m_+), ∀ x∗i ∈ ∂∗∗fi(x̄), i ∈ M.

Therefore, x̄ cannot be a solution to the (∂∗∗-WSVVIP). This contradiction leads
to the result.
Theorem 3. Let each fi : K → R, i ∈ M, be locally Lipschitz on K and, for any
x ∈ K, admit a bounded convexificator ∂∗∗fi(x). If each fi, i ∈ M, is strictly
∂∗∗-approximate convex at x̄ ∈ K and x̄ is a local weak quasi efficient solution
to the (NVOP), then x̄ is a local quasi efficient solution to the (NVOP).
Proof. Suppose that x̄ is a local weak quasi efficient solution to the (NVOP)
but not a local quasi efficient solution to the (NVOP). Hence, for any α ∈ int(R^m_+)
and any δ > 0, there exists x ∈ B(x̄, δ) ∩ K such that

f(x) − f(x̄) + α‖x − x̄‖ ∈ −R^m_+ \ {0}.

That is, for each i ∈ M,

fi(x) − fi(x̄) + αi‖x − x̄‖ ≤ 0,

with strict inequality for some k ∈ M.
Therefore, in particular, by the strict ∂∗∗-approximate convexity of fi, i ∈ M,
at x̄, for any αi > 0 there exist δ > 0 and x ∈ B(x̄, δ) ∩ K, x ≠ x̄, such that

0 ≥ fi(x) − fi(x̄) + αi‖x − x̄‖ > ⟨x∗i, x − x̄⟩, ∀ x∗i ∈ ∂∗∗fi(x̄), i ∈ M,

which implies that there exists x ∈ K such that

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) ∈ −int(R^m_+), ∀ x∗i ∈ ∂∗∗fi(x̄).

Therefore, x̄ cannot be a solution to the (∂∗∗-WSVVIP). Then, by Theorem 2,
we get a contradiction. Hence, x̄ is a local quasi efficient solution to the
(NVOP).
The following definition is a simple extension of the concept of vector critical
point for the differentiable case by Osuna-Gomez et al. [30] to the nonsmooth
case in terms of convexificators.
Definition 8. A feasible solution x̄ ∈ K is said to be a vector critical point to the
(NVOP) if there exist x∗ ∈ ∂∗∗f(x̄) and a vector λ ∈ R^m with λ ≥ 0 such that

λ^T x∗ = 0.

Lemma 1. Let x̄ ∈ K be a vector critical point to the (NVOP), let each fi :
R^n → R, i ∈ M, be locally Lipschitz on K and, for any x ∈ K, admit a bounded
convexificator ∂∗∗fi(x), and let each fi, i ∈ M, be ∂∗∗-approximate
pseudoconvex of type I at x̄ ∈ K. Then x̄ is a local weak quasi efficient
solution to the (NVOP).
Proof. Suppose x̄ is a vector critical point to the (NVOP), that is, there exist
a vector λ ∈ R^m with λ ≥ 0 and a vector x∗ ∈ ∂∗∗f(x̄) such that

λ^T x∗ = 0.

If x̄ is not a local weak quasi efficient solution to the (NVOP), then for any
α ∈ int(R^m_+) and any δ > 0 there exists x ∈ B(x̄, δ) ∩ K such that

f(x) − f(x̄) + α‖x − x̄‖ ∈ −int(R^m_+),

that is, for each i ∈ M,

fi(x) < fi(x̄) − αi‖x − x̄‖.

By our assumption, each fi, i ∈ M, is ∂∗∗-approximate pseudoconvex of type
I at x̄ ∈ K; therefore, in particular for αi > 0 and δ > 0, there exists x ∈
B(x̄, δ) ∩ K, x ≠ x̄, such that

⟨x∗i, x − x̄⟩ < 0, ∀ x∗i ∈ ∂∗∗fi(x̄), i ∈ M,

that is,

(⟨x∗1, x − x̄⟩, . . . , ⟨x∗m, x − x̄⟩) < 0, ∀ x∗i ∈ ∂∗∗fi(x̄).

Employing the Gordan theorem of the alternative (see [21]), for every x∗ ∈ ∂∗∗f(x̄)
the system

λ^T x∗ = 0, λ ≥ 0, λ ∈ R^m,

has no solution λ, which contradicts our assumption that x̄ is a
vector critical point to the (NVOP).
Theorem 4. Let each fi : K → R, i ∈ M, be locally Lipschitz on K and, for any
x ∈ K, admit a bounded convexificator ∂∗∗fi(x). Every vector critical
point to the (NVOP) is a local weak quasi efficient solution to the (NVOP) if
and only if each fi, i ∈ M, is ∂∗∗-approximate pseudoconvex of
type I at that point.
Proof. The sufficiency follows from Lemma 1. We only need to prove
that if every vector critical point to the (NVOP) is a local weak quasi efficient
solution to the (NVOP), then the functions fi, i ∈ M, fulfill the
∂∗∗-approximate pseudoconvexity of type I condition at that point.
Let x̄ be a local weak quasi efficient solution to the (NVOP). Then there exist
α ∈ int(R^m_+) and δ > 0 such that, for any x ∈ B(x̄, δ) ∩ K, one has

f(x) − f(x̄) + α‖x − x̄‖ ∉ −int(R^m_+).

Therefore, the following system (for all i ∈ M)

fi(x) − fi(x̄) < −αi‖x − x̄‖    (3)

has no solution x ∈ B(x̄, δ) ∩ K.
On the other hand, if x̄ is a vector critical point to the (NVOP), then there
exist a vector λ ≥ 0, λ ∈ R^m, and x∗ ∈ ∂∗∗f(x̄) such that

λ^T x∗ = 0.

Employing the Gordan theorem of the alternative (see [21]), there exist x∗i ∈
∂∗∗fi(x̄), i ∈ M, such that the system

x∗ u := (⟨x∗1, u⟩, . . . , ⟨x∗m, u⟩) < 0

has no solution u ∈ R^n. Therefore, the following system

(⟨x∗1, u⟩, . . . , ⟨x∗m, u⟩) < 0, ∀ x∗i ∈ ∂∗∗fi(x̄),    (4)

has no solution u ∈ R^n. Hence, the systems (3) and (4) are equivalent. Therefore,
if there exist α ∈ int(R^m_+), δ > 0 and x ∈ B(x̄, δ) ∩ K, a solution of (3), such
that

fi(x) − fi(x̄) < −αi‖x − x̄‖, ∀ i ∈ M,

then there exists (x − x̄) ∈ R^n, a solution of (4), such that

⟨x∗i, x − x̄⟩ < 0, ∀ x∗i ∈ ∂∗∗fi(x̄), ∀ i ∈ M.

This is precisely the ∂∗∗-approximate pseudoconvexity of type I condition for
fi, i ∈ M, at x̄.

References
1. Ansari, Q.H., Lee, G.M.: Nonsmooth vector optimization problems and Minty vec-
tor variational inequalities. J. Optim. Theory Appl. 145, 1–16 (2010)
2. Al-Homidan, S., Ansari, Q.H.: Generalized Minty vector variational like inequalities
and vector optimization problems. J. Optim. Theory Appl. 144, 1–11 (2010)
3. Bhatia, D., Gupta, A., Arora, P.: Optimality via generalized approximate convexity
and quasiefficiency. Optim. Lett. 7, 127–135 (2013)
4. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley-Interscience, New
York (1983)
5. Daniilidis, A., Georgiev, P.: Approximate convexity and submonotonicity. J. Math.
Anal. Appl. 291, 292–301 (2004)
6. Deng, S.: On approximate solutions in convex vector optimization. SIAM J. Control
Optim. 35, 2128–2136 (1997)
7. Demyanov, V.F.: Convexification and Concavification of Positively Homogeneous
Functions by the Same Family of Linear Functions. Report 3.208.802 Universita
di Pisa (1994)
8. Demyanov, V.F., Jeyakumar, V.: Hunting for a smaller convex subdifferential. J.
Global Optim. 10, 305–326 (1997)
9. Dutta, J., Chandra, S.: Convexificators, generalized convexity and vector optimiza-
tion. Optimization 53, 77–94 (2004)
10. Dutta, J., Vetrivel, V.: On approximate minima in vector optimization. Numer.
Funct. Anal. Optim. 22, 845–859 (2001)
11. Giannessi, F.: Theorems of the alternative, quadratic programming and complementarity
problems. In: Cottle, R.W., Giannessi, F., Lions, J.L. (eds.) Variational
Inequalities and Complementarity Problems, pp. 151–186. Wiley, New York (1980)
12. Giannessi, F.: On Minty variational principle. In: Giannessi, F., Komlósi, S.,
Rapcsák, T. (eds.) New Trends in Mathematical Programming, pp. 93–99. Kluwer
Academic Publishers, Dordrecht, Netherland (1997)
13. Golestani, M., Nobakhtian, S.: Convexificator and strong Kuhn-Tucker conditions.
Comput. Math. Appl. 64, 550–557 (2012)
14. Gupta, A., Mehra, A., Bhatia, D.: Approximate convexity in vector optimization.
Bull. Austral. Math. Soc. 74, 207–218 (2006)
15. Gupta, D., Mehra, A.: Two types of approximate saddle points. Numer. Funct.
Anal. Optim. 29, 532–550 (2008)
16. Jeyakumar, V., Luc, D.T.: Nonsmooth calculus, minimality, and monotonicity of
convexificators. J. Optim. Theory Appl. 101(3), 599–621 (1999)
17. Jeyakumar, V., Luc, D.T.: Approximate Jacobian matrices for nonsmooth contin-
uous maps and C 1 -optimization. SIAM J. Control Optim. 36, 1815–1832 (1998)
18. Li, X.F., Zhang, J.Z.: Stronger Kuhn-Tucker type conditions in nonsmooth multi-
objective optimization: locally Lipschitz case. J. Optim. Theory Appl. 127, 367–
388 (2005)
19. Long, X.J., Huang, N.J.: Optimality conditions for efficiency on nonsmooth mul-
tiobjective programming problems. Taiwanese J. Math. 18, 687–699 (2014)
20. Luu, D.V.: Convexificators and necessary conditions for efficiency. Optimization
63, 321–335 (2013)
21. Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)
22. Michel, P., Penot, J.P.: A generalized derivative for calm and stable functions.
Differ. Integral Equ. 5, 433–454 (1992)

23. Mishra, S.K., Upadhyay, B.B.: Some relations between vector variational inequal-
ity problems and nonsmooth vector optimization problems using quasi efficiency.
Positivity 17, 1071–1083 (2013)
24. Mishra, S.K., Upadhyay, B.B.: Pseudolinear Functions and Optimization. Taylor
and Francis (2014)
25. Upadhyay, B.B., Mohapatra, R.N.: On approximate convex functions and sub-
monotone operators using convexificators. J. Nonlinear Convex Anal. (2018) (sub-
mitted)
26. Mishra, S.K., Wang, S.Y., Lai, K.K.: Generalized Convexity and Vector Optimiza-
tion. Nonconvex Optimization and Its Applications. Springer, Berlin (2009)
27. Mordukhovich, B.S., Shao, Y.H.: On nonconvex subdifferential calculus in Banach
spaces. J. Convex Anal. 2, 211–227 (1995)
28. Ngai, H.V., Luc, D.T., Thera, M.: Approximate convex functions. J. Nonlinear
Convex Anal. 1, 155–176 (2000)
29. Ngai, H.V., Penot, J.P.: Approximate convex functions and approximately mono-
tone operators. Nonlinear Anal. 66, 547–564 (2007)
30. Osuna-Gomez, R., Rufian-Lizana, A., Ruiz-Canales, P.: Invex functions and gen-
eralized convexity in multiobjective programming. J. Optim. Theory Appl. 98,
651–661 (1998)
31. Upadhyay, B.B., Mohapatra, R.N., Mishra, S.K.: On relationships between vec-
tor variational inequality and nonsmooth vector optimization problems via strict
minimizers. Adv. Nonlinear Var. Inequalities 20, 1–12 (2017)
32. Yang, X.M., Yang, X.Q., Teo, K.L.: Some remarks on the Minty vector variational
inequality. J. Optim. Theory Appl. 121, 193–201 (1994)
33. Yang, X.Q., Zheng, X.Y.: Approximate solutions and optimality conditions of vec-
tor variational inequalities in Banach spaces. J. Glob. Optim. 40, 455–462 (2008)
SOP-Hybrid: A Parallel Surrogate-Based
Candidate Search Algorithm
for Expensive Optimization on Large
Parallel Clusters

Taimoor Akhtar1(B) and Christine A. Shoemaker2,3


1
Environmental Research Institute, National University of Singapore,
Singapore, Singapore
erita@nus.edu.sg
2
Department of Industrial Systems Engineering and Management,
National University of Singapore, Singapore, Singapore
shoemaker@nus.edu.sg
3
Department of Civil and Environmental Engineering,
National University of Singapore, Singapore, Singapore

Abstract. Efficient parallel algorithm designs and surrogate models are


powerful tools that can significantly increase the efficiency of stochastic
metaheursitics with application to computationally expensive optimiza-
tion problems. This paper introduces SOP-Hybrid, a synchronous par-
allel surrogate-based global optimization algorithm, designed for com-
putationally expensive problems. SOP-Hybrid is a modification of the
Surrogate Optimization with Pareto center selection (SOP) algorithm,
designed to achieve better synchronous parallel optimization efficiency
when a large number of cores are available. The original SOP was built
on the idea of visualizing the exploration-exploitation trade-off of itera-
tive surrogate optimization as a multi-objective problem, and was exper-
imentally effective for up to 32 processors. SOP-Hybrid modifies SOP by
visualizing the exploration-exploitation trade-off at two levels, i.e. (i) at
the global level and as a multi-objective problem (like SOP) and (ii) at
the local level, via an acquisition function. Both SOP and SOP-Hybrid
use Radial Basis Functions (RBFs) as surrogates. Results on test prob-
lems indicate that SOP-Hybrid is more efficient than SOP with 48 simul-
taneous synchronous evaluations. SOP was previously shown to be more
efficient than Parallel Stochastic RBF and ESGRBF with 32 simultane-
ous synchronous evaluations.

Keywords: Expensive functions · Meta-models · Parallel optimization

This work was partially supported by the Singapore National Research Foundation,
Prime Minister’s Office, Singapore under its Campus for Research Excellence and Tech-
nological Enterprise (CREATE) programme (E2S2-CREATE project CS-B) and by
Prof. Shoemaker’s NUS startup grant.
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 672–680, 2020.
https://doi.org/10.1007/978-3-030-21803-4_67

1 Introduction
There is a growing need for global optimization methods for expensive objective
functions, especially those based on computer simulation models. The monumen-
tal increase in computational power over the last few decades has also resulted
in a massive increase in complexity and computational intensity of simulation
models. For instance, state-of-the-art environmental and hydrodynamic simu-
lation models solve Partial Differential Equation (PDE) systems that require
considerable computational time and resources. A single training run of a Deep
Neural Networks (DNNs) can take hours, or even days on a GPU. Global opti-
mization of continuous and computationally expensive black-box functions that
are derived from such models, is thus, extremely challenging [8].
Deterministic programming methods [13] and stochastic metaheuristics are
the two prevalent classes of algorithms that have been used in the past for
expensive global optimization. A recent comprehensive algorithm comparison
[12] showed that both deterministic and stochastic algorithm classes are com-
petitive, with stochastic methods performing better on smaller evaluation bud-
gets, and deterministic methods performing better on relatively larger evaluation
budgets.
When the evaluation budget is limited, optimization efficiency of stochas-
tic algorithms can be enhanced significantly (i) by using cheap approximation
functions as surrogate models during optimization and (ii) incorporating par-
allelization by proposing multiple points for expensive evaluations during the
iterative stochastic optimization process.
Prior work on the use of surrogates for expensive optimization is dominated
by iterative algorithmic frameworks where, in each algorithm iteration, surro-
gates are (i) used to propose new points for evaluation and (ii) subsequently
updated after new points are evaluated. This general framework is also called
Sequential Model-based Optimization (SMBO) [4]. Gaussian Processes (GP)
have been the most popular choice as surrogates for the SMBO framework [6,14].
However, many recent studies show that Radial Basis Functions (RBFs) can be
more effective as surrogates, than GPs, for the SMBO framework [5,9,11].
A critical component of any SMBO algorithm is the mechanism for proposing
new points for expensive evaluations. If such mechanism allows multiple points
to be proposed for evaluation in each algorithm iteration, expensive evaluations
of the multiple points can be executed in parallel. Parallel variants of both GP
[14] and RBF-based algorithms [7,10] have been proposed in the past, with the
focus on balancing the trade-off between exploration and exploitation during the
process of proposing multiple points for expensive evaluations. However, most
of these algorithms have been applied to situations where only a few points are
proposed for evaluation in each algorithm iteration. For instance, Snoek et al.
[14] test their GP-based algorithm with up to only 10 parallel evaluations.
Given the availability of many cores on large computing clusters, it is impor-
tant to explore the efficiency of surrogate algorithms when many points are
proposed for parallel evaluation in each algorithm iteration. This study explores
the efficiency of the SOP algorithm [7] in this regard, and subsequently proposes

a modified version of SOP called SOP-Hybrid, that is designed for parallel opti-
mization of expensive functions with availability of many cores (approximately
50 or more).

2 The SOP Algorithm


2.1 Synchronous Model-Based Optimization Framework

Our proposed algorithm “SOP-Hybrid” uses elements of a prior algorithm SOP


published in [7]. Surrogate Optimization with Pareto center selection (SOP)
follows the general synchronous model-based optimization framework depicted
in Algorithm 1 below. As is evident in Algorithm 1, SOP's general framework
has three core components, i.e., experimental design (step 1), choice of surrogate
model (step 4) and methodology for selecting new points (step 5). SOP uses a
Symmetric Latin Hypercube as the initial design methodology (step 1) and RBFs
(fitted in step 4) as surrogates. Step 5 of Algorithm 1 is where SOP's design
is unique, and Sects. 2.2 and 2.3 explain this step in more detail.

Algorithm 1 Synchronous Model-based Optimization Framework


1: Generate initial points via experimental design
2: Evaluate the initial points (in parallel)
3: while Evaluation Budget not exceeded do
4: Fit/update surrogate model, given evaluated points
5: Select new points for evaluation, using surrogate model
6: Evaluate new points (synchronously)
7: end while
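To make the framework concrete, the following minimal Python sketch mirrors the synchronous loop of Algorithm 1. It is only an illustration: the helper names (symmetric Latin hypercube design, surrogate fitting, point selection) are placeholders passed in as callables rather than pySOT's actual API, and the expensive evaluations are parallelized with a process pool.

from concurrent.futures import ProcessPoolExecutor

def synchronous_smbo(objective, experimental_design, fit_surrogate,
                     select_new_points, eval_budget, batch_size, workers):
    """Sketch of the synchronous model-based optimization loop (Algorithm 1).
    All helper callables are placeholders standing in for SOP's components."""
    X = experimental_design()                        # step 1: initial points (list)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        y = list(pool.map(objective, X))             # step 2: evaluate in parallel
        while len(X) < eval_budget:                  # step 3: budget check
            surrogate = fit_surrogate(X, y)          # step 4: fit/update surrogate
            new_pts = select_new_points(surrogate, X, y, batch_size)   # step 5
            new_vals = list(pool.map(objective, new_pts))              # step 6
            X.extend(new_pts)
            y.extend(new_vals)
    best = min(range(len(y)), key=lambda i: y[i])
    return X[best], y[best]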

2.2 A Multi-objective View of the Exploration-Exploitation


Tradeoff

Selection of new points for expensive evaluations in SMBO frameworks, typi-


cally involves a methodology that attempts to balance between exploration and
exploitation, where exploration implies selection of points from previously unex-
plored regions of the decision space, and exploitation implies selection based on
surrogate approximation. The Expected Improvement acquisition function, for
instance, is designed (for GPs only) to balance this trade-off.
SOP visualizes the exploration-exploitation trade-off as a multi-objective
problem in itself, by ranking all previously evaluated points (via non-dominated
sorting [1]), as per the following multi-objective formulation:

min_{x ∈ S^(n)} [F1(x), F2(x)],    (1)

where F1(x) = f(x) and F2(x) = −min_{s ∈ S^(n)\{x}} ‖s − x‖, and S^(n) is the set of
the n points evaluated so far. F1(x) denotes the expensive objective function
value, and F2(x) is the negative of the minimum distance of an evaluated point from
the other evaluated points, so minimizing F2 favours isolated evaluated points.
After evaluated points are ranked (via non-dominated sorting as per Eq. 1), P
points are selected (these points are also called centers) in SOP for neighborhood
candidate search (see Sect. 2.3 below). Selection of center points according to
Eq. 1 is essentially based on the exploration-exploitation trade-off [6], where
F1 (x) implies exploitation (since we will subsequently perform candidate search
around better solutions found so far) and F2 (x) implies exploration.
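A minimal sketch of this ranking step, assuming the evaluated points are stored as a NumPy array X with objective values y (a simple O(n²) non-dominated sort is used purely for illustration; SOP itself relies on the sorting procedure of [1]):

import numpy as np

def center_ranks(X, y):
    """Rank evaluated points by non-dominated sorting on the two criteria of
    Eq. (1): F1 = objective value, F2 = negative distance to the nearest
    other evaluated point."""
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    F = np.column_stack([y, -dists.min(axis=1)])       # (F1, F2) per point
    ranks = np.zeros(n, dtype=int)
    remaining = set(range(n))
    front = 0
    while remaining:
        # current front: points not dominated by any other remaining point
        current = [i for i in remaining
                   if not any(np.all(F[j] <= F[i]) and np.any(F[j] < F[i])
                              for j in remaining if j != i)]
        for i in current:
            ranks[i] = front
        remaining -= set(current)
        front += 1
    return ranks  # centers are drawn from the lowest ranks (best fronts)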

2.3 Candidate Search - DYCORS

After P center points are selected, the SOP algorithm proceeds by generating
P (large) sets of candidate points. One candidate set is generated around each
center i by randomly perturbing a subset of the decision variables of center i.
This candidate generation mechanism is also referred to as Dynamic Coordinate
Search (DYCORS), which was first proposed in [11]. Subsequently, one point,
namely the point with the best surrogate approximation value, is selected for
expensive evaluation from each candidate set. Hence, SOP proposes P new points
(step 5 in Algorithm 1) for simultaneous evaluation in each iteration.
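The Python sketch below illustrates this DYCORS-style candidate generation and selection for a single center; the perturbation probability, step size, and the surrogate callable are illustrative placeholders, not the exact settings of [11].

import numpy as np

def dycors_candidates(center, lb, ub, surrogate, n_cand=500,
                      perturb_prob=0.2, sigma=0.2, rng=None):
    """Generate candidates around one center by perturbing a random subset of
    coordinates (DYCORS-style) and return the candidate with the best
    (lowest) surrogate value."""
    rng = np.random.default_rng(rng)
    center, lb, ub = map(np.asarray, (center, lb, ub))
    d = len(center)
    cands = np.tile(center, (n_cand, 1))
    # each coordinate is perturbed with probability perturb_prob (at least one per candidate)
    mask = rng.random((n_cand, d)) < perturb_prob
    no_perturb = ~mask.any(axis=1)
    mask[no_perturb, rng.integers(0, d, size=no_perturb.sum())] = True
    steps = rng.normal(0.0, sigma * (ub - lb), size=(n_cand, d))
    cands = np.clip(cands + mask * steps, lb, ub)
    scores = np.array([surrogate(c) for c in cands])
    return cands[np.argmin(scores)]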

3 SOP-Hybrid Framework

In this section we propose SOP-Hybrid, which is designed to be more efficient


for a higher number of parallel (synchronous) evaluations. Krityakierne et al. [7]
show that SOP scales well on test problems with up to 32 centers/evaluations
per synchronous iteration. However, it may be possible to further scale the par-
allelization of SOP by introducing a strategy for proposing multiple points for
expensive evaluation around each center point. We propose the SOP-Hybrid
framework to explore this idea.
The SOP-Hybrid framework is depicted in the flowchart of Fig. 1. As is evident
from Fig. 1, N · P new points are proposed for expensive evaluation in each
iteration of SOP-Hybrid. Each iteration of SOP-Hybrid starts by updating
the RBF surrogate with the N · P points evaluated most recently. Subsequently,
P centers are selected (as in SOP), and N new points per center are selected
for simultaneous evaluation.

3.1 Selection - Acquisition Functions

SOP-Hybrid uses the weighted distance acquisition function introduced in [9] to


select N points, per center, for evaluation, from amongst the candidate points
generated around the corresponding center point. For a given weight W ∈ [0, 1],
the weighted distance merit is defined as:

Fig. 1. General framework (synchronous parallel) of the SOP-Hybrid algorithm.

W · V^s(x) + (1 − W) · V^D(x),    (2)

where V^s(x) = (s(x) − s_min)/(s_max − s_min) is the normalized surrogate score over
all candidates and V^D(x) = (Δ(x) − Δ_min)/(Δ_max − Δ_min) is the normalized
distance score. Δ(x) is the distance
between x and the point in the set of evaluated points that is closest to x. SOP-
Hybrid thus tries to balance the exploration-exploitation trade-
off at two levels. At the global level, the exploration-exploitation trade-off is
maintained via Eq. 1 and the center selection methodology proposed in SOP
by [7]. At the local neighborhood level, the exploration-exploitation trade-off
is maintained via the weighted distance acquisition function (defined in Eq. 2)
proposed by [9,10].
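A minimal sketch of this per-center selection, under the assumptions that lower surrogate values are better and that the N points are picked greedily (distances are recomputed after each pick so later picks stay spread out). Note that the distance score in this sketch uses the form (Δ_max − Δ(x))/(Δ_max − Δ_min) from [9], so that minimizing the merit favours candidates far from already-evaluated points; the function and variable names are illustrative, not pySOT's API.

import numpy as np

def weighted_distance_select(cands, surrogate_vals, evaluated, n_select, weight=0.5):
    """Greedily pick n_select candidates around one center by minimizing a
    weighted-distance merit combining surrogate quality and isolation."""
    evaluated = [np.asarray(e, float) for e in evaluated]
    cands = [np.asarray(c, float) for c in cands]
    s = np.asarray(surrogate_vals, float)
    v_s = (s - s.min()) / (s.max() - s.min() + 1e-12)     # normalized surrogate score
    remaining = list(range(len(cands)))
    chosen = []
    for _ in range(n_select):
        d = np.array([min(np.linalg.norm(cands[i] - e) for e in evaluated)
                      for i in remaining])
        v_d = (d.max() - d) / (d.max() - d.min() + 1e-12)  # far candidates get low score
        merit = weight * v_s[remaining] + (1 - weight) * v_d
        best = remaining[int(np.argmin(merit))]
        chosen.append(cands[best])
        evaluated.append(cands[best])   # count the pick as evaluated for later picks
        remaining.remove(best)
    return chosen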

4 Computer Experiments
4.1 Experimental Setup

Algorithm Implementation - pySOT Toolbox SOP and SOP-Hybrid are


implemented in the pySOT optimization toolbox [2].

Algorithm Configuration and Evaluation Budget The purpose of our
computer experiments is to assess whether the modifications to SOP proposed
in this study to create SOP-Hybrid lead to an improvement in algorithm
efficiency, especially when many points are proposed for synchronous evaluation
in each iteration. Consequently, we run both SOP and SOP-Hybrid with

48 evaluations per iteration each. In case of SOP-Hybrid, we fix the number of


centers (i.e., P - see Fig. 1) to 4 and the number of points selected per center
(i.e., N) to 12. Moreover, both SOP and SOP-Hybrid are run for 60 algorithm
iterations, and start with 22 initial points each. Also, since both algorithms
are stochastic, we run 30 trials each, for each algorithm on each test problem.
Parallel performance of SOP was compared against other surrogate algorithms
in [7], with SOP performing better with up to 32 simultaneous evaluations, on
numerous test problems. Hence, we have not compared SOP-Hybrid to other
synchronous surrogate algorithms.

Test Problems SOP and SOP-Hybrid are tested on six noiseless BBOB bench-
mark [3] problems (F15, F16, F19, F20, F23, F24). These test problems are
chosen since they are highly multi-modal and are frequently used in global
optimization competitions. Moreover, F20, F23 and F24 have weak global structures
and hence pose an additional optimization challenge. The number of decision
variables for all test problems is set to ten.

Fig. 2. Progress curves (low curves are better) of SOP and SOP-Hybrid with 48 syn-
chronous simultaneous evaluations, for Problems F15 (left sub-plot) and F16 (right
sub-plot)

Progress Curves Performance of SOP and SOP-Hybrid (with 48 points eval-


uated in each iteration) is compared via progress curves. Each progress curve
plots the difference (i.e., absolute error) between the best objective value (fbest )
obtained by an algorithm so far (i.e., after the number of evaluations depicted
on the horizontal axis) and the global optimum (fopt ) (averaged over multiple
trials), against the number of function evaluations. Consequently, lower progress
curves are best. Moreover, the absolute difference, i.e., |fbest − fopt | is plotted on
a log-scale in all progress curves plots, to easily visualize the difference between
progress efficiency of both algorithms.
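As an illustration, a progress curve for one algorithm could be assembled from the running best objective value of each trial as in the short sketch below (matplotlib is assumed to be available, all trials are assumed to have the same number of completed evaluations, and the array names are illustrative):

import numpy as np
import matplotlib.pyplot as plt

def plot_progress(eval_histories, f_opt, label):
    """eval_histories: list of per-trial arrays of objective values, in the
    order the evaluations were completed. Plots mean |f_best - f_opt| on a
    log scale against the number of function evaluations."""
    errors = []
    for hist in eval_histories:
        best_so_far = np.minimum.accumulate(np.asarray(hist, dtype=float))
        errors.append(np.abs(best_so_far - f_opt))
    mean_error = np.mean(errors, axis=0)          # average over trials
    plt.semilogy(np.arange(1, len(mean_error) + 1), mean_error, label=label)
    plt.xlabel("Number of function evaluations")
    plt.ylabel("|f_best - f_opt|")
    plt.legend()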

4.2 Results
The progress curves of SOP and SOP-Hybrid for test problems F15 and F16 are
illustrated in Fig. 2. Results of Fig. 2 indicate that both SOP and SOP-Hybrid
have comparable performance on Problems F15 and F16. However, performance
of SOP-Hybrid is slightly better.

Fig. 3. Progress curves (low curves are better) of SOP and SOP-Hybrid with 48 syn-
chronous simultaneous evaluations, for Problems F19 (left sub-plot) and F20 (right
sub-plot)

The difference in performance of SOP and SOP-Hybrid is more evident in


the results of problems F19 and F20, as visualized in Fig. 3. It is evident in Fig. 3
that SOP-Hybrid converges faster than SOP for both F19 and F20, after 1000
function evaluations have elapsed. Moreover, SOP-Hybrid attains the best solu-
tion (averaged over multiple trials) obtained by SOP after approximately 2700
function evaluations, within approximately 1800 function evaluations, i.e., only
within 70% evaluations required by SOP. This implies that SOP-Hybrid’s strat-
egy of proposing multiple points around each center via balancing exploration
and exploitation within the candidate search in the neighborhood of a center,
works well when many evaluations can be proposed in parallel.
Comparative results on Problems F23 and F24 (see Fig. 4) are similar to
the results observed for problems F19 and F20. SOP-Hybrid has a faster rate
of convergence than SOP, when 48 points are proposed for evaluation in each
algorithm iteration. Overall, performance of SOP-Hybrid is better than SOP,
when many points can be proposed for synchronous parallel evaluation in each
algorithm iteration.

5 Conclusion
SOP-Hybrid is an extension of the synchronous parallel SOP algorithm, that is
designed for computationally expensive continuous black-box functions, and par-
allel frameworks where computational resources are available for simultaneously

Fig. 4. Progress curves (low curves are better) of SOP and SOP-Hybrid with 48 syn-
chronous simultaneous evaluations, for Problems F23 (left sub-plot) and F24 (right
sub-plot)

evaluating many (more than 40) expensive points in each algorithm iteration.
SOP-Hybrid incorporates the weighted distance acquisition function proposed in
[9,10], into the SOP framework, to balance the exploration-exploitation trade-off
both at the global level (via center selection) and the local level (via DYCORS
candidate search).
Results, with 48 points proposed for synchronous evaluations, show that
SOP-Hybrid is more efficient than the baseline SOP algorithm. In future, we
wish to extend our comparative analysis of SOP and SOP-Hybrid, by testing
both algorithms on test problems with up to 200 simultaneous evaluations per
iteration, and on higher dimensional problems. Moreover, we intend to compare
both algorithms on some real simulation-optimization applications.

References
1. Deb, K., Kalyanmoy, D.: Multi-Objective Optimization Using Evolutionary Algo-
rithms. Wiley, New York (2001)
2. Eriksson, D., Bindel, D., Shoemaker, C.: Surrogate optimization toolbox (pysot).
https://github.com/dme65/pySOT (2015)
3. Hansen, N., Finck, S., Ros, R., Auger, A.: Real-parameter black-box optimization
benchmarking 2009: Noiseless functions definitions. Technical Report RR-6829,
INRIA (2009)
4. Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization
for general algorithm configuration. In: Proceedings of the 5th International Con-
ference on Learning and Intelligent Optimization, pp. 507–523. Springer-Verlag,
Heidelberg (2011)
5. Ilievski, I., Akhtar, T., Feng, J., Shoemaker, C.: Efficient hyperparameter optimiza-
tion for deep learning algorithms using deterministic RBF surrogates. In: AAAI
Conference on Artificial Intelligence (2017)
6. Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive
black-box functions. J. Global Optim. 13(4), 455–492 (1998)

7. Krityakierne, T., Akhtar, T., Shoemaker, C.A.: SOP: parallel surrogate global opti-
mization with pareto center selection for computationally expensive single objective
problems. J. Global Optim. 66(3), 417–437 (2016)
8. Pintér, J.D.: Global Optimization in Action. Springer, New York (1996)
9. Regis, R.G., Shoemaker, C.A.: A stochastic radial basis function method for the
global optimization of expensive functions. INFORMS J. Comput. 19(4), 497–509
(2007)
10. Regis, R.G., Shoemaker, C.A.: Parallel stochastic global optimization using radial
basis functions. INFORMS J. Comput. 21(3), 411–426 (2009)
11. Regis, R.G., Shoemaker, C.A.: Combining radial basis function surrogates and dynamic
coordinate search in high-dimensional expensive black-box optimization. Eng.
Optim. 45(5), 529–555 (2013)
12. Sergeyev, Y.D., Kvasov, D.E., Mukhametzhanov, M.S.: On the efficiency of nature-
inspired metaheuristics in expensive global optimization with limited budget. Sci.
Rep. 8, 453 (2018)
13. Sergeyev, Y.D., Kvasov, D.E.: Deterministic Global Optimization: An Introduction
to the Diagonal Approach, 1st edn. Springer, New York (2017)
14. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine
learning algorithms. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q.
(eds.) Advances in Neural Information Processing Systems, vol. 25, pp. 2951–2959.
Curran Associates, Inc. (2012)
Surrogate Many Objective Optimization:
Combining Evolutionary Search,
ε-Dominance and Connected Restarts

Taimoor Akhtar1(B) , Christine A. Shoemaker2,3 , and Wenyu Wang2


1
Environmental Research Institute, National University of Singapore,
Singapore, Singapore
erita@nus.edu.sg
2
Department of Industrial Systems Engineering and Management,
National University of Singapore, Singapore, Singapore
wenyu wang@u.nus.edu
3
Department of Civil and Environmental Engineering,
National University of Singapore, Singapore, Singapore

Abstract. Scaling multi-objective optimization (MOO) algorithms to


handle many objectives is a significant computational challenge. This
challenge is exacerbated when the underlying objectives are computation-
ally expensive, and solutions are desired within a limited number of
expensive objective evaluations. A surrogate model-based optimization
framework can be effective in MOO. However, most prior model-based
algorithms are effective for 2–3 objectives. This study investigates the
combined use of ε-dominance, connected restarts and evolutionary search
for efficient Many-objective optimization (MaOO). We built upon an
existing surrogate-based evolutionary algorithm, GOMORS, and propose
ε-GOMORS, i.e., a surrogate-based iterative evolutionary algorithm that
combines Radial Basis Functions and ε-dominance-based evolutionary
search, to propose new points for expensive evaluations in each algorithm
iteration. Moreover, a novel connected restart mechanism is introduced
to ensure that the optimization search does not get stuck in locally opti-
mum fronts. ε-GOMORS is applied to a few benchmark multi-objective
problems and a watershed calibration problem, and compared against
GOMORS, ParEGO, NSGA-III, Borg, ε-NSGA-II and MOEA/D on a
limited budget of 1000 evaluations. Results indicate that ε-GOMORS
converges more quickly than other algorithms and the variance of its
performance across multiple trials, is also less than other algorithms.

Keywords: Expensive optimization · Many objectives · Meta-models

This work was partially supported by the Singapore National Research Foundation,
Prime Minister’s Office, Singapore under its Campus for Research Excellence and Tech-
nological Enterprise (CREATE) programme (E2S2-CREATE project CS-B) and by
Prof. Shoemaker’s NUS startup grant.
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 681–690, 2020.
https://doi.org/10.1007/978-3-030-21803-4_68

1 Introduction
Many real-world optimization problems are multi-objective, where evaluation
of objectives is computationally expensive. Multi-Objective Evolutionary Algo-
rithms (MOEAs) are extremely popular for solving computationally expensive
multi-objective problems, since their population-based structure allows MOEAs
to converge to the Pareto front and simultaneously find diverse trade-off solu-
tions [5].
Despite their inherent capability of simultaneously pursuing convergence and
diversity, MOEAs may still require many expensive simulations to find suitable
trade-off solutions, especially for Many-objective Optimization (MaOO) prob-
lems [1,2]. Iterative use of surrogate models in optimization can significantly
reduce computational effort for expensive MO problems.
The taxonomy of iterative surrogate multi-objective optimization is discussed
in [10]. Many iterative surrogate algorithms have been proposed in past literature
for expensive multi-objective optimization, and are dominated by methods that
either use Gaussian Processes (GP) [6,7,11] or Radial Basis Functions (RBFs)
[1,14] as surrogates. However, most surrogate methods introduced in the past
are only designed for and tested on problems with up to 3 objectives.
Since many real-world optimization applications can have many objectives
(more than three), this study proposes ε-GOMORS, an extension of the
GOMORS algorithm [1] that is designed to handle many objectives. GOMORS
is an iterative surrogate MO algorithm that uses RBFs as surrogates and
performs better than the GP-based ParEGO [11] on a limited evaluation budget,
especially on problems with more than 10 decision variables.
ε-GOMORS replaces the use of non-dominance archiving in GOMORS by the
ε-non-dominance archiving introduced in [13]. This ε-non-dominance archiving
mechanism is a computationally efficient alternative to non-dominance archiving
and has been used in some non-surrogate MOEAs to improve algorithm run-time
efficiency and scale performance on many-objective problems [9,12].
An additional challenge associated with multi-objective algorithms is that
they can get stuck in locally optimum solutions and fronts, especially for problems
with multi-modal objectives. Restart mechanisms have been used to alleviate
this challenge in the past, especially in MOEAs [9,12] and single-objective
surrogate algorithms [15,16]. ε-GOMORS also incorporates a novel restart
mechanism to ensure that the algorithm does not get stuck in locally optimum fronts.

2 The ε-GOMORS Algorithm


2.1 The Iterative Surrogate Optimization Framework
The general framework of iterative Multi-Objective optimization with surro-
gates, as defined in [10], has three core components within the iterative loop
(i.e., after algorithm initialization), namely, (i) methodology for fitting surrogate
model(s), (ii) generating candidate solutions using surrogates and (iii) selecting
points for expensive evaluations from candidate solutions.

2.2 The GOMORS Algorithm


The ε-GOMORS algorithm introduced in this study is an extension of the surrogate
multi-objective algorithm GOMORS [1]. GOMORS follows the iterative surrogate
framework discussed in Sect. 2.1, and uses Radial Basis Functions (RBFs)
as surrogates. One RBF surrogate is fitted for each expensive objective. Hence,
assuming that F(x) = [f1(x), . . . , fk(x)] is the set of k expensive objectives being
solved in our MOO, Fm(x) = [fm,1, . . . , fm,k] is the set of inexpensive surrogate
functions fitted on the m points expensively evaluated so far.
During the iterative loop of GOMORS, two auxiliary problems are solved, i.e.,
the Global surrogate problem (defined in Eq. 1) and the Gap optimization
problem (defined in Eq. 2). Equations 1 and 2 define the two auxiliary problems,
where xL and xU are the lower and upper bounds of the original MOO problem
being solved, and xcrowd is the least crowded (as per crowding distance [5])
evaluated solution amongst the non-dominated evaluated solutions. r is a vector that
defines the neighborhood of xcrowd. Hence, the Gap optimization problem is
a search in the neighborhood of xcrowd.

minimize: Fm(x) = [fm,1(x), . . . , fm,k(x)]^T
subject to: xL ≤ x ≤ xU    (1)

minimize: Fm(x) = [fm,1(x), . . . , fm,k(x)]
subject to: (xcrowd − r) ≤ x ≤ (xcrowd + r)    (2)
NSGA-II is used in GOMORS, as the embedded algorithm for solving the
auxiliary problems of Eqs. 1 and 2. Moreover, for solving the Global surrogate
problem of Eq. 1, the non-dominated archive (of expensively evaluated points)
is injected into the initial population of NSGA-II. Two candidate populations are
consequently generated. Let PA be the final population generated after solving
the Global surrogate problem and let PB be the final population generated
after solving the Gap optimization problem. GOMORS then uses multiple
rules [1] to select multiple new points, from PA and PB , for expensive evaluations
in each algorithm iteration.
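A minimal sketch of setting up these two auxiliary searches, assuming one surrogate per objective and a generic multi-objective evolutionary solver; the helpers fit_rbf, moea_solve, and least_crowded are illustrative placeholders, not GOMORS' actual implementation.

import numpy as np

def auxiliary_searches(X_eval, Y_eval, lb, ub, r, fit_rbf, moea_solve, least_crowded):
    """Sketch of GOMORS' two auxiliary problems: a global search over the
    surrogate objectives (Eq. 1) and a local 'gap' search around the least
    crowded non-dominated solution (Eq. 2)."""
    # One inexpensive RBF surrogate per expensive objective.
    surrogates = [fit_rbf(X_eval, Y_eval[:, j]) for j in range(Y_eval.shape[1])]

    def surrogate_objs(x):
        return np.array([s(x) for s in surrogates])

    # Global surrogate problem (Eq. 1): search the full box [lb, ub].
    P_A = moea_solve(surrogate_objs, lb, ub)

    # Gap optimization problem (Eq. 2): search the neighborhood of x_crowd.
    x_crowd = least_crowded(X_eval, Y_eval)
    P_B = moea_solve(surrogate_objs, x_crowd - r, x_crowd + r)
    return P_A, P_B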

2.3 The ε-GOMORS Framework

Figure 1 provides an overview of the iterative framework of ε-GOMORS. As
depicted in Fig. 1, the algorithmic framework of ε-GOMORS is very similar to
GOMORS. Each iterative loop of ε-GOMORS starts with fitting of RBF surrogates
(one on each objective). The algorithm continues by independently solving the
two auxiliary problems discussed in Sect. 2.2. Multiple points are then selected
for expensive evaluations from the two populations, i.e., PA and PB.

2.4 ε-Non-Dominance Archiving and ε-NSGA-II

In order to handle problems with many objectives, ε-GOMORS maintains an
ε-non-dominance archive (introduced by [13]) instead of a non-dominance

archive (maintained in GOMORS).

Fig. 1. General algorithm framework of ε-GOMORS.

Let F(x) = [f1(x), . . . , fk(x)] be the k


expensive objectives being solved in our MOO, where x ∈ D:

Definition 1. An objective vector y = [y1 , . . . , yk ] dominates (i.e., y ≺ z)


another vector z = [z1 , . . . , zk ] if and only if yi ≤ zi for all 1 ≤ i ≤ k, and yj < zj
for some 1 ≤ j ≤ k.

Definition 2. Given a set of solutions S = {x | x ∈ D}, a subset (archive) of
solutions S† ⊂ S is non-dominated in S if there does not exist a solution x ∈ S
which dominates some x† ∈ S†, i.e., S† = {x† ∈ S | ∄ x ∈ S : F(x) ≺ F(x†)}.

Definition 3. Given ε > 0, an objective vector y = [y1, . . . , yk] ε-box dominates
(i.e., y ≺_ε z) another vector z = [z1, . . . , zk] if and only if (i) ⌊y⌋_ε ≺ ⌊z⌋_ε, OR
(ii) ⌊y⌋_ε = ⌊z⌋_ε and y is closer than z to the lower left corner of their common
box; ⌊·⌋ is the floor function.

Definition 4. Given a set of evaluated solutions S = {x | x ∈ D}, a subset
(archive) S∗ ⊂ S is ε-non-dominated in S if there does not exist a solution
x ∈ S which ε-box dominates some x∗ ∈ S∗, i.e., S∗ = {x∗ ∈ S | ∄ x ∈ S : F(x) ≺_ε F(x∗)}.

The ε-box dominance concept used in ε-non-dominance archiving (see Definitions
3 and 4) essentially divides the objective space into hyperboxes with
box dimensions defined by the vector ε = [ε1, . . . , εk]. Each objective solution
y resides in a hyperbox, whose lower left corner (also called the box-value),
denoted by ⌊y⌋_ε = [⌊y1⌋_ε1, . . . , ⌊yk⌋_εk], represents the objective value of
that box. ⌊a⌋ is the greatest integer less than or equal to a, i.e., ⌊·⌋ is the floor
function. Consequently, an objective vector y ε-box dominates a vector z if (i) the
box-value of y dominates the box-value of z, or (ii) both y and z are in the
same box (i.e., have the same box-value) but y is closer than z to the lower left
corner of the box. The vector ε = [ε1, . . . , εk] is user-defined.
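A small Python sketch of this comparison, assuming the box-value of y is computed componentwise as floor(y_i / ε_i) and that ties within a box are broken by Euclidean distance to the box's lower-left corner (one common convention; the exact implementation in [13] may differ):

import numpy as np

def eps_box_dominates(y, z, eps):
    """Return True if objective vector y epsilon-box dominates z."""
    y, z, eps = np.asarray(y, float), np.asarray(z, float), np.asarray(eps, float)
    by, bz = np.floor(y / eps), np.floor(z / eps)       # box-values
    if np.all(by <= bz) and np.any(by < bz):            # (i) box of y dominates box of z
        return True
    if np.array_equal(by, bz):                          # (ii) same box: closer to corner wins
        return np.linalg.norm(y - eps * by) < np.linalg.norm(z - eps * bz)
    return False

# Example: with eps = [0.1, 0.1], (0.32, 0.41) eps-box dominates (0.38, 0.47).
print(eps_box_dominates([0.32, 0.41], [0.38, 0.47], [0.1, 0.1]))  # True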
A core advantage of ε-non-dominance archiving is that it is computationally
efficient for many objectives and is hence more feasible than non-dominance
archiving [9,12]. If the expensive optimization problem has many objectives, the
auxiliary problems of Eqs. 1 and 2 will also have many objectives. Hence, solving
the auxiliary problems requires an algorithm that is suitable for many-objective
problems. ε-GOMORS thus uses ε-NSGA-II [12] as the auxiliary solver instead
of NSGA-II (see Fig. 1). Moreover, for the Gap problem of Eq. 2, a solution is
randomly selected from the ε-non-dominance archive as xcrowd.

2.5 Connected Restarts


The ε-GOMORS algorithm also incorporates a novel restart mechanism (see
Fig. 1). Restarts have been widely used in optimization algorithms in the past
to re-initialize the optimization search if it stagnates (e.g., if the algorithm gets
stuck in a local optimum) [9,12,15].
The restart mechanism of ε-GOMORS has two levels. The first restart level is
triggered if the ε-non-dominance archive does not “improve” for a few algorithm
iterations. Improvement is assessed via the ε-progress metric introduced in [9]. At
the first restart level, the algorithm restarts with a new Symmetric Latin Hypercube
(LHS) design plus the ε-non-dominated solutions from the previous start.
The first restart level is called ‘connected restart’, and its purpose
is to simultaneously inject new random solutions into the search (exploration)
and retain the best solutions found so far (exploitation and elitism). After a few
‘connected restarts’ are registered, the second restart level is invoked. At this
level, the algorithm restarts with only a new random initial design (Symmetric
LHS). Hence, this restart level is called ‘independent restart’.
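A compact sketch of the two restart ingredients, assuming box-values are stored as hashable tuples and the Symmetric LHS generator is supplied as a callable; the function names and the idea of counting newly occupied ε-boxes as ε-progress follow [9] only loosely and are illustrative.

import numpy as np

def eps_progress(boxes_before, boxes_after):
    """Number of newly occupied eps-boxes in the archive after an iteration;
    zero new boxes for several iterations signals stagnation."""
    return len(set(boxes_after) - set(boxes_before))

def next_start_population(archive, slhs_design, level):
    """Assemble the population after a restart: a 'connected' restart combines
    a fresh Symmetric LHS design with the current eps-non-dominated archive,
    whereas an 'independent' restart uses the fresh design only."""
    new_points = np.asarray(slhs_design())
    if level == "connected":
        return np.vstack([np.asarray(archive), new_points])
    return new_points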

3 Experiments and Results


3.1 Experimental Setup
Test Problems ε-GOMORS is designed to be efficient for many-objective problems.
Hence, we test its performance on two widely used test problems, namely
DTLZ2 and DTLZ4, that are scalable in the number of objectives and were
proposed in [4]. An additional challenge associated with DTLZ4 is that it has a
non-uniform distribution of points on the Pareto front. We compare the performance
of ε-GOMORS and other algorithms designed for many-objective optimization
on DTLZ2 and DTLZ4 with 2, 4 and 6 objectives. The number of decision
variables for both test problems is set to nobj + 9 (as per the recommendations
given in [4]), where nobj is the number of objectives.

Cannonsville Watershed Calibration Problem We also test ε-GOMORS


on Multi-Objective calibration of the SWAT Cannonsville watershed model
developed by Tolson and Shoemaker [18]. We calibrate 15 hydrologic parameters
of the Cannonsville watershed model in this study by formulating the calibration
as a bi-objective global optimization problem (unconstrained). The model is cal-
ibrated on a 10 year historical flow time-series data set (obtained from United

States Geological Survey (USGS) Station 01421618), and the two calibration
objectives represent different errors between simulated and observed data. Run-
ning time of a 10-year simulation of the Cannonsville model is around 1 min. The
watershed calibration problem is called ‘CFLOW’ in subsequent discussions.

Alternate Algorithms Performance of ε-GOMORS is compared against
numerous surrogate-based and non-surrogate (mostly evolutionary) algorithms.
For the scalable DTLZ problems [4], performance of ε-GOMORS is compared
against two MOEAs designed for many-objective optimization, ε-NSGA-II
[12] and NSGA-III [3]. For the watershed calibration problem we compare
ε-GOMORS with the surrogate algorithms ParEGO [11] and GOMORS [1] and
the evolutionary algorithms MOEA/D [19] and Borg [9] (these algorithms have
been applied to water resources problems in the past).

Performance Assessment Methods This study focuses on multi-objective


optimization of computationally expensive functions. Hence we limit all opti-
mization experiments to a budget of 1000 function evaluations. Moreover, since
all algorithms compared in this study are stochastic, multiple optimization tri-
als (10 trials) are run for each algorithm on each test problem. Hypervolume
coverage, Hc , is used as the metric for assessing multi-objective performance of
an algorithm. The Hypervolume coverage is defined as follows:

Hc(P) = [Hv(P) − Hv(Pinit)] / [Hv(P∗) − Hv(Pinit)]    (3)
Let P be the set of non-dominated solutions obtained by an algorithm and let
P ∗ be the Pareto front of the multi-objective problem being solved. Moreover, let
Hv (A) be the Hypervolume [6] of the objective space dominated by an arbitrary
set A. Consequently, Hc (P ) is the proportion of total feasible objective space
(after subtracting the space dominated by initial solutions, i.e., Pinit ) dominated
by P. Higher values of Hc are desirable and the ideal value is 1.
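For instance, given hypervolume values computed by any standard routine (assumed available; the helper name below is a placeholder and the numbers are purely illustrative), the coverage of Eq. (3) reduces to a one-line normalization:

def hypervolume_coverage(hv_front, hv_init, hv_pareto):
    """Hypervolume coverage Hc of Eq. (3): the fraction of objective space
    dominated by the obtained front P, beyond what the initial solutions
    already dominated, relative to the true Pareto front P*."""
    return (hv_front - hv_init) / (hv_pareto - hv_init)

# Illustrative values: Hv(P)=0.72, Hv(P_init)=0.40, Hv(P*)=0.80
print(hypervolume_coverage(0.72, 0.40, 0.80))  # 0.8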

Fig. 2. DTLZ2 Progress Plots: Hypervolume coverage progress curves (averaged over
multiple trials) of all algorithms for DTLZ2 [4], with 2, 4 and 6 objectives. Each subplot
corresponds to a Hypervolume progress plot (higher values are better) comparison for
a fixed number of objectives (depicted in subplot title).

3.2 Results
Progress Curves - Many Objective Test Problems Results for the two
scalable test problems, DTLZ2 and DTLZ4 [4] are compared by plotting the
Hypervolume coverage (Hc ) obtained by an algorithm against the number of
completed function evaluations. We call these plots progress curves in subsequent
discussions. Figures 2 and 3 illustrate the progress curves for DTLZ2 and DTLZ4,
respectively. ε-GOMORS is labeled as “eps-GOMORS” and ε-NSGA-II is labeled
as “EpsNSGA2” in Figs. 2 and 3.

Fig. 3. DTLZ4 Progress Plots: Hypervolume coverage progress curves (averaged over
multiple trials) of all algorithms for DTLZ4 [4], with 2, 4 and 6 objectives. Each subplot
corresponds to a Hypervolume progress plot (higher values are better) comparison for
a fixed number of objectives (depicted in subplot title).

Figure 2 compares the progress curves of ε-GOMORS, ε-NSGA-II and NSGA-III
for the DTLZ2 test problem with 2, 4 and 6 objectives (sub-figures A, B and
C, respectively), and with a budget of 1000 function evaluations each. Figure 2
clearly indicates that ε-GOMORS is the fastest converging of all three algorithms,
since the curves for ε-GOMORS are highest for all three DTLZ2 variants.
Results for the DTLZ4 test case (see Fig. 3) are similar, and performance of
ε-GOMORS is significantly better than the other algorithms after 1000 function
evaluations. This is true for all DTLZ4 variants, i.e., the 2-objective, the 4-objective
and the 6-objective case. Overall, results of both test problems indicate
that, on a limited function evaluation budget, performance of ε-GOMORS is
better than ε-NSGA-II and NSGA-III, for 2, 4 and 6 objectives. Hence, our
results indicate that ε-GOMORS is effective for multi-objective optimization
(and also for problems with many objectives), when function evaluations are
expensive and the evaluation budget is limited (less than 1000).

Results - Watershed Calibration Problem The Hypervolume Coverage (Hc)
progress curves of ε-GOMORS, ParEGO, Borg and MOEA/D, for the watershed
calibration problem, i.e., CFLOW (see Sect. 3.1 for problem definition), are
illustrated in Fig. 4. Please note that since the Pareto front (P ∗ in Eq. 3) for

Fig. 4. Watershed Problem Progress Plots: Hypervolume coverage progress curves


(high curves are better) of all algorithms (averaged over multiple trials) for the bi-
objective Cannonsville Watershed calibration problem.

CFLOW is not known, it is estimated by consolidating the non-dominated solutions obtained from all optimization experiments (these include additional trials with more than 1000 evaluations). Figure 4 shows that ε-GOMORS is clearly the most efficient amongst all algorithms compared, for a budget of 1000 function evaluations. Moreover, efficiency of ε-GOMORS is such that ε-GOMORS, in less than 200 function evaluations, achieves the Hypervolume coverage attained by ParEGO (the next best algorithm) after 1000 evaluations. This essentially means that ε-GOMORS is 5 times faster than ParEGO when evaluations are limited to 1000.
A key difference between GOMORS and ε-GOMORS is the connected restarts mechanism that has been introduced in ε-GOMORS. A core purpose of introducing connected restarts in ε-GOMORS is to ensure that the algorithm does not get stuck in locally optimum fronts. Figure 5 provides an illustration of the effect of ε-GOMORS' restart mechanism, by plotting the non-domination fronts of the best and worst solutions (according to Hypervolume coverage) obtained by ε-GOMORS across multiple trials (see Fig. 5-A). Figures 5-B and 5-C plot the best and worst non-dominated fronts of GOMORS and ParEGO, respectively.
Figure 5 illustrates that the difference between the best and worst non-dominated fronts (multiple trials) for ε-GOMORS (see Fig. 5A) is considerably less than the corresponding difference for GOMORS, for the CFLOW calibration problem. Since the objectives for the CFLOW problem are multi-modal [17], the better performance of ε-GOMORS (relative to GOMORS) across multiple optimization trials may be attributed to the restart mechanism introduced in ε-GOMORS. Figure 5 also shows that performance of ε-GOMORS is significantly better than ParEGO (see Fig. 5-C) in terms of converging to the estimated Pareto front.

Fig. 5. Watershed Problem - Non-Dominated Fronts: Visualizations of best and worst Non-Dominated (ND) fronts (lower fronts are better) obtained by (A) eps-GOMORS, (B) GOMORS and (C) ParEGO, for CFLOW watershed problem, after 1000 evaluations; compared against estimated true front.

4 Conclusion
ε-GOMORS is a novel extension of the surrogate MO algorithm GOMORS [1] that incorporates ε-dominance and connected restarts, to handle many-objective optimization problems. A restart mechanism is introduced in ε-GOMORS to ensure that the algorithm does not get stuck in locally optimum trade-off solutions.
Results of ε-GOMORS on two many-objective test problems are promising, indicating that the ε-dominance concept allows the algorithm to scale well for up to six objectives. Moreover, results on many-objective test problems also show that ε-GOMORS is more efficient than other state-of-the-art many-objective evolutionary (non-surrogate) algorithms, ε-NSGA-II and NSGA-III, when the evaluation budget is limited to 1000.
When applied to a watershed calibration problem, ε-GOMORS is more reliable than GOMORS (i.e., the variance of ε-GOMORS' performance across multiple optimization trials is less), and considerably more efficient than other surrogate (e.g., the Gaussian Process-based ParEGO) and non-surrogate algorithms it is compared against. In future, we intend to test ε-GOMORS with different surrogate taxonomies [2] to further improve efficiency for surrogate many-objective optimization. A Python implementation of ε-GOMORS is available upon request, and will be made available online in future, as part of the pySOT toolbox [8].

References
1. Akhtar, T., Shoemaker, C.A.: Multi objective optimization of computationally
expensive multi-modal functions with RBF surrogates and multi-rule selection.
J. Global Optim. 64(1), 17–32 (2016)
2. Deb, K., Hussein, R., Roy, P.C., Toscano, G.: A taxonomy for metamodeling frame-
works for evolutionary multi-objective optimization. IEEE Trans. Evol. Comput.
1–1 (2018)

3. Deb, K., Jain, H.: An evolutionary many-objective optimization algorithm using


reference-point-based nondominated sorting approach, part i: solving problems
with box constraints. IEEE Trans. Evol. Comput. 18(4), 577–601 (2014)
4. Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable multi-objective optimiza-
tion test problems. In: Proceedings of the 2002 Congress on Evolutionary Compu-
tation, CEC 2002 (Cat. No.02TH8600), vol. 1, pp. 825–830, May 2002
5. Deb, K., Kalyanmoy, D.: Multi-Objective Optimization Using Evolutionary Algo-
rithms, 1 edn. Wiley (2001)
6. Emmerich, M.T.M., Deutz, A.H., Klinkenberg, J.W.: Hypervolume-based expected
improvement: monotonicity properties and exact computation. In: 2011 IEEE
Congress of Evolutionary Computation (CEC), pp. 2147–2154, June 2011
7. Emmerich, M., Yang, K., Deutz, A., Wang, H., Fonseca, C.M.: A Multicriteria Gen-
eralization of Bayesian Global Optimization, pp. 229–242. Springer International
Publishing, Cham (2016)
8. Eriksson, D., Bindel, D., Shoemaker, C.: Surrogate optimization toolbox (pysot).
https://github.com/dme65/pySOT (2015)
9. Hadka, D., Reed, P.: Borg: an auto-adaptive many-objective evolutionary comput-
ing framework. Evol. Comput. 21(2), 231–259 (2013)
10. Horn, D., Wagner, T., Biermann, D., Weihs, C., Bischl, B.: Model-Based Multi-
Objective Optimization: Taxonomy, Multi-point Proposal, Toolbox and Bench-
mark, pp. 64–78. Springer International Publishing, Cham (2015)
11. Knowles, J.: ParEGO: a hybrid algorithm with on-line landscape approximation
for expensive multiobjective optimization problems. IEEE Trans. Evol. Comput.
8(5), 1341–66 (2006)
12. Kollat, J.B., Reed, P.M.: A computational scaling analysis of multiobjective evolu-
tionary algorithms in long-term groundwater monitoring applications. Adv. Water
Resour. 30(3), 335–353 (2007)
13. Laumanns, M., Thiele, L., Deb, K., Zitzler, E.: Combining convergence and diver-
sity in evolutionary multiobjective optimization. Evol. Comput. 10(3), 263–282
(2002)
14. Mueller, J.: Socemo: Surrogate optimization of computationally expensive multi-
objective problems. INFORMS J. Comput. 29(4), 581–596 (2017)
15. Regis, R.G., Shoemaker, C.A.: A stochastic radial basis function method for the
global optimization of expensive functions. INFORMS J. Comput. 19(4), 497–509
(2007)
16. Regis, R.G., Shoemaker, C.A.: Combining radial basis function surrogates dynamic
coordinate search in high dimensional expensive black-box optimization. Eng.
Optim. 45(5), 529–555 (2013)
17. Shoemaker, C.A., Regis, R.G., Fleming, R.C.: Watershed calibration using multi-
start local optimization and evolutionary optimization with radial basis function
approximation. Hydrol. Sci. J. 52(3), 450–465 (2007)
18. Tolson, B., Shoemaker, C.: Cannonsville reservoir watershed SWAT2000 model
development, calibration and validation. J. Hydrol. 337, 68–86 (2007)
19. Zhang, Q., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on
decomposition. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
Tropical Analogues of a Dempe-Franke
Bilevel Optimization Problem

Sergeı̆ Sergeev1(B) and Zhengliang Liu2


1 School of Mathematics, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
s.sergeev@bham.ac.uk
2 Queen Mary University of London, London E1 4NS, UK
zliu082@gmail.com

Abstract. We consider the tropical analogues of a particular bilevel


optimization problem studied by Dempe and Franke [4] and suggest some
methods of solving these new tropical bilevel optimization problems. In
particular, it is found that the algorithm developed by Dempe and Franke
can be formulated and its validity can be proved in a more general setting,
which includes the tropical bilevel optimization problems in question. We
also show how the feasible set can be decomposed into a finite number
of tropical polyhedra, to which the tropical linear programming solvers
can be applied.

Keywords: Tropical · Max-plus · Bilevel optimization

1 Introduction
Bilevel programming problems are hierarchical optimization problems with two
levels, each of which is an optimization problem itself. The upper level problem
models the leader’s decision making problem whereas the lower level problem
models the follower’s problem. These two problems are coupled through common
variables.
Consider a particular problem formulated by Dempe and Franke [4]:

    min_{x,y}  a^T x + b^T y
    s.t.  x ∈ P_1,  y ∈ P_2,                                                  (1)
          y ∈ arg min_{y′} { x^T y′ : y′ ∈ P_2 }.

Here P1 and P2 are polyhedra in Rn , commonly given as solution sets to some


systems of affine inequalities.

Supported by EPSRC grant EP/P019676/1.



Our goal is to study some analogues and generalisations of Problem (1) over the tropical (max-plus) semiring. Our main motivation is theoretical: (1) Problem (1) can be considered as one of the (non-equivalent) bilevel linear programming problem formulations whose tropical analogues are of interest, and (2) as we will show, the algorithm for solving (1) can be extended to the tropical setting. Following the ideas outlined in [7], one can think of practical applications of the tropical analogues of (1) (or other linear bilevel problems) in the static analysis of computer programs by abstract interpretation.
The tropical semiring R_max = (R ∪ {−∞}, ⊕, ⊗) is the set of real numbers R with −∞, equipped with the "tropical addition" ⊕, which is taking the maximum of two numbers, and the "tropical multiplication" ⊗, which is the ordinary addition [3]. Thus we have: a ⊕ b := max(a, b) and a ⊗ b := a + b, and the elements 0 := −∞, respectively 1 := 0, are neutral with respect to ⊕ and ⊗. These arithmetical operations are then extended to matrices and vectors in the usual way, and the ⊗ sign for multiplication will be consistently omitted. Observe that we have a ≥ 0 for all a ∈ R_max and hence, for example, if c ≤ d for some c, d ∈ R^n_max, then we have c^T x ≤ d^T x for the tropical scalar products of these vectors with any x ∈ R^n_max (note that c^T x now means max_{i=1,...,n}(c_i + x_i) and all matrix-vector products are understood tropically).
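For readers less familiar with this notation, the following minimal Python sketch (our illustration, not part of the paper) implements the tropical operations and the tropical scalar product c^T x = max_i (c_i + x_i) used throughout.

```python
import numpy as np

NEG_INF = -np.inf          # tropical zero  (neutral for ⊕)
TROP_ONE = 0.0             # tropical one   (neutral for ⊗)

def t_add(a, b):
    """Tropical addition a ⊕ b = max(a, b)."""
    return np.maximum(a, b)

def t_mul(a, b):
    """Tropical multiplication a ⊗ b = a + b."""
    return a + b

def t_dot(c, x):
    """Tropical scalar product c^T x = max_i (c_i + x_i)."""
    return np.max(np.asarray(c, float) + np.asarray(x, float))

def t_matvec(A, x):
    """Tropical matrix-vector product (Ax)_i = max_j (A_ij + x_j)."""
    A, x = np.asarray(A, float), np.asarray(x, float)
    return np.max(A + x[None, :], axis=1)

# example: (2 ⊕ 5) ⊗ 1 = max(2, 5) + 1 = 6
assert t_mul(t_add(2.0, 5.0), 1.0) == 6.0
```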
The maximization and minimization problems are not equivalent in tropical
mathematics. This is intuitively clear since only one of these operations plays
the role of addition and the other is “dual” to it. Namely, the maximization
problems are usually easier. Therefore, the following four problems can be all
considered as tropical analogues of (1).
Min-min problem (or) Max-min problem:

    min_{x,y}  a^T x ⊕ b^T y     (or)     max_{x,y}  a^T x ⊕ b^T y
    s.t.  x ∈ TP_1,   y ∈ arg min_{y′} { x^T y′ : y′ ∈ TP_2 },

Min-max problem (or) Max-max problem:

    min_{x,y}  a^T x ⊕ b^T y     (or)     max_{x,y}  a^T x ⊕ b^T y
    s.t.  x ∈ TP_1,   y ∈ arg max_{y′} { x^T y′ : y′ ∈ TP_2 },

where a and b are vectors with entries in R_max and TP_1 and TP_2 are tropical polyhedra of R^n_max, in the sense of the following definition.

Definition 1 (Tropical Polyhedra and Tropical Halfspaces). A tropical polyhedron is defined as an intersection of finitely many tropical affine halfspaces, each defined as

    {x ∈ R^n_max | a^T x ⊕ α ≤ b^T x ⊕ β},

for some α, β ∈ R_max and a, b ∈ R^n_max.

Note that unlike the classical halfspace, the tropical halfspace is defined as a solution set of a two-sided inequality, since we cannot move terms in the absence of (immediately defined) tropical subtraction. Also note that any tropical polyhedron can be defined as a set of the form

    {x ∈ R^n_max | Ax ⊕ c ≤ Bx ⊕ d},

where A, B are matrices and c, d are vectors with entries in R_max of appropriate dimensions. Furthermore, any tropical polyhedron is a tropically convex set in the sense of the following definition:
the sense of the following definition:

Definition 2 (Tropical Convex Set and Tropical Convex Hull). A set C ⊆ R^n_max is called tropically convex if for any two points x, y ∈ C and any λ, μ ∈ R_max with λ ⊕ μ = 1, we have λx ⊕ μy ∈ C.
C is called the tropical convex hull of X if any point of C is a tropical convex combination of the points of X.

Furthermore, it is well-known that any compact tropical polyhedron C ⊆ R^n_max is the tropical convex hull of a finite number of points (e.g., [2]).

2 The Min-Min and Max-Min Problem

The direct analogue of Problem (1) is the min-min problem, which we consider together with the max-min problem. Here and below, the notation "opt" will stand for maximization or minimization. Instead of the performance measure a^T x ⊕ b^T y we will consider a more general function f(·,·) : R^n_max × R^n_max → R_max, for which certain properties will be assumed, depending on the situation.
Thus we consider the following problem:

    opt_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ arg min_{y′} { x^T y′ : y′ ∈ TP_2 }.                (2)

Using φ(x) := min_{y′} { x^T y′ : y′ ∈ TP_2 } we can write the lower level value function (LLVF) reformulation of (2):

    opt_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ TP_2,   x^T y ≤ φ(x).                               (3)

Further we will assume that f(x, y) is continuous and TP_1 and TP_2 are compact in the topology¹ induced by the metric ρ(x, y) = max_i |e^{x_i} − e^{y_i}|.
Let us now introduce the following notion.

¹ In other words, e^{f(x,y)} is continuous and the sets {y ∈ R^n_+ : log(y) ∈ TP_1} and {z ∈ R^n_+ : log(z) ∈ TP_2} are compact in the usual Euclidean topology.

Definition 3 (Min-Essential Sets). Let TP be a tropical polyhedron. A set S is called a min-essential subset of TP if for any x ∈ R^n_max the minimum min_z { x^T z : z ∈ TP } is attained at a point of S.

Lemma 1. If S_1 ⊆ TP is a min-essential set of TP and S_1 ⊆ S_2 ⊆ TP, then S_2 is also min-essential.

Inspired by Dempe and Franke [4], we suggest generalizing their algorithm in order to solve (2) in the form of (3). Here S_min(TP_2) denotes a min-essential subset of TP_2.

Algorithm 1 (Solving Min-min Problem and Max-min Problem)

1. Initial step. Find a pair (x^0, y^0) solving the relaxed problem

       opt_{x,y}  f(x, y)
       s.t.  x ∈ TP_1,   y ∈ TP_2.                                            (4)

   We verify whether y^0 ∈ arg min_{y′} { (x^0)^T y′ : y′ ∈ TP_2 }. If "yes" then stop, (x^0, y^0) is a solution.
   If not, then find a point z^0 of S_min(TP_2) that attains min_{y′} { (x^0)^T y′ : y′ ∈ TP_2 }. Let Z^(0) = {z^0}.

2. General step. Find a pair (x^k, y^k) solving the problem

       opt_{x,y}  f(x, y)
       s.t.  x ∈ TP_1,   y ∈ TP_2,   x^T y ≤ min_{z ∈ Z^(k−1)} x^T z.          (5)

   We verify whether y^k ∈ arg min_{y′} { (x^k)^T y′ : y′ ∈ TP_2 }. If "yes" then stop, (x^k, y^k) is a solution.
   If not, then find a point z^k ∈ S_min(TP_2) that attains min_{y′} { (x^k)^T y′ : y′ ∈ TP_2 }. Let Z^(k) = Z^(k−1) ∪ {z^k} and repeat 2 with k := k + 1.
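A minimal Python skeleton of Algorithm 1 is sketched below; this is our illustration only. The two oracles it calls are assumptions: solve_relaxation would solve (4)/(5) (e.g., via the MILP reduction discussed later), and S_min is a given finite min-essential subset of TP_2.

```python
def t_dot(c, x):
    """Tropical scalar product c^T x = max_i (c_i + x_i)."""
    return max(ci + xi for ci, xi in zip(c, x))

def generalized_dempe_franke(solve_relaxation, S_min, tol=1e-9):
    """Skeleton of Algorithm 1.
    solve_relaxation(Z) -> (x, y): an optimal pair of (4) if Z is empty,
                           and of (5) with cut set Z otherwise (assumed oracle).
    S_min: finite min-essential subset of TP_2, given as a list of points."""
    Z = []
    while True:
        x, y = solve_relaxation(Z)
        lower_val = min(t_dot(x, z) for z in S_min)     # phi(x), evaluated over S_min
        if t_dot(x, y) <= lower_val + tol:
            return x, y                                  # y attains the lower-level minimum
        # otherwise add a lower-level minimizer from S_min as a new cut and repeat
        z = min(S_min, key=lambda z: t_dot(x, z))
        Z.append(z)
```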

We now include the proof of convergence and validity of this algorithm,


although it just generalizes the one given by Dempe and Franke [4].

Theorem 1. Let S_min(TP_2) be finite. Then Algorithm 1 terminates in a finite number of steps and results in a globally optimal solution of (2).

Proof. First observe that as TP_1 and TP_2 are compact, the feasible set of (4) is also compact. The feasible set of (5) is also compact as the intersection of the compact set TP_1 × TP_2 with the closed set

    {(x, y) : x^T y ≤ x^T z  ∀z ∈ Z^(k−1)}.                                    (6)

As f(x, y) is continuous as a function of (x, y), the optima in (4) and (5) always exist.
Now consider the sequence {z^k}_{k=0}^∞ generated by the algorithm. The points z^k belong to a finite (min-essential) subset of TP_2 and hence there exist k_1 and k_2 such that k_1 < k_2 and z^{k_1} = z^{k_2}. However, z^{k_1} ∈ Z^(k_2−1) and hence

    min_{z ∈ Z^(k_2−1)} (x^{k_2})^T z  ≤  (x^{k_2})^T z^{k_1}  =  min_{z ∈ TP_2} (x^{k_2})^T z  ≤  min_{z ∈ Z^(k_2−1)} (x^{k_2})^T z.

The inequalities turn into equalities, and (x^{k_2}, z^{k_2}) is a globally optimal solution since it is feasible for (2) and globally optimal for its relaxation (5). □

Let us now argue that a finite min-essential set exists for each tropical poly-
hedron TP.

Definition 4 (Minimal Points). Let TP be a tropical polyhedron. A point


x ∈ TP is called minimal if y ≤ x and y ∈ TP imply y = x. The set of all
minimal points of TP is denoted by M(TP).

Definition 5 (Extreme Points). Let TP be a tropical polyhedron. A point


x ∈ TP is called extreme if any equality x = λu ⊕ μv with λ ⊕ μ = 1 and
u, v ∈ TP implies x = u or x = v.

We have the following known observation. Note, however, that this observa-
tion does not hold in the usual convexity, as counterexamples on the plane can
be easily constructed.

Lemma 2 (Helbig [8]). Any minimal point of a tropical polyhedron is extreme.

The set of extreme points of a tropical polyhedron is finite, see for example
Allamigeon, Gaubert and Goubault [2]. Combining this with an observation that
the set {z ∈ TP : z ≤ y} is compact and hence contains a minimal point, we
obtain the following claims.

Proposition 1. M(TP) is a finite (and non-empty) min-essential subset for


any tropical polyhedron TP.

Corollary 1. Any tropical polyhedron has a finite min-essential subset.

Several problems arise when trying to implement the general Dempe-Franke algorithm in the tropical setting. One of them is how to find a point of a finite min-essential set S_min(TP_2) that attains min_{y′} { (x^k)^T y′ : y′ ∈ TP_2 }, and which min-essential set to choose. An option here is to exploit the tropical simplex method of Allamigeon, Benchimol, Gaubert and Joswig [1], which (under some generically true conditions imposed on TP_2) can find a point that attains min_{y′} { (x^k)^T y′ : y′ ∈ TP_2 } and belongs to the set of tropical basic points of TP_2. The set of tropical basic points is finite and includes all extreme points [1] and hence all the minimal points of TP_2; thus it is also a finite min-essential subset of TP_2 by Lemma 1.
An even more imminent problem is how to solve (5), as the techniques referred to in Dempe and Franke [4] are not immediately "tropicalized". An option here

is to use a reduction of the constraints defining a tropical polyhedron to MILP constraints. Such a reduction was suggested, e.g., in De Schutter, Heemels and Bemporad [6] based on [5]. More precisely, we need to consider constraints of the following two kinds: (1) a^T x ≤ α and (2) a^T x ≥ α. Constraints of the first type are easy to deal with, since this is the same as to write a_i + x_i ≤ α for all i, in terms of the usual arithmetic. Constraints of the second type mean that a_i + x_i ≥ α for at least one i, and this can be written as a_i + x_i + (1 − w_i)M ≥ α, where w_i ∈ {0, 1} and Σ_i w_i = 1, with M a sufficiently large number. One can see that this reduction to MILP also applies to the constraints in (6). Combining these techniques with the general Dempe-Franke algorithm is a matter of ongoing research.
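As an illustration of this reduction (our sketch, using the PuLP modelling library; the big-M value and the toy data are assumptions), the constraint a^T x ≥ α, i.e. max_i (a_i + x_i) ≥ α, can be encoded with one binary variable per coordinate as follows.

```python
import pulp

def add_trop_geq(prob, x, a, alpha, M=1e4, tag=""):
    """Encode max_i (a_i + x_i) >= alpha via binaries and big-M."""
    n = len(x)
    w = [pulp.LpVariable(f"w{tag}_{i}", cat="Binary") for i in range(n)]
    for i in range(n):
        # the i-th inequality is active only when w[i] = 1; otherwise relaxed by M
        prob += a[i] + x[i] + (1 - w[i]) * M >= alpha
    prob += pulp.lpSum(w) == 1      # one term must reach alpha
    return w

# toy usage: feasibility of  max(1 + x1, 0 + x2) >= 3  with  x1 <= 1 and x2 <= 5
prob = pulp.LpProblem("tropical_feasibility", pulp.LpMinimize)
x = [pulp.LpVariable(f"x_{i}", lowBound=-100, upBound=100) for i in range(2)]
prob += pulp.lpSum(x)               # any objective; we only check feasibility here
add_trop_geq(prob, x, a=[1.0, 0.0], alpha=3.0)
prob += x[0] <= 1
prob += x[1] <= 5
prob.solve(pulp.PULP_CBC_CMD(msg=False))
```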
Let us now discuss another approach to solving the problem

    min_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ TP_2,   x^T y ≤ min_{y′ ∈ TP_2} x^T y′,              (7)

where f(x, y) is isotone with respect to the second argument: f(x, y^1) ≤ f(x, y^2) whenever y^1 ≤ y^2. We can observe the following.

Proposition 2. If f(x, y) is isotone with respect to the second argument then the minimum in (7) is equal to the minimum in the following problem:

    min_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ M(TP_2),   x^T y ≤ min_{z ∈ M(TP_2)} x^T z.

This proposition provides for the following straightforward procedure solving (7) (and, in particular, the Min-min Problem):

Algorithm 2 (Solving (7) and Min-min Problem)

Step 1. Identify the set of minimal points M(TP_2).

Step 2. For each point y′ ∈ M(TP_2) we solve the following optimization problem:

       min_x  f(x, y′)
       s.t.  x ∈ TP_1,   x^T y′ ≤ x^T z  ∀z ∈ M(TP_2).                         (8)

Step 3. Find the minimum among all Problems (8) for all y′ ∈ M(TP_2).

Note that when f (x, y) = aT x ⊕ bT y for some vectors a, b over Rmax , Prob-
lem (8) can be solved by any algorithm of tropical linear programming [1,3,7].
The set of all minimal points can be found by a combination of the tropical
double description method of [2] that finds the set of all extreme points and the
techniques of Preparata et al. for finding all minimal points of a finite set [9],
although clearly a more efficient procedure should be sought for this purpose.

2.1 The Max-Max and Min-Max Problems


Let us now consider the problems where the lower-level objective is to maximize rather than to minimize:

    opt_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ arg max_{y′} { x^T y′ : y′ ∈ TP_2 }.                 (9)

Following the LLVF approach, (9) is equivalent to

    opt_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ TP_2,   x^T y = φ(x),                               (10)

where φ(x) = max_z { x^T z : z ∈ TP_2 }. The following are similar to Definitions 4 and 3.

Definition 6 (Maximal Points). Let TP be a tropical polyhedron. A point x ∈ TP is called maximal if y ≥ x and y ∈ TP imply y = x.

Definition 7 (Max-Essential Subset). Let TP be a tropical polyhedron. A set S_max is called a max-essential subset of TP if for any x ∈ R^n_max the maximum max_z { x^T z : z ∈ TP } is attained at a point of S_max.

However, it is immediate that each compact tropical polyhedron contains its greatest point, and the above notions trivialize.

Proposition 3. Let TP be a compact tropical polyhedron. Then TP contains its greatest point y^max. Furthermore, the singleton {y^max} is a max-essential subset of TP.
Proposition 3 implies that (9) and (10) are equivalent to

    opt_{x,y}  f(x, y)
    s.t.  x ∈ TP_1,   y ∈ TP_2,   x^T y = x^T y^max,                          (11)

where y^max is the greatest point of TP_2. The following result yields an immediate solution of the max-max problem.

Corollary 2 (Solving Max-max Problem). If f(x, y) is isotone with respect to both arguments and opt = max, then (x^max, y^max) is a globally optimal solution of (9), where x^max and y^max are the greatest points of TP_1 and TP_2.

Let us now consider (11) where f is not necessarily isotone, or where opt = min as in the case of the Min-max problem. Suppose that y^max has all components in R and define the point x^* with coordinates

    x^*_i = ⊗_{k ≠ i} y^max_k.

We first prove the following claim.



Lemma 3. Let y^max ∈ R^n. Consider sets I and J such that I ∪ J = [n] and I ∩ J = ∅. Let x be such that

    x_i = x^*_i   ∀i ∈ I,
    x_i < x^*_i   ∀i ∈ J.                                                      (12)

Then, if y ∈ TP_2, the equation x^T y = x^T y^max is equivalent to

    ⊕_{i ∈ I} ( ⊗_{k ≠ i} y^max_k ) ⊗ y_i = ⊗_{k ∈ [n]} y^max_k.              (13)

Proof. Observe that y ∈ TP_2 implies y ≤ y^max. With x as in (12) and y such that y ≤ y^max, we have

    x^T y^max = ⊕_{i ∈ I} x^*_i ⊗ y^max_i ⊕ ⊕_{j ∈ J} x_j ⊗ y^max_j
              = ⊕_{i ∈ I} ( ⊗_{k ≠ i} y^max_k ) ⊗ y^max_i = ⊗_{k ∈ [n]} y^max_k,

    x^T y = ⊕_{i ∈ I} x^*_i ⊗ y_i ⊕ ⊕_{j ∈ J} x_j ⊗ y_j
          = ⊕_{i ∈ I} ( ⊗_{k ≠ i} y^max_k ) ⊗ y_i ⊕ ⊕_{j ∈ J} x_j ⊗ y_j.

Therefore, x^T y = x^T y^max becomes

    ⊕_{i ∈ I} ( ⊗_{k ≠ i} y^max_k ) ⊗ y_i ⊕ ⊕_{j ∈ J} x_j ⊗ y_j = ⊗_{k ∈ [n]} y^max_k.   (14)

Moreover, since x_j < x^*_j we obtain that x_j ⊗ y_j < ⊗_{k ∈ [n]} y^max_k (= x^*_j ⊗ y^max_j) for each j ∈ J. Hence we can further simplify (14) to (13). □

Let us also introduce the following notation:

    TP_1^{IJ} = {x ∈ TP_1 : x_j (x^*_j)^{−1} < x_i (x^*_i)^{−1}  ∀i ∈ I, j ∈ J,
                             x_k (x^*_k)^{−1} = x_l (x^*_l)^{−1}  ∀k, l ∈ I},
    TP_2^{IJ} = {y ∈ TP_2 : ⊕_{i ∈ I} ( ⊗_{k ≠ i} y^max_k ) y_i = ⊗_{k ∈ [n]} y^max_k}.    (15)

Note that "x_j (x^*_j)^{−1}" means x_j − x^*_j in the usual arithmetic. Now, using Lemma 3 we can prove the following.

Theorem 2. We have the following decomposition:

    {(x, y) ∈ TP_1 × TP_2 : x^T y = x^T y^max} = ⋃_{I,J} ( TP_1^{IJ} × TP_2^{IJ} ),

where the union is taken over all I and J such that I ∩ J = ∅ and I ∪ J = [n].

Theorem 2 suggests that Problem (11) (and, equivalently, (9)) can be solved by the following straightforward procedure.

Algorithm 3 (Solving (9) and Min-max Problem)

Step 1. For each partition I, J of [n], identify the system of inequalities (15) defining TP_1^{IJ} and TP_2^{IJ} and find a solution of the problem opt_{x,y} f(x, y) over (x, y) ∈ TP_1^{IJ} × TP_2^{IJ}, if such a solution exists.

Step 2. Compute opt over all solutions found at Step 1.

When f (x, y) = aT x ⊕ bT y, this procedure reduces the problem to a finite


number of tropical linear programming problems solved, e.g., by the algorithms
of [1,3,7].

Example 1. Consider the following numerical example in the two-dimensional case. Let TP_1 be the tropical (max-plus) convex hull of the points (−3, −1), (−1, 0) and (−2, −3); see Fig. 1a. TP_2 is defined by (1, 1), (0, 0) and (2, −1); see Fig. 1b.

Fig. 1. TP_1 and TP_2 of Example 1.

In this example, y^max = (2, 1) (the greatest point of TP_2 in Fig. 1b). Therefore, x^* = (1, 2). Table 1 shows the three possible partitions of TP_1 and TP_2. Partition 1 corresponds to the line segment between (−2, −1) and (−2, −3) in TP_1 and the line segment connecting y^max and (2, −1) in TP_2 (red). Partition 2 corresponds to the line segment between (−2, −1) and (−3, −1) in TP_1 and the line segment connecting y^max and (1, 1) in TP_2 (blue). Partition 3 corresponds to the line segment between (−2, −1) and (−1, 0) in TP_1 (green) and, in TP_2, the union of the line segment connecting y^max and (1, 1) and the line segment between y^max and (2, −1) (green).
Assume the upper level objective is of the form min a^T x ⊕ b^T y, where a, b ∈ R^2. In ordinary algebra it can be written as min {max{a_1 + x_1, a_2 + x_2, b_1 + y_1, b_2 + y_2}}. It is obvious that the objective function is isotone with respect to x

Table 1. Partitions of Example 1

      I      J    TP_1^{IJ}                        TP_2^{IJ}
  1   {1}    {2}  {x ∈ TP_1 : x_2 − 2 < x_1 − 1}   {y ∈ TP_2 : y_1 = 2}
  2   {2}    {1}  {x ∈ TP_1 : x_1 − 1 < x_2 − 2}   {y ∈ TP_2 : y_2 = 1}
  3   {1,2}  ∅    {x ∈ TP_1 : x_1 − 1 = x_2 − 2}   {y ∈ TP_2 : max(1 + y_1, 2 + y_2) = 3}

and y. In partition 1, x = (−2, −3) and y = (2, −1) is always a solution regardless of a and b. In partition 2, x = (−3, −1) and y = (1, 1) is a solution. In partition 3, either x = (−2, −1) and y = (1, 1) or x = (−2, −1) and y = (2, −1) solve the problem. However, these solutions are always dominated by the optimal points of partition 1 and partition 2. Therefore, in this example, it is sufficient to consider only partition 1 and partition 2, and to decide between (x, y)^1 = ((−2, −3), (2, −1)) and (x, y)^2 = ((−3, −1), (1, 1)). Taking a_1 = a_2 = b_1 = b_2 makes (x, y)^2 an optimal solution of the problem, but taking a_2 = 10 and a_1 = b_1 = b_2 results in (x, y)^1.
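The comparison between the two candidate solutions can be checked numerically; the small script below is our illustration and evaluates the upper-level objective max{a_1 + x_1, a_2 + x_2, b_1 + y_1, b_2 + y_2} for the two candidate pairs (taking 0 as the common value in the first choice of a and b).

```python
def upper_obj(a, b, x, y):
    # a^T x ⊕ b^T y written in ordinary arithmetic
    return max(a[0] + x[0], a[1] + x[1], b[0] + y[0], b[1] + y[1])

cand1 = ((-2, -3), (2, -1))   # optimal pair of partition 1
cand2 = ((-3, -1), (1, 1))    # optimal pair of partition 2

# a1 = a2 = b1 = b2 (= 0): candidate 2 wins (value 1 vs 2)
a = b = (0, 0)
print(upper_obj(a, b, *cand1), upper_obj(a, b, *cand2))   # 2 1

# a2 = 10, a1 = b1 = b2 = 0: candidate 1 wins (value 7 vs 9)
a, b = (0, 10), (0, 0)
print(upper_obj(a, b, *cand1), upper_obj(a, b, *cand2))   # 7 9
```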

3 Conclusions and Acknowledgement


We have studied the four different tropical analogues of a problem considered by
Dempe and Franke [4]. We showed that we can solve the problems by generalizing
the Dempe-Franke algorithm and using reduction to MILP, or by decomposing
the feasible set of a problem into a number of tropical polyhedra and performing
tropical linear programming over these subdomains. The resulting methods need
further practical study and theoretical improvement.
We gratefully acknowledge fruitful communication with Bart De Schutter
and Ton van den Boom (TU Delft), who informed us about the reduction of
tropical optimization problems to MILP.

References
1. Allamigeon, X., Benchimol, P., Gaubert, S., Joswig, M.: Tropicalizing the simplex
algorithm. SIAM J. Discrete Math. 29(2), 751–795 (2015)
2. Allamigeon, X., Gaubert, S., Goubault, É.: Computing the vertices of tropical poly-
hedra using directed hypergraphs. Discrete Comput. Geom. 49, 247–279 (2013)
3. Butkovič, P.: Max-linear Systems: Theory and Algorithms. Springer, London (2010)
4. Dempe, S., Franke, S.: Solution algorithm for an optimistic linear Stackelberg prob-
lem. Comput. Oper. Res. 41, 277–281 (2014)
5. De Schutter, B., Heemels, W.P.M.H., Bemporad, A.: On the equivalence of linear
complementarity problems. Oper. Res. Lett. 30(4), 211–222 (2002)
6. De Schutter, B., Heemels, W.P.M.H., Bemporad, A.: Max-plus-algebraic problems
and the extended linear complementarity problem–algorithmic aspects. In: Proceed-
ings of the 15th IFAC World Congress. Barcelona, Spain (2002)
7. Gaubert, S., Katz, R.D., Sergeev, S.: Tropical linear-fractional programming and
parametric mean-payoff games. J. Symb. Comput. 47(12), 1447–1478 (2012)

8. Helbig, S.: On Carathéodory’s and Krein-Milman’s theorems in fully ordered groups.


Comment. Math. Univ. Carolin. 29(1), 157–167 (1988)
9. Preparata, F.P., Shamos, M.I.: Computational Geometry: An Introduction.
Springer, New York (1985)
Φ−Weak Slater Constraint Qualification
in Nonsmooth Multiobjective
Semi-infinite Programming

Ali Sadeghieh1(B) , David Barilla2 , Giuseppe Caristi2 ,


and Nader Kanzi3
1 Department of Mathematics, Islamic Azad University, Yazd, Iran
alijon.sadeghieh@gmail.com
2 Department of Economics, University of Messina, Via dei Verdi, 75, Messina, Italy
{dbarilla,gcaristi}@unime.it
3 Department of Mathematics, Payam Noor University, 19395-3697, Tehran, Iran
nad.kanzi@gmail.com

Abstract. In this paper, we consider a nonsmooth multiobjective semi-


infinite programming problem with a feasible set defined by inequal-
ity constraints. First we introduce the weak Slater constraint qualifica-
tion, and derive the Karush-Kuhn-Tucker types necessary conditions for
(weakly, properly) efficient solution of the considered problem. Then, we
introduce a new gap function for the problem and state sufficient condi-
tions for (weakly, properly) efficient solution of the problem via this gap
function.

Keywords: Semi-infinite programming · Multiobjective optimization ·


Constraint qualification · Optimality conditions · Gap function

1 Introduction

A multiobjective semi-infinite programming (MOSIP in brief) is an optimization


problem where two or more objectives are to be minimized on a set of feasible
solutions described by infinitely many inequality constraint functions. Optimal-
ity and duality conditions of MOSIP have been studied by many authors; see
for instance [9,16] in linear case, [8,10] in convex case, [4] in smooth case, and
[20,21] in locally Lipschitz case. In almost all of the articles in MOSIP theory, the
Fritz-John type (Karush-Kuhn-Tucker type) necessary optimality conditions are
justified for continuous problems (under Slater constraint qualification); contin-
uous MOSIPs and Slater constraint qualification will be defined in Sect. 3. The
first aim of this paper is to replace these conditions by two weaker conditions, named the PLV property and the weak Slater constraint qualification. Another aim of this paper is to present a sufficient condition involving a new gap function which is defined via a nonsmooth version of (Φ, ρ)−invexity [4,5]. Of course, it should be mentioned that, in this study, if we replace "(Φ, ρ)−invex" by "invex", the results

will still be original, being extensions of the existing theorems in other articles; we have added the concept of (Φ, ρ)−invexity to these extensions so that our results could be more general. We organize the paper as follows. In the next section, we provide the notation to be used in the rest of the paper, and in Sect. 3 we present our main results.

2 Notations
In this section, we briefly overview some notions of nonsmooth analysis widely
used in formulations and proofs of main results of the papers [6,15]. As usual,
||x|| stands for the Euclidean norm of x ∈ Rn , and Bn denotes the closed unit ball
in Rn . Given x, y ∈ Rn , we write x  y (resp. x < y) when xi ≤ yi (resp. xi < yi )
for all i ∈ {1, . . . , n}. Moreover, we write x ≤ y when x  y and x = y. The
zero vector of Rn is denoted by 0n . Given a nonempty set A ⊆ Rn , we denote
by A, conv(A), and cone(A), the closure of A, the convex hull and convex cone
(containing the origin) generated by A, respectively. Also, we denote the Clarke
tangent cone of A at x̂ ∈ A by Γ (A, x̂), i.e.,

    Γ(A, x̂) := { v ∈ R^n | ∀{x_r} ⊆ A with x_r → x̂ and ∀t_r ↓ 0,
                  ∃v_r → v such that x_r + t_r v_r ∈ A  ∀r ∈ N }.

Let x̂ ∈ R^n and let ϕ : R^n → R be a locally Lipschitz function. The Clarke directional derivative of ϕ at x̂ in the direction v ∈ R^n, and the Clarke subdifferential of ϕ at x̂ are respectively given by

    ϕ°(x̂; v) := limsup_{y → x̂, t ↓ 0}  [ϕ(y + tv) − ϕ(y)] / t

and

    ∂_c ϕ(x̂) := { ξ ∈ R^n | ⟨ξ, v⟩ ≤ ϕ°(x̂; v) for all v ∈ R^n }.

It is worth observing that if x̂ is a minimizer of a locally Lipschitz function φ on a set C, then

    0 ∈ ∂_c φ(x̂) + N(C, x̂),

where N(C, x̂) denotes the Clarke normal cone of C at x̂, i.e.,

    N(C, x̂) := { x ∈ R^n | ⟨x, a⟩ ≤ 0, ∀a ∈ Γ(C, x̂) }.

3 Main Results
In this paper, we consider the following multiobjective semi-infinite programming
problem:
 
    (P)   inf  ( f_1(x), f_2(x), . . . , f_p(x) )
          s.t.  g_t(x) ≤ 0,  t ∈ T,
                x ∈ R^n,

where fi , i ∈ I := {1, 2, . . . , p} and gt , t ∈ T are locally Lipschitz functions from


Rn to R, and the index set T is arbitrary, not necessarily finite (but nonempty).
An important feature of problem (P ) is that the index set T is arbitrary, i.e.,
may be infinite and also noncompact. When T is finite, (P ) is a multiobjective
optimization problem and when p = 1 and T is infinite, (P ) is a semi-infinite
optimization problem. The feasible set of (P ) is denoted by M , i.e.,

M := {x ∈ Rn | gt (x) ≤ 0, ∀t ∈ T }.

For each x̂ ∈ M , set


 
Fx̂ := ∂c fi (x̂) and Gx̂ := ∂c gt (x̂),
i∈I t∈T (x̂)

where T (x̂) := {t ∈ T | gt (x̂) = 0}. A feasible point x̂ is said to be efficient


solution [resp. weakly efficient solution] for (P ) if and only if there is no x ∈ M
satisfying f (x) ≤ f (x̂) [resp. f (x) < f (x̂)]. Recall that, the problem (P ) is said to
be continuous when T is a compact metric space, gt (x) is a continuous function
of (t, x) in T × Rn , and t → ∂c gt (x) is an upper semicontinuous (set-valued)
mapping for each x ∈ R^n. In almost all articles on (multiobjective) semi-infinite programming, the continuity of the problem is assumed, even in the differentiable case
(see, e.g., [4]). The continuity of (P ) implies the compactness of Fx̂ ∪ Gx̂ by [20];
then, the strict separation theorem implies that 0 ∈ conv(Fx̂ ∪ Gx̂ ) when x̂ is
a weakly efficient solution of the considered problem; and hence, the Fritz-John
(FJ) type necessary condition is satisfied for continuous (P ) at a weakly efficient
solution. For extension of this well-known result to non-continuous (P ), we recall
the following definition from [18,19].
Definition 1. We say that (P) has the Pshenichnyi-Levin-Valadier (PLV in short) property at x̂ ∈ M if Ψ(·) is finite-valued Lipschitz around x̂, and

    ∂_c Ψ(x̂) ⊆ conv( ⋃_{t ∈ T(x̂)} ∂_c g_t(x̂) ) = conv(G_x̂),

where Ψ(·) is defined as

    Ψ(x) := sup_{t ∈ T} g_t(x),   ∀x ∈ M.

It should be observed from [8,19] that the PLV property is strictly weaker than continuity for (P). Thus, the following simple theorem is better than its continuous versions (see, e.g., [4,17]).

Theorem 1 (FJ necessary condition). Let x̂ be a weakly efficient solution of (P). If the PLV property holds at x̂, then there exist α_i ≥ 0 (for i ∈ I) and β_t ≥ 0 (for t ∈ T(x̂)), with β_t ≠ 0 for only finitely many indexes, such that

    0_n ∈ Σ_{i=1}^p α_i ∂_c f_i(x̂) + Σ_{t ∈ T(x̂)} β_t ∂_c g_t(x̂),    and    Σ_{i=1}^p α_i + Σ_{t ∈ T(x̂)} β_t = 1.

Proof. It is easy to see that x̂ is a global minimizer of the function

    ϑ(x) := max{Θ(x), Ψ(x)},

where Θ(x) := max_{i ∈ I} {f_i(x) − f_i(x̂)} and Ψ(x) is defined as in Definition 1. Thus, by the PLV property we deduce that

    0_n ∈ ∂_c ϑ(x̂) ⊆ conv( ∂_c Θ(x̂) ∪ ∂_c Ψ(x̂) ) ⊆ conv( conv(F_x̂) ∪ conv(G_x̂) ).

This means 0_n ∈ conv(F_x̂ ∪ G_x̂), as required. □

As in the classical case, optimality implies the Karush-Kuhn-Tucker (KKT) condition provided some constraint qualifications are satisfied (see, e.g., [4,8–10,20,21]). Slater's condition (SC) is said to be satisfied for problem (P) if there exists a Slater point, i.e., ∃x^* ∈ R^n such that g_t(x^*) < 0 for all t ∈ T. It is easy to see that the SC cannot play the role of such a constraint qualification without additional convexity assumptions on the restriction functions. Also, the SC works only for continuous (multiobjective) semi-infinite problems (SIP). In fact, the Slater constraint qualification (SCQ) for (multiobjective) SIP consists of the following three conditions [22]:

    (I): Continuity of (P).
    (II): SC.
    (III): Convexity of the g_t functions for t ∈ T.

In order to extend the KKT necessary condition for (P), we are going to replace: (I) the continuity of (P) by the PLV property; (II) the SC by the weak Slater's condition, introduced in Definition 2 below; (III) the convexity of the g_t's by their (Φ, ρ)−invexity, defined in Definition 3 below.
We will use the following weaker form of SC for SIP, which appeared in [14].
Definition 2. We say that the weak Slater's condition (WSC in brief) is satisfied for (P) at x_0 ∈ M if for each finite index set T^* ⊆ T(x_0), there exists a point x_{T^*} ∈ R^n such that g_t(x_{T^*}) < 0 for all t ∈ T^*. Also, (P) is said to satisfy the global WSC (GWSC, briefly) if for each finite index set T^* ⊆ T, there exists a point x_{T^*} ∈ R^n such that g_t(x_{T^*}) < 0 for all t ∈ T^*.
Clearly, WSC at a point is strictly weaker than GWSC. The following exam-
ple shows that the GWSC is strictly weaker than SC.
Example 1. Let T := N ∪ {0}, and

    g_t(x) := x − 1/t   if t ∈ N,
    g_t(x) := −x        if t = 0.

It is easy to check that M = {0}, and SC does not hold. Now, assume that T^* is a finite subset of T. Take q := max(T^*) and x_{T^*} := 1/(q+1). Then

    g_t(x_{T^*}) = x_{T^*} − 1/t = 1/(q+1) − 1/t < 0   if t ∈ N ∩ T^*,
    g_t(x_{T^*}) = −x_{T^*} = −1/(q+1) < 0             if t = 0.
Thus the GWSC holds.

Definition 3. Suppose that the functions Φ : R^n × R^n × R^n × R → R and ρ : R^n × R^n → R, and the nonempty set X ⊆ R^n are given. A locally Lipschitz function ℓ : R^n → R is said to be (Φ, ρ)−invex at x^* ∈ X with respect to X if for each x ∈ X one has:

    Φ(x, x^*, 0_n, r) ≥ 0   for all r ≥ 0,                                     (1)
    Φ(x, x^*, ·, ·) is convex on R^n × R,                                      (2)
    Φ(x, x^*, ξ, ρ(x, x^*)) ≤ ℓ(x) − ℓ(x^*),   ∀ξ ∈ ∂_c ℓ(x^*).                (3)

As mentioned in [1], the definition of (Φ, ρ)−invexity generalizes almost all concepts of invexity and convexity.
Example 2. Consider a function Φ : R × R × R × R → R defined by

    Φ(x, y, u, w) := w − (u / (3y^2)) |x^3 − y^3|   if y ≠ 0,
    Φ(x, y, u, w) := w |x^3|                        if y = 0.

Let x and x̂ be arbitrary elements of R. Since Φ(x, x̂, ·, ·) is a linear function and

    Φ(x, y, 0, r) = r        if y ≠ 0,
    Φ(x, y, 0, r) = r |x^3|  if y = 0,

the conditions (1) and (2) hold. Take ρ(x, y) := −1 for all x, y ∈ R, and ℓ(x) := x^3. It is easy to check that (3) holds too. Furthermore, as it follows by [3], ℓ(·) is not an invex function on R with respect to any η : R × R → R.
Everywhere in the following, we will assume that X equals the feasible set of (P), i.e., X = M, but for the sake of simplicity we will omit mentioning X. The following definition is motivated by the above comments.
Definition 4. Let Φ : Rn × Rn × Rn × R → R be a given function, and x̂ ∈ M .
We say that (P ) satisfies the Φ-weak SCQ (Φ-WSCQ, briefly) at x̂, if

– the PLV property holds at x̂,


– WSC is satisfied at x̂,

– for each t ∈ T (x̂), the gt function is (Φ, ρt )−invex at x̂ for some given function
ρt : Rn × Rn → R.
Naturally, we are interested in showing a Karush-Kuhn-Tucker necessary condition for (P) under the Φ-WSCQ assumption. In fact, the following theorem guarantees that Φ-WSCQ is a constraint qualification.
Theorem 2 (KKT necessary condition for weakly efficient solutions). Let x̂ be a weakly efficient solution of (P). Suppose that the Φ-WSCQ is satisfied at x̂ with ρ_t(x, x̂) ≥ 0 for every (x, t) ∈ R^n × T. Then, there exist α_i ≥ 0 (for i ∈ I) with Σ_{i=1}^p α_i = 1, and β_t ≥ 0 (for t ∈ T(x̂)), with β_t ≠ 0 for only finitely many indexes, such that

    0_n ∈ Σ_{i=1}^p α_i ∂_c f_i(x̂) + Σ_{t ∈ T(x̂)} β_t ∂_c g_t(x̂).              (4)

Proof. Applying Theorem 1, we find some α_i ≥ 0 and ξ_i ∈ ∂_c f_i(x̂) for i ∈ I, and β_t ≥ 0 and ζ_t ∈ ∂_c g_t(x̂) for t ∈ T^* ⊆ T(x̂) with |T^*| < ∞, such that

    Σ_{i ∈ I} α_i ξ_i + Σ_{t ∈ T^*} β_t ζ_t = 0_n,     Σ_{i ∈ I} α_i + Σ_{t ∈ T^*} β_t = 1.

All we need to prove is that at least one α_i is positive. If this is not the case, then

    Σ_{t ∈ T^*} β_t ζ_t = 0_n,     Σ_{t ∈ T^*} β_t = 1.

Thus, owing to Σ_{t ∈ T^*} β_t ρ_t(x_{T^*}, x̂) ≥ 0 and Definition 2, we get

    0 ≤ Φ( x_{T^*}, x̂, Σ_{t ∈ T^*} β_t ζ_t, Σ_{t ∈ T^*} β_t ρ_t(x_{T^*}, x̂) )
      ≤ Σ_{t ∈ T^*} β_t Φ( x_{T^*}, x̂, ζ_t, ρ_t(x_{T^*}, x̂) )
      ≤ Σ_{t ∈ T^*} β_t ( g_t(x_{T^*}) − g_t(x̂) ) = Σ_{t ∈ T^*} β_t g_t(x_{T^*}) < 0.

This contradiction justifies the result. □


Proper efficiency is a very important notion used in studying multiobjective
optimization problems. There are many definitions of proper efficiency in litera-
ture, as those introduced by Geoffrion, Benson, Borwein, and Henig; see [12] for
a comparison among the main definitions of this notion. We recall the following
definition from [11].
Definition 5. A point x̂ ∈ M is called a properly efficient solution of (P ) when
there exists a λ > 0p such that


λ, f (x̂) ≤
λ, f (x) , ∀x ∈ M.

As proved in [7], the above definition of proper efficiency is weaker than its other definitions (under some assumed conditions). Thus, the following theorem can be extended to other senses of proper efficiency under further assumptions.

Theorem 3 (KKT strong necessary condition for properly efficient solutions). Let x̂ be a properly efficient solution of (P). Suppose that the Φ-WSCQ is satisfied at x̂ with ρ_t(x, x̂) ≥ 0 for every x ∈ R^n. Then, there exist α_i > 0 (for i ∈ I) with Σ_{i=1}^p α_i = 1, and β_t ≥ 0 (for t ∈ T(x̂)), with β_t ≠ 0 for only finitely many indexes, such that (4) is fulfilled.
Proof. By the definition of proper efficiency, there exist some scalars λ_i > 0 (for i ∈ I) such that x̂ is a minimizer of the following scalar semi-infinite problem:

    min_{x ∈ M}  Σ_{i=1}^p λ_i f_i(x).

Applying Theorem 2, we get

    0_n ∈ ∂_c( Σ_{i=1}^p λ_i f_i(·) )(x̂) + Σ_{t ∈ T(x̂)} μ_t ∂_c g_t(x̂)
        ⊆ Σ_{i=1}^p λ_i ∂_c f_i(x̂) + Σ_{t ∈ T(x̂)} μ_t ∂_c g_t(x̂),

for some μ_t ≥ 0 (t ∈ T(x̂)), with μ_t ≠ 0 for only finitely many indexes. For each i ∈ I take α_i := λ_i / Σ_{i=1}^p λ_i, and for each t ∈ T(x̂) put β_t := μ_t / Σ_{i=1}^p λ_i. □

In 1982, an important function with respect to convex optimization problems was defined by Hearn [13]. Gap functions have received attention owing to their applications in the analysis of mathematical programming and variational inequalities. For the history and applications of gap functions, the reader is referred to the recent paper [3] and its references. It is worth mentioning that in all the existing literature the gap function was defined for optimization problems with convex or quasiconvex or invex objective functions. Now, we define the gap function for nonsmooth MOSIPs with (Φ, ρ)-invex objective functions.
Definition 6. Suppose that the f_i functions are (Φ, ρ_i)-invex at x ∈ M. For each

    ξ := (ξ_1, . . . , ξ_p) ∈ ∏_{i=1}^p ∂_c f_i(x)    and    λ := (λ_1, . . . , λ_p) ≥ 0_p with Σ_{i=1}^p λ_i = 1,

the gap function of problem (P) is defined as

    Υ(x, ξ, λ) := inf_{y ∈ M} { Σ_{i=1}^p λ_i Φ(y, x, ξ_i, ρ_i(y, x)) }.

Theorem 4. Let the f_i functions be (Φ, ρ_i)-invex at x̂ ∈ M for each i ∈ I.

(a) If Υ(x̂, ξ̂, λ̂) = 0 for some ξ̂ := (ξ̂_1, . . . , ξ̂_p) ∈ ∏_{i=1}^p ∂_c f_i(x̂) and λ̂ := (λ̂_1, . . . , λ̂_p) ≥ 0_p with Σ_{i=1}^p λ̂_i = 1, then x̂ is a weakly efficient solution for (P).

(b) If Υ(x̂, ξ̂, λ̂) = 0 for some ξ̂ := (ξ̂_1, . . . , ξ̂_p) ∈ ∏_{i=1}^p ∂_c f_i(x̂) and λ̂ := (λ̂_1, . . . , λ̂_p) > 0_p with Σ_{i=1}^p λ̂_i = 1, then x̂ is an efficient solution for (P).

(c) If there exists a ξ̂ := (ξ̂_1, . . . , ξ̂_p) ∈ ∏_{i=1}^p ∂_c f_i(x̂) such that Υ(x̂, ξ̂, λ) = 0 for all λ := (λ_1, . . . , λ_p) > 0_p with Σ_{i=1}^p λ_i = 1, and if ρ_1 = . . . = ρ_p =: ρ, then x̂ is a properly efficient solution for (P).

Proof. (a) By contradiction, assume that Υ(x̂, ξ̂, λ̂) = 0 while x̂ is not a weakly efficient solution for (P). Then we can find an x_0 ∈ M such that f_i(x_0) < f_i(x̂) for all i ∈ I. Thus, the (Φ, ρ_i)-invexity of the f_i functions implies that

    Φ(x_0, x̂, ξ̂_i, ρ_i(x_0, x̂)) ≤ f_i(x_0) − f_i(x̂) < 0,   ∀i ∈ I.

From the latter inequality and λ̂ ≥ 0_p, we deduce that

    Σ_{i=1}^p λ̂_i Φ(x_0, x̂, ξ̂_i, ρ_i(x_0, x̂)) < 0,

which implies that Υ(x̂, ξ̂, λ̂) < 0. This contradiction completes the proof.

(b) If Υ(x̂, ξ̂, λ̂) = 0 while x̂ is not an efficient solution for (P), there exist some x_0 ∈ M and some index k ∈ I such that

    f_i(x_0) ≤ f_i(x̂),   ∀i ∈ I,   and   f_k(x_0) < f_k(x̂).

According to the above inequalities, the (Φ, ρ_i)-invexity of the f_i functions, and the assumption λ̂ > 0_p, we get

    Σ_{i=1}^p λ̂_i Φ(x_0, x̂, ξ̂_i, ρ_i(x_0, x̂)) < 0   =⇒   Υ(x̂, ξ̂, λ̂) < 0,

which contradicts the assumption.

(c) If x̂ is not a properly efficient solution for (P), we can find some x_0 ∈ M and λ^* := (λ^*_1, . . . , λ^*_p) > 0_p such that

    Σ_{i=1}^p λ^*_i f_i(x_0) < Σ_{i=1}^p λ^*_i f_i(x̂).

Taking λ_i := λ^*_i / Σ_{i=1}^p λ^*_i, we conclude that Σ_{i=1}^p λ_i = 1, which implies that Σ_{i=1}^p λ_i f_i is a (Φ, ρ)−invex function at x̂ by [2]. Thus, the latter inequality implies that

    Σ_{i=1}^p λ_i Φ(x_0, x̂, ξ̂_i, ρ(x_0, x̂)) ≤ Σ_{i=1}^p λ_i f_i(x_0) − Σ_{i=1}^p λ_i f_i(x̂) < 0.

This means Υ(x̂, ξ̂, λ) < 0, which contradicts the assumption. □

References
1. Antczak, T.: Saddle point criteria and Wolfe duality in nonsmooth (Φ, ρ)−invex
vector optimization problems with inequality and equality constraints. Int. J. Com-
put. Math. 92(5), 882–907 (2015)
2. Antczak, T., Stasiak, A.: (Φ, ρ)−invexity in nonsmooth optimization. Numer. Func.
Anal. Optim. 32, 1–25 (2015)
3. Caristi, G., Kanzi, M., Soleimani-damaneh, M.: On gap functions for nonsmooth
multiobjective optimization problems. Optim. Lett. (2017). https://doi.org/10.
1007/s11590-017-1110-4
4. Caristi, G., Ferrara, M., Stefanescu, A.: Semi-infinite multiobjective programming with generalized invexity. Math. Rep. 62, 217–233 (2010)
5. Caristi, G., Ferrara, M., Stefanescu, A.: Mathematical programming with (ρ, Φ)-
invexity. In: Konnor, I.V., Luc, D.T., Rubinov, A.M. (eds.) Generalized Convexity
and Related Topics. Lecture Notes in Economics and Mathematical Systems, vol.
583, pp. 167–176. Springer, Heidelberg (2006)
6. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, Interscience (1983)
7. Ehrgott, M.: Multicriteria Optimization. Springer, Berlin (2005)
8. Goberna, M.A., Kanzi, N.: Optimality conditions in convex multiobjective SIP.
Math. Program. (2017). https://doi.org/10.1007/s10107-016-1081-8
9. Goberna, M., Guerra-Vazquez, F., Todorov, M.I.: Constraint qualifications in linear
vector semi-infinite optimization. Eur. J. Oper. Res. 227, 32–40 (2016)
10. Goberna, M.A., Guerra-Vazquez, F., Todorov, M.I.: Constraint qualifications in
convex vector semi-infinite optimization. Eur. J. Oper. Res. 249, 12–21 (2013)
11. Gopfert, A., Riahi, H., Tammer, C., Zalinescu, C.: Variational Methods in Partial
Ordered Spaces. Springer, New York (2003)
12. Guerraggio, A., Molho, E., Zaffaroni, A.: On the notion of proper efficiency in
vector optimization. J. Optim. Theory Appl. 82, 1–21 (1994)
13. Hearn, D.W.: The gap function of a convex program. Oper. Res. Lett. 1, 67–71
(1982)
14. Hettich, R., Kortanek, O.: Semi-infinite programming: theory, methods, and appli-
cations. SIAM Rev. 35, 380–429 (1993)
15. Hiriart-Urruty, J.B., Lemarechal, C.: Convex Analysis and Minimization Algo-
rithms. I & II. Springer, Heidelberg (1991)
16. Kanzi, N., Shaker Ardekani, J., Caristi, G.: Optimality scalarization and duality
in linear vector semi-infinite programming. Optimization (2018). https://doi.org/
10.1080/02331934.2018.1454921
17. Kanzi, N.: Necessary and sufficient conditions for (weakly) efficient of nondiffer-
entiable multi-objective semi-infinite programming. Iran. J. Sci. Technol. Trans. A
Sci. (2017). https://doi.org/10.1007/s40995-017-156-6
18. Kanzi, N.: Necessary Optimality conditions for nonsmooth semi-infinite program-
ming problems. J. Global Optim. 49, 713–725 (2011)
19. Kanzi, N.: Constraint qualifications in semi-infinite systems and their applications
in nonsmooth semi-infinite problems with mixed constraints. SIAM J. Optim. 24,
559–572 (2014)
20. Kanzi, N.: On strong KKT optimality conditions for multiobjective semi-infinite
programming problems with Lipschitzian data. Optim. Lett. 9, 1121–1129 (2015)
21. Kanzi, N., Nobakhtian, S.: Optimality conditions for nonsmooth semi-infinite mul-
tiobjective programming. Optim. Lett. (2013). https://doi.org/10.1007/s11590-
013-0683-9
22. López, M.A., Vercher, E.: Optimality conditions for nondifferentiable convex semi-
infinite programming. Math. Program. 27, 307–319 (1983)
Data science: Machine Learning, Data
Analysis, Big Data and Computer Vision
A Discretization Algorithm for k-Means
with Capacity Constraints

Yicheng Xu1(B) , Dachuan Xu2 , Dongmei Zhang3 , and Yong Zhang1


1 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
{yc.xu,zhangyong}@siat.ac.cn
2 Beijing Institute for Scientific and Engineering Computing, Beijing University of Technology, Beijing 100124, People's Republic of China
xudc@bjut.edu.cn
3 School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, People's Republic of China
zhangdongmei@sdjzu.edu.cn

Abstract. We consider capacitated k-means clustering whose objective is to minimize the within-cluster sum of squared Euclidean distances. The task is to partition a set of n observations into k disjoint clusters satisfying the capacity constraints; both upper and lower bound capacities are considered. One of the reasons making these clustering problems hard to deal with is the continuous choice of the centroids. In this paper we propose a discretization algorithm that in polynomial time outputs an approximate centroid set with at most an ε fractional loss in the original objective. This result implies an FPT(k,d) PTAS for uniform capacitated k-means and makes more techniques, for example local search, applicable to it.

Keywords: k-means · Capacity constraints · Discretization


algorithm · FPT PTAS

1 Introduction

Clustering is a task that divides data into several clusters such that the data within the same cluster have large similarity and data between clusters have large diversity. However, most clustering problems are not as tractable as one might imagine, for example the k-means problem. It is one of the most classical and fundamental problems in theoretical computer science, combinatorial optimization, machine learning and artificial intelligence. As the most efficient tool for text clustering and sentiment analysis, k-means gets more and more attention from IT companies like Google, Baidu and Microsoft.
In practical clustering tasks, especially large-scale ones, most existing clustering methods suffer from expensive computation and memory costs. Shen et al. [10] propose a compressed k-means clustering algorithm that

compresses high-dimensional data into short binary codes, which are well suited
for fast large-scale clustering. Also, spectral clustering is one of the most impor-
tant approaches under the big data environment. Shen and Cai [3] propose a novel approach called Landmark-based Spectral Clustering that selects only a small amount of representative data points as landmarks representing the original ones. Numerical experiments show that the spectral embedding of the data can be efficiently computed with the landmark-based representation. Further research is in progress.
Theoretically, Matoušek [9] proposes the concept of an ε-approximate centroid set for geometric k-clusterings, inspiring some well-known results. First, the first constant approximation for k-means clustering [7]: based on a local search heuristic that allows swapping centers in and out, Kanungo et al. present a (9+ε)-approximation algorithm. Second, Cohen-Addad et al. [4] and Friggstad et al. [5] independently propose a PTAS (Polynomial Time Approximation Scheme) for a large class of k-clusterings (including k-means) in fixed-dimensional Euclidean space.
Most results for capacitated k-clusterings are based on linear programming techniques. However, due to the limitations of LP-based techniques, these results are mostly pseudo approximations that obey either the capacity constraints or the cardinality constraint. Byrka et al. [2] consider capacitated k-median and propose a bi-factor algorithm with results in two directions: it either violates the capacities by a factor of 2+ε obtaining an O(1/ε)-approximation, or violates the capacities by a factor of 3+ε obtaining a constant approximation. Later Li [8] moves a step towards a constant approximation algorithm for capacitated k-median by proposing an O(1/ε²)-approximation algorithm which violates the cardinality constraint by a factor of (1 + ε). An et al. [1] studied the capacitated k-center problem and propose an LP-based 9-approximation algorithm. Heuristic techniques also work sometimes but not always; see the improved k-means algorithm [6] for an example. No efficient capacitated k-means clusterings are known.
Inspired by recent progress on k-clusterings, we consider capacitated k-means and propose a discretization algorithm that in polynomial time outputs an ε-approximate centroid set of acceptable size. In other words, we put the infinite continuous centroid set onto the intersections of a well-constructed grid with a loss of at most a factor of (1 + ε). Moreover, our result implies an FPT(k,d) (i.e., fixed parameters k and d) PTAS for capacitated k-means. We hope this result is a significant step towards constant approximations.
The remainder of this paper is organized as follows: In the next section we introduce some basic concepts, mainly the ε-approximate centroid set. In the third section we show how the ε-approximate centroid set works and how to deal with capacitated k-means when we are given a centroid set. We build the ε-approximate centroid set for simple instances and then extend it to general instances in the fourth section. As an application, we present an FPT PTAS for capacitated k-means based on the proposed discretization algorithm in the fifth section. Further work is discussed in this section as well.

2 The Centroid Set and ε-Approximate Centroid Set


In this paper, we mainly consider capacitated k-means with the objective of minimizing the within-cluster sum of squared Euclidean distances. Formally speaking, for any given n-point set X ⊆ R^d and integer k ≥ 2, we are looking for a partition Π = (S_1, S_2, · · · , S_k) of X satisfying |S| ≤ U for all S ∈ Π so as to minimize Σ_{S ∈ Π} Σ_{s ∈ S} ||s − c(S)||², where c(S) = Σ_{s ∈ S} s / |S| is called the centroid of cluster S.
Here are some geometric facts about k-means. For a point c ∈ R^d and a finite set S ⊆ R^d, define cost(S, c) = Σ_{s ∈ S} ||s − c||². A simple observation is

    min_{c ∈ R^d} cost(S, c) = cost(S, c(S)).
We simply denote this value by cost(S). When given an observation set X ⊆ R^d and a center set C = {c_1, c_2, · · · , c_k} ⊆ R^d, define

    cost(X, C) = min_{Π := (S_1, S_2, ··· , S_k)} Σ_{i=1}^k cost(S_i, c_i),

where Π := (S_1, S_2, · · · , S_k) is a partition of X. We also denote this value by cost(Π) if it causes no confusion. The second observation is that the optimal partition w.r.t. the above minimization problem is letting S_i be the set of points in X for which c_i is the nearest center in C (ties broken arbitrarily). This is the well-known Voronoi partition. We denote the Voronoi partition of the ground set X according to {c_1, c_2, · · · , c_k} by Π_Vor(c_1, c_2, · · · , c_k). Note this is in the uncapacitated case.
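As a quick illustration of these definitions (our sketch, using numpy; the function names are ours), the following computes c(S), cost(S, c), the Voronoi partition of X with respect to a center set C, and the resulting uncapacitated cost.

```python
import numpy as np

def centroid(S):
    """c(S) = (1/|S|) * sum of the points in S."""
    return np.mean(np.asarray(S, float), axis=0)

def cost(S, c=None):
    """cost(S, c) = sum of squared distances of S to c; c defaults to c(S)."""
    S = np.asarray(S, float)
    c = centroid(S) if c is None else np.asarray(c, float)
    return float(np.sum(np.linalg.norm(S - c, axis=1) ** 2))

def voronoi_partition(X, C):
    """Assign each point of X to its nearest center in C (ties -> lowest index)."""
    X, C = np.asarray(X, float), np.asarray(C, float)
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)   # n x k squared distances
    labels = d2.argmin(axis=1)
    return [X[labels == i] for i in range(len(C))]

def cost_X_C(X, C):
    """cost(X, C) = sum of per-cluster costs of the Voronoi partition."""
    return sum(cost(S, c) for S, c in zip(voronoi_partition(X, C), C) if len(S) > 0)
```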
Now it is clear that the optimal centroid set is the collection of centroids of all clusters. From the definition we know there are exponentially many candidates for every single centroid in k-means. To achieve the purpose of approximating the centroid set with an ε-approximate centroid set of polynomial size (w.r.t. the inputs except the dimension d), we introduce the following definitions.

Definition 1 (Basic radius). For any S ⊆ X ⊆ R^d, we define the basic radius of S as

    ρ(S) = ( cost(S) / |S| )^{1/2} = ( (1/|S|) Σ_{s ∈ S} ||s − c(S)||² )^{1/2}.

Definition 2 (ε-approximate centroid set). Given finite sets X, C ⊆ R^d, we call C an ε-approximate centroid set for X if for any subset S of X, C intersects the ball centered at c(S) of radius ερ(S)/3.
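A direct translation of Definitions 1 and 2 (our illustration): the basic radius ρ(S), and a check of whether a candidate set C contains a point within ερ(S)/3 of c(S) for one given cluster S. Note the definition quantifies over all subsets S of X; the sketch only checks a single S.

```python
import numpy as np

def basic_radius(S):
    """rho(S) = sqrt(cost(S) / |S|)."""
    S = np.asarray(S, float)
    c = S.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((S - c) ** 2, axis=1))))

def hits_ball(C, S, eps):
    """Does C intersect the ball of radius eps*rho(S)/3 centered at c(S)?"""
    C, S = np.asarray(C, float), np.asarray(S, float)
    c = S.mean(axis=0)
    return bool(np.any(np.linalg.norm(C - c, axis=1) <= eps * basic_radius(S) / 3))
```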

Next we will show how and why the ε-approximate centroid set works.

3 After the ε-Approximate Centroid Set


Now think about the k-clustering problem when we are given the centroid set: as discussed in the beginning, the optimal clustering is the well-known Voronoi partition. However, this is not the case for capacitated k-clustering, since the Voronoi partition may violate the capacity constraints. The following lemma proposes a way to find the optimal partition in the capacitated version.
Lemma 1. When given the centroid set, the k-clustering problem with capacity constraints is tractable.

Proof. See Journal version.
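To make the claim of Lemma 1 concrete, a hedged sketch follows (our illustration, not the construction of the journal version): once the k centers are fixed, assigning points under the capacity bounds L ≤ |S_i| ≤ U is a transportation problem, solved here as a linear program whose constraint matrix is totally unimodular, so a basic optimal solution is integral.

```python
import numpy as np
from scipy.optimize import linprog

def capacitated_assignment(X, C, L, U):
    """Assign each point of X to one center of C with L <= cluster size <= U,
    minimizing the total squared distance (transportation LP)."""
    X, C = np.asarray(X, float), np.asarray(C, float)
    n, k = len(X), len(C)
    cost = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2).ravel()   # n*k costs

    # equality constraints: each point is assigned exactly once
    A_eq = np.zeros((n, n * k)); b_eq = np.ones(n)
    for j in range(n):
        A_eq[j, j * k:(j + 1) * k] = 1.0
    # inequality constraints: L <= sum_j z_{j,i} <= U for every center i
    A_ub = np.zeros((2 * k, n * k))
    b_ub = np.concatenate([np.full(k, U), np.full(k, -L)])
    for i in range(k):
        A_ub[i, i::k] = 1.0          #  sum_j z_{j,i} <= U
        A_ub[k + i, i::k] = -1.0     # -sum_j z_{j,i} <= -L
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs-ds")
    labels = res.x.reshape(n, k).argmax(axis=1)   # integral at a basic optimum
    return labels, res.fun
```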

Suppose we are doing k-clustering with uniform size constraints L ≤ |S| ≤ U; then we have the following. Note that trivial inputs of L and U imply uncapacitated k-clustering.

Lemma 2. Given X ⊆ Rd and an ε-approximate centroid set C for X, for any k-clustering Π (of X) with cluster sizes L ≤ |S| ≤ U, there exist c1, c2, · · · , ck ∈ C such that

cost(ΠTP(c1, c2, · · · , ck)) ≤ (1 + ε) cost(Π).

Proof. See Journal version.

4 The Construction of the ε-Approximate Centroid Set

In this main section, we propose a recursive construction algorithm for the ε-approximate centroid set for capacitated k-means. We only carry out the construction for well-separated instances here; for general instances see the journal version. By well-separated instances, we mean instances where the maximum distance in the given observation set is not extremely large with respect to the minimum.

4.1 Well-Separated Instances


To avoid confusion, we redefine some notation below.
– Let X be the observation/ground set operated on throughout this section.
– d := min_{i,j∈X, i≠j} ‖i − j‖ is the minimum distance between any two distinct points of X.
– Let C be the ε-approximate centroid set being constructed.
– r is a parameter that guarantees that any cluster/subset S with at least two distinct points of X has ρ(S) ≥ r. Generally r = d/n is sufficient.
– “Cube” means the infinite set of points that form an axis-parallel cube in Rd.
– A p-enlargement (p ≥ 1, not necessarily an integer) of a cube Q means the concentric cube with side length p times the side length of Q.

Fig. 1. An example of subdivision in 3-dimensional Euclidean space

Definition 3 (η-dense set). We say A′ is an η-dense set for A if for any point of A, there exists a point in A′ within distance η of it.

Given X = {x1, x2, · · · , xn} ⊆ Rd and integer lower and upper bounds L and U for the cluster sizes, if U is smaller than a given constant c1 (experimentally we set c1 = O(log_n((2^d ε^{−d}/L) log(R/r)))), the optimal centroid set can simply be obtained by enumeration. Otherwise let QX be a suitably small cube enclosing X. Let Q0 be the 3-enlargement of QX and R be its side length.
Now we begin the construction with Q0, which is the initial active cube at time 0, with C = ∅. From now on, any cube labeled active will be subdivided into 2 equal halves along each coordinate axis (see Fig. 1 for an example). To avoid conflicts, we force all the cubes to be semi-open, so that in any subdivision the sub-cubes are disjoint and cover Q.
At any time of the construction, let Q be one of the current active cubes and σ be its side length. We choose the intersection points of a grid that is an (εσ/8)-dense set for the 2-enlargement of Q, denoted by CQ, and add it to the current set C. Afterwards the construction goes to a decision step; before this step we call the currently operated cubes undetermined cubes. If σ ≥ 2r, we divide Q into 2^d cubes with side length σ/2. In the next step, some of the 2^d cubes (here we only consider the sub-cubes of Q) will be labeled active if they contain at least L/2^{d+1} points of X. Repeat the above until no active cubes are left. If L is allowed to be trivial (L = 1) or X is a multi-set, we add X to C as a final step.

Theorem 1. In the case that the maximum distance in X is not extremely large w.r.t. the minimum distance, we can in O((n + nε^{−d}) log(R/r)) time construct an ε-approximate centroid set of size min{O(nU), O((n 2^d ε^{−d}/L) log(R/r))} for X with cluster sizes within the interval [L, U].

Proof. See Journal version.

4.2 General Instances

We only present the theorem for general instances; for more details see the journal version of this paper.

Algorithm 1. Framework of ε-approximate centroid set construction for well-separated instances
Input: n-point set of observations X ⊆ Rd, integer k ≥ 2, integer cluster size capacities L, U, small ε > 0, parameters c1, r
Output: ε-approximate centroid set C.
1: Set C = ∅;
2: if U ≤ c1, compute C by enumerating all centroids of subsets of X whose size is not larger than U;
3: otherwise compute QX (a small cube enclosing X) and its 3-enlargement Q0, label Q0 active;
4: for any active cube Q (with side length σ)
5:     C ← C ∪ CQ;
6:     if σ ≥ 2r
7:         subdivide Q into identical cubes Q1, · · · , Q_{2^d};
8:         for i = 1, · · · , 2^d
9:             if |Qi ∩ X| ≥ L/2^{d+1}
10:                label Qi active;
11:            otherwise
12:                break;
13:            end if
14:        end for
15:    end if
16: end for
17: C ← C ∪ X;
18: end if
return C.
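As an illustration only, the Python sketch below mirrors the structure of Algorithm 1 under simplifying assumptions: the dense grid on each active cube is an axis-aligned grid with spacing εσ/8, the semi-open subdivision is handled implicitly, and the "undetermined cube" bookkeeping is omitted. It is not the authors' implementation.

```python
import itertools
import numpy as np

def approx_centroid_set(X, L, U, eps, r, c1):
    """Sketch of Algorithm 1 for well-separated instances (illustrative only)."""
    n, d = X.shape
    if U <= c1:
        # enumerate centroids of all subsets of size at most U
        C = [X[list(idx)].mean(axis=0)
             for s in range(1, U + 1)
             for idx in itertools.combinations(range(n), s)]
        return np.array(C)
    # Q0: 3-enlargement of a small cube enclosing X, represented by (corner, side)
    lo, hi = X.min(axis=0), X.max(axis=0)
    side = (hi - lo).max()
    center = (lo + hi) / 2.0
    C, active = [], [(center - 1.5 * side, 3.0 * side)]
    while active:
        corner, sigma = active.pop()
        # axis-aligned grid over the 2-enlargement of the cube, spacing eps*sigma/8
        # (a simplified stand-in for the paper's dense set C_Q)
        step = eps * sigma / 8.0
        axes = [np.arange(c - sigma / 2.0, c + 1.5 * sigma + step, step)
                for c in corner]
        C.extend(np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, d))
        if sigma >= 2.0 * r:
            # subdivide into 2^d half-size sub-cubes, keep the populated ones active
            for offs in itertools.product([0.0, 0.5], repeat=d):
                sub = corner + np.array(offs) * sigma
                inside = np.all((X >= sub) & (X < sub + sigma / 2.0), axis=1)
                if inside.sum() >= L / 2 ** (d + 1):
                    active.append((sub, sigma / 2.0))
    C.extend(X)          # final step: add X itself
    return np.array(C)
```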

Theorem 2. For an arbitrary n-point set X, we can in O(n log n + nε^{−d} log(1/ε)) time construct an ε-approximate centroid set of size min{O(nU), O((n/L) ε^{−d} log(1/ε))} for X with cluster sizes within the interval [L, U].

5 Naive FPT(k,d) PTAS: A Simple Application


Therefore, for a general capacitated k-means input X, we can compute the ε-approximate centroid set of size min{O(nU), O((n/L) ε^{−d} log(1/ε))} in O(n log n + nε^{−d} log(1/ε)) time. Note that with fixed dimension d, both the size and the construction time are polynomial. In Lemma 2, we prove the existence of an ε-optimal centroid set of size k. When k is not very large or, equivalently speaking, is a fixed constant, we are allowed to search over all subsets of size k within the ε-approximate centroid set. For each subset, what we need to do is to solve a linear program (Lemma 1). The one with the minimum objective value is a (1 + ε)-approximate capacitated k-means clustering of X, implying the naive PTAS.
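A minimal sketch of this enumeration step is given below, under the assumption that the capacity-feasible assignment for a fixed center set is computed by the capacitated_assignment LP sketched after Lemma 1 (a hypothetical helper); with fixed k and d the loop is polynomial in n.

```python
import itertools
import numpy as np

def naive_fpt_ptas(X, C, k, L, U):
    """Enumerate all k-subsets of the eps-approximate centroid set C and keep
    the cheapest capacity-feasible assignment (illustrative sketch; reuses the
    capacitated_assignment helper sketched after Lemma 1)."""
    best_cost, best = np.inf, None
    for centers in itertools.combinations(range(len(C)), k):
        assign, cost = capacitated_assignment(X, C[list(centers)], L, U)
        if cost < best_cost:
            best_cost, best = cost, (centers, assign)
    return best_cost, best
```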
Furthermore, another important application of the proposed discretization algorithm is to make more combinatorial techniques applicable to capacitated k-means. We believe this result also holds for a large class of capacitated clusterings with the objective of minimizing the sum of q-th power Euclidean distances.

We hope this is a significant step towards constant approximations for capacitated k-clusterings without violating either capacity or cardinality constraints.

Acknowledgement. The first author is supported by China Postdoctoral Science


Foundation funded project (No. 2018M643233). The second author is supported by
Natural Science Foundation of China (Nos. 11531014, 11871081). The third author
is supported by Higher Educational Science and Technology Program of Shandong
Province (No. J15LN23). The fourth author is supported by Natural Science Founda-
tion of China (Nos. 61433012, U1435215).

References
1. An, H.C., Bhaskara, A., Chekuri, C., Gupta, S., Madan, V., Svensson, O.: Cen-
trality of trees for capacitated k-center. Math. Program. 154(1–2), 29–53 (2015)
2. Byrka J., Fleszar K., Rybicki B., Spoerhase J.: Bi-factor approximation algorithms
for hard capacitated k-median problems. In: Proceedings of the 26th Annual ACM-
SIAM Symposium on Discrete Algorithms, pp. 722–736. SIAM, San Diego, USA
(2015)
3. Chen X., Cai D.: Large scale spectral clustering with landmark-based representa-
tion. In: Proceedings of the 25th AAAI Conference on Artificial Intelligence, pp.
313–318. AAAI, San Francisco, USA (2011)
4. Cohen-Addad V., Klein P.N., Mathieu C.: Local search yields approximation
schemes for k-means and k-median in Euclidean and minor-free metrics. In: Pro-
ceedings of the 57th IEEE Annual Symposium on Foundations of Computer Sci-
ence, pp. 353–364. IEEE, New Brunswick, USA (2016)
5. Friggstad Z., Rezapour M., Salavatipour M.R.: Local search yields a PTAS for k-
means in doubling metrics. In: Proceedings of the 57th IEEE Annual Symposium
on Foundations of Computer Science, pp. 365–374. IEEE, New Brunswick, USA
(2016)
6. Geetha S., Poonthalir G., Vanathi P.: Improved k-means algorithm for capacitated
clustering problem. In: Proceedings of the 28th IEEE Conference on Computer
Communications, pp. 52–59. IEEE, Rio de Janeiro, Brazil (2009)
7. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu,
A.Y.: A local search approximation algorithm for k-means clustering. Comput.
Geom. 28(2–3), 89–112 (2004)
8. Li, S.: On uniform capacitated k-median beyond the natural LP Relaxation. ACM
Trans. Algorithms 13(2), 1–22 (2017)
9. Matoušek, J.: On approximate geometric k-clustering. Discret. Comput. Geom.
24(1), 61–84 (2000)
10. Shen X., Liu W., Tsang I., Shen F., Sun Q.: Compressed k-means for large-scale
clustering. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence,
pp. 2527–2533. AAAI, San Francisco, USA (2017)
A Gray-Box Approach for Curriculum
Learning

Francesco Foglino1 , Matteo Leonetti1 , Simone Sagratella2(B) ,


and Ruggiero Seccia2
1
School of Computing, University of Leeds, Leeds, UK
{scff,M.Leonetti}@leeds.ac.uk
2
Department of Computer, Control and Management Engineering Antonio Ruberti,
Sapienza, University of Rome, Via Ariosto 25, 00185 Roma, Italy
{sagratella,seccia}@diag.uniroma1.it

Abstract. Curriculum learning is often employed in deep reinforcement


learning to let the agent progress more quickly towards better behaviors.
Numerical methods for curriculum learning in the literature provide only
initial heuristic solutions, with little to no guarantee on their quality.
We define a new gray-box function that, including a suitable scheduling
problem, can be effectively used to reformulate the curriculum learning
problem. We propose different efficient numerical methods to address
this gray-box reformulation. Preliminary numerical results on a bench-
mark task in the curriculum learning literature show the viability of the
proposed approach.

Keywords: Curriculum learning · Reinforcement learning · Black-box optimization · Scheduling problem

1 Introduction

Curriculum learning is gaining popularity in (deep) reinforcement learning, see


e.g. [8] and references therein. It can provide better exploration policies through
transfer and generalization from less complex tasks. Specifically, curriculum
learning is often employed to let the agent progress more quickly towards bet-
ter behaviors, thus having the potential to greatly increase the quality of the
behavior discovered by the agent. However, at the moment, creating an appro-
priate curriculum requires significant human intuition, e.g. curricula are mostly
designed by hand. Moreover, current methods for automatic task sequencing
for curriculum learning in reinforcement learning provide only initial heuristic
solutions, with little to no guarantee on their quality.
After a brief introduction to reinforcement learning, see e.g. [10,11], we define
the curriculum learning problem. This is an optimization problem that cannot
be solved with standard methods for nonlinear programming or with derivative-
free algorithms. We define a new gray-box function that, including a suitable

© Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 720–729, 2020.
https://doi.org/10.1007/978-3-030-21803-4_72

scheduling problem, can be effectively used to reformulate the curriculum learn-


ing problem. This gray-box reformulation can be addressed in different ways.
We investigate both heuristics to estimate approximate solutions of the gray-
box problem, and derivative-free algorithms to optimize it. Finally, preliminary
numerical results on a benchmark task in the curriculum learning literature
show that the proposed gray-box methods can be efficiently used to address the
curriculum learning problem.

2 Reinforcement Learning Background


Consider an agent that acts in an environment m according to a policy π. The
policy π is a function that given a state s ∈ S and a possible action a ∈ A returns
a number in [0, 1] representing the probability that the agent executes the action
a in state s. The environment m is modeled as an episodic Markov Decision
Process (MDP), that is, a tuple (S, A, pm , rm , Tm , γm ), where both S ⊂ Rd and
A ⊂ Rd are nonempty and finite sets, for which d > 0 is the state dimension,
pm : S × A → S is a transition function, rm : S × A → R is a reward function,
Tm ∈ N is the maximum length of an episode, and γm ∈ [0, 1] is a parameter
used to discount the rewards during each episode. At every time step t, the agent
perceives the state st , chooses an action at according to π, and the environment
transitions to state st+1 = pm(st, at). We assume for simplicity that the transition function is pm(st, at) := PS(st + at), where PS denotes the projection operator onto the set S. Every episode starts at state s0 ∈ S, where s0 can depend both on the
environment m and the episode. During an episode, the agent receives a total
reward of
Rm(s0, . . . , s_{Tm−1}, a0, . . . , a_{Tm−1}) := ∑_{t=0}^{Tm−1} (γm)^t rm(st, at).

Note that γm is used to emphasize the rewards that occur early during an
episode. We say that s_{t′} is an absorbing state if s_t = s_{t′} for all t ≥ t′, and rm(s_{t′}, a) = 0 for any action a ∈ A, that is, the state can never be left, and
from that point on the agent receives a reward of 0. Absorbing states effectively
terminate an episode before the maximum number of time steps Tm is reached.
The policy function π is obtained from an estimate qπ of the value function
qπ(s, a) := E[ ∑_{j=t}^{Tm−1} (γm)^{j−t} rm(sj, aj) : st = s, at = a ],

for any state s ∈ S and action a ∈ A. The value function is the expected reward
for taking action a in state s at any possible time step t and following π thereafter
until the end of the episode. We linearly approximate the value function qπ in a
parameter θ ∈ D ⊂ RK :

qπ(s, a; θ) := ∑_{k=1}^{K} θk φk(s, a),

where φk are suitable basis functions mapping the pair (s, a) into R. The policy
function π for any point (s, a) ∈ S × A and any parameter θ ∈ D, is given by

π(s, a; θ) := qπ(s, a; θ) / ∑_{α∈A} qπ(s, α; θ).
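A minimal numpy sketch of the linearly parameterized value function and the induced policy, assuming a finite action set and nonnegative q-values (so the normalization above defines a probability distribution); the feature map phi is a hypothetical placeholder, not part of the paper.

```python
import numpy as np

def q_value(theta, phi, s, a):
    """Linear value estimate q_pi(s, a; theta) = sum_k theta_k * phi_k(s, a)."""
    return float(np.dot(theta, phi(s, a)))

def policy(theta, phi, s, actions):
    """Policy pi(s, a; theta): q-values normalized over the finite action set.
    Assumes nonnegative q-values so that the ratio is a valid distribution."""
    q = np.array([q_value(theta, phi, s, a) for a in actions])
    return q / q.sum()
```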

During the reinforcement learning process the policy π is optimized by vary-


ing the parameter θ over D in order to obtain greater values of the environment
specific total reward Rm . In this respect, we introduce the black-box function
ψm : RK → R, which takes the parameter θ and returns the expected total
reward E[Rm ] obtained with the policy π( · , · ; θ). It is reasonable to assume
that ψm is bounded from above over D. Then, a global optimal policy for the
environment m is given by θ̂ ∈ D satisfying

ψm(θ̂) ≥ ψm(θ),   ∀ θ ∈ D.    (1)

In practical reinforcement learning optimization, at any time step t of a finite


number Nm of episodes the policy parameter θ is updated by using a learning
algorithm that exploits the value of both the reward rm (st , at ) and the current
estimate of the value function qπ, aiming at computing a point θ satisfying (1). Certainly, the better the point θ̄ ∈ D from which the learning procedure starts, the faster the global optimum is achieved. In general, due to the limited number TmNm of iterations granted, we say that the learning algorithm is able to compute a local optimum θ̂ ∈ D satisfying

ψm(θ̂) ≥ ψm(θ),   ∀ θ ∈ D : ‖θ − θ̄‖ < ζm,    (2)

where ζm > 0 is related to TmNm and θ̄ ∈ D is the starting guess.

3 The Curriculum Learning Problem

We want the agent to quickly obtain large values of ψmL in a specific environment
mL that we call the final task. To do this, it is crucial to ensure that the
reinforcement learning phase in the final task mL starts from a good initial
point θL ideally close to a global maximum of ψmL over D. Curriculum learning
is actually a way to obtain a good starting point θL computed by sequentially
learning the policy on a subset of possible tasks (i.e. environments) different
from the final task mL , see e.g. [8] and references therein. The curriculum
c = (m0 , . . . , mL−1 ) is the sequence of these tasks in which the policy of the
agent is optimized before addressing the final task mL . Specifically, given a
starting θ0 ∈ D, the point θ1 is obtained by (approximately) maximizing ψm0
over {θ ∈ D : ‖θ − θ0‖ < ζm0}, the point θ2 is obtained by (approximately) maximizing ψm1 over {θ ∈ D : ‖θ − θ1‖ < ζm1}, and so on. At the end of this
process we get a point θL ready to be used as starting guess for the optimization
of the policy in the final task mL . Clearly, the obtained θL depends on the

specific sequence of tasks in the curriculum c. To underline this dependence, we


write θL (c).
We denote with T the set of n available tasks. The tasks in T must be
included in the curriculum c of length less than L ≤ n in a specific order and
without repetitions. The quality of the curriculum c is given by ψmL (θL+1 (c))
that is obtained by executing learning updates with respect to ψmL for a finite
number NmL of episodes and starting from θL (c). A practical performance metric
of great interest is given by the so called regret function, which takes into
account both the expected total reward that is obtained for the final task at the
end of the learning process, and how fast it is achieved:


Pr(c) := ∑_{i=1}^{NmL} ( g − ψmL(θ_{L+(i/NmL)}(c)) ),

where g is a given good performance threshold (which can be the total reward
obtained with the optimal policy when known), and θL+(i/NmL ) (c) is the point
obtained with the learning algorithm at the end of the ith episode. Given the
curriculum c, the function Pr (c) sums the gaps between the threshold g and the
total reward actually achieved at every episode. Clearly the aim is to minimize it:

minimize_{c∈C} Pr(c),    (3)

where C is the set of all feasible curricula obtained from T .


Problem (3) presents two main drawbacks: (i) having a black-box nature, its
objective function does not have an explicit definition and it is in general nonsmooth,
nonconvex, and even discontinuous; (ii) it is a constrained optimization prob-
lem, whose feasible set is combinatorial. With the aim of solving problem (3),
drawback (i) does not allow us to resort to methods for general Mixed-Integer
NonLinear Programs (MINLP), see e.g. [2], while (ii) makes it difficult to use
standard Derivative-Free (DF) methods, see e.g. [6,7]. See [8] and the references
therein for possible numerical procedures to tackle problem (3). As we show in
Sect. 6, the methods proposed in [8] constitute only a preliminary step in order
to solve efficiently the curriculum learning problem.
In the next section we define a new gray-box reformulation for problem (3)
that incorporates a scheduling problem. Afterwards, we propose different prac-
tical techniques to address this gray-box reformulation.

4 The Scheduling Problem to Minimize Regret


Let us introduce the variables δ ∈ {0, 1}n and γ ∈ {0, 1}n×(n−1) . Any δi indicates
the presence of the ith task of T in the curriculum c, specifically, δi = 1 if and
only if the ith task of T is in the curriculum c. Any γij, with i ≠ j, is an indicator variable used to model the order of the tasks in the curriculum: γij = 1 if and
only if the ith task of T is in the curriculum c and it is scheduled before the jth
task of T . All the tasks not included in the curriculum are considered scheduled
after all the ones included.

Minimizing the regret Pr is equivalent to maximizing the merit function U


given by
U(c) := ∑_{i=1}^{NmL} ψmL(θ_{L+(i/NmL)}(c)).

We make the following assumption:

(A1) Every task mi in c contributes to the value of U with a fixed individual


utility ui ≥ 0. Moreover, considering all pairs (i, j) ∈ {1, . . . , n} × {1, . . . , n}
with i ≠ j, if the ith task of T is in the curriculum c and it is scheduled
before the jth task of T , then there is a penalty in U equal to pij ≥ 0.

This concept of penalty in assumption (A1) is useful to model the fact that a
task mj can be preparatory for another task mi . In this sense, if the policy is
not optimized in the preparatory task mj before it is optimized in task mi , then
the utility given by task mi has to be reduced by the corresponding penalty.
We intend to approximate U with the following function that is linear with
respect to (δ, γ):


Û(δ, γ; u, p) := ∑_{i=1}^{n} ui δi − ∑_{i=1}^{n} ∑_{j=1, j≠i}^{n} pij γij.

If assumption (A1) holds, then certainly Û is a good approximation of U. In general cases, given the utilities u and the penalties p, our idea is to maximize Û by modifying the indicator variables δ and γ corresponding to feasible curricula
by modifying the indicator variables δ and γ corresponding to feasible curricula
in C. We introduce additional variables x ∈ [0, L − 1]n ∩ Zn indicating the order
of the tasks in the curriculum c; if the ith task of T is not in c then xi = L − 1.
We are ready to define the scheduling problem for curriculum learning.

maximize_{x, δ, γ}   Û(δ, γ; u, p)
subject to   xi ≥ (L − 1)(1 − δi),   i = 1, . . . , n
             xi + δj ≤ xj + Lγji,    i = 1, . . . , n, j = 1, . . . , n, i ≠ j        (4)
             γij + γji ≤ 1,          i = 1, . . . , n, j = 1, . . . , n, i ≠ j
             x ∈ [0, L − 1]^n ∩ Z^n,   δ ∈ {0, 1}^n,   γ ∈ {0, 1}^{n×(n−1)}.

Problem (4) is an Integer Linear Program (ILP) that can be solved by resorting
to many algorithms in the literature.
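As an illustration, the ILP (4) can be written almost verbatim in an off-the-shelf modeler. The sketch below uses the open-source PuLP package with its default CBC solver (the paper itself uses docplex); it returns the curriculum encoded by the optimal (x, δ, γ).

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, LpInteger

def solve_scheduling(u, p, L):
    """Solve the scheduling ILP (4) for utilities u (length n) and penalties p (n x n)."""
    n = len(u)
    prob = LpProblem("curriculum_scheduling", LpMaximize)
    x = [LpVariable(f"x_{i}", lowBound=0, upBound=L - 1, cat=LpInteger) for i in range(n)]
    d = [LpVariable(f"d_{i}", cat=LpBinary) for i in range(n)]
    g = {(i, j): LpVariable(f"g_{i}_{j}", cat=LpBinary)
         for i in range(n) for j in range(n) if i != j}
    # objective: sum_i u_i * delta_i - sum_{i != j} p_ij * gamma_ij
    prob += lpSum(u[i] * d[i] for i in range(n)) \
          - lpSum(p[i][j] * g[i, j] for (i, j) in g)
    for i in range(n):
        prob += x[i] >= (L - 1) * (1 - d[i])            # excluded tasks sit in slot L-1
        for j in range(n):
            if i != j:
                prob += x[i] + d[j] <= x[j] + L * g[j, i]    # ordering / precedence link
                if i < j:
                    prob += g[i, j] + g[j, i] <= 1
    prob.solve()
    order = sorted((int(x[i].value()), i) for i in range(n) if d[i].value() > 0.5)
    return [i for _, i in order]        # curriculum as an ordered list of task indices
```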
The following properties hold:

– Let (x̂, δ̂, γ̂) be an optimal point of the scheduling problem (4) with (u, p) ∈ R^n_+ × R^{n×(n−1)}_+. Let ĉ = (m̂0, . . . , m̂L−1) be such that, for all j ∈ {0, . . . , L − 1}, m̂j = m ∈ T with x̂_{ord(m)} = j and δ̂_{ord(m)} = 1, where the operator ord(m) returns the index of the task m in T. Then ĉ ∈ C, i.e. ĉ is a feasible curriculum.
– Let ĉ = (m̂0, . . . , m̂L−1) be any curriculum in C; then parameters (û, p̂) ∈ R^n_+ × R^{n×(n−1)}_+ exist such that solving problem (4) with (u, p) = (û, p̂) gives x̂ such that x̂_{ord(m̂j)} = j and δ̂_{ord(m̂j)} = 1, for all j ∈ {0, . . . , L − 1}. That is, any curriculum in C can be computed by solving problem (4) with suitable parameters (u, p).

We introduce the gray-box function Ψ : Rn×n → R, which takes the parameters


(u, p), computes a curriculum c by solving problem (4) with parameters (u, p),
and returns the regret Pr (c). By using the gray-box function Ψ , problem (3) can
be equivalently reformulated as

minimize_{(u,p) ∈ R^n_+ × R^{n×(n−1)}_+}   Ψ(u, p).    (5)

5 Numerical Methods for the Gray-Box

The gray-box function Ψ can be used in different ways in order to solve the
curriculum learning problem efficiently. Here we consider three of them.

– Problem (5) is a black-box optimization problem whose feasible set includes


only lower bounds. Therefore we can resort to many DF algorithms in order
to compute (approximate) optimal points of (5). A potential solution is rep-
resented by Sequential Model-Based Optimization (SMBO) methods which
consider the information obtained by all the previous iterations to build a sur-
rogate probabilistic model of Ψ (u, p). At each iteration a new point is drawn
by maximizing an acquisition function and the information gained with this
new sample is used to update the surrogate model [9,13,14].
– We can compute a good estimate for (u, p) and then evaluate Ψ (u, p) in order
to have a good value of the regret.
– We can use a good estimate for (u, p) as a reference point to define a trust
region for the feasible set of problem (5). The resulting further constrained
black-box optimization problem can be solved with many DF algorithms such
as a Tree-structured Parzen Estimator (TPE), see e.g. [5], which allows us to
define a distribution of probability of the parameters (u, p) to optimize.

Computing a good estimate for (u, p) can be critical for obtaining good numerical performance. Here we propose a method that is justified by assumption (A1). Indeed, if assumption (A1) holds, then we have for any (i, j) with i ≠ j:

U(mi, mj) = ui + uj − ∑_{k=1, k≠i}^{n} pik − ∑_{k=1, j≠k≠i}^{n} pjk + Ū,

U(mi) = ui − ∑_{k=1, k≠i}^{n} pik + Ū,    U(mj) = uj − ∑_{k=1, k≠j}^{n} pjk + Ū,

where Ū is an unknown constant. That implies

pji = U(mi, mj) − U(mi) − U(mj) + Ū,    (6)

ui = U(mi) + ∑_{k=1, k≠i}^{n} pik − Ū
   = U(mi) + ∑_{k=1, k≠i}^{n} ( U(mk, mi) − U(mk) − U(mi) ) + (n − 2)Ū.    (7)

We observe that computing this estimate requires n² evaluations of U.
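For concreteness, a minimal sketch of formulas (6)–(7) is given below, assuming the single-task evaluations U1[i] = U((m_i,)) and the ordered-pair evaluations U2[i][j] = U((m_i, m_j)) are available in hypothetical containers, and that Ubar stands for the unknown constant Ū (to be chosen afterwards, e.g. so that all penalties are nonnegative).

```python
import numpy as np

def estimate_utilities_and_penalties(U1, U2, n, Ubar=0.0):
    """Estimate (u, p) from formulas (6)-(7) given n single-task and n(n-1)
    pairwise evaluations of the merit function U (illustrative sketch only)."""
    p = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                # p_ji = U(m_i, m_j) - U(m_i) - U(m_j) + Ubar            (6)
                p[j, i] = U2[i][j] - U1[i] - U1[j] + Ubar
    u = np.zeros(n)
    for i in range(n):
        # u_i = U(m_i) + sum_{k != i} (U(m_k, m_i) - U(m_k) - U(m_i)) + (n-2)*Ubar   (7)
        u[i] = U1[i] + sum(U2[k][i] - U1[k] - U1[i]
                           for k in range(n) if k != i) + (n - 2) * Ubar
    return u, p
```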


In the following section we adapt these ideas to a benchmark task in the
curriculum learning literature.

6 Experimental Evaluation
In order to evaluate the effectiveness of the proposed framework, we implemented
it on the GridWorld domain. In this section, we describe the GridWorld’s setting
and all the libraries adopted for the definition of the framework.

6.1 GridWorld
GridWorld is an implementation of an episodic grid-world domain used in the
evaluation of existing curriculum learning methods, see e.g. [15]. Each cell can
be free, or occupied by a fire, pit, or treasure. The aim of the game is to find the
treasure in the least number of possible episodes, avoiding both fires and pits.
An example of GridWorld is shown in Fig. 1.
States S: The state is given by the agent position, that is d = 2.
Actions A and transition function pm : The agent can move in the four
cardinal directions, and the actions are deterministic.
Reward function rm : The reward is −2500 for entering a pit, −500 for entering
a fire, −250 for entering the cell next to a fire, and 200 for entering a cell
with the treasure. The reward is −1 in all other cases.
Episodes length Tm , absorbing states, discount parameter γm : All the
episodes terminate under one of these three conditions: the agent falls into a
pit, reaches the treasure, or executes a maximum number of actions (Tm =
50). We use γm = 0.99.
Basis functions φk : The variables fed to tile coding are the distance from,
and relative position of, the treasure (which is global and fulfills the Markov
property), and distance from, and relative position of, any pit or fire within a
radius of 2 cells from the agent (which are local variables, and allow the agent
to learn how to deal with these objects when they are close, and transfer this
knowledge from a task to another).
We consider tasks of dimensions similar to Fig. 1 and with a variable number of
fires and pits. The number of episodes for all the tasks is the same.

Fig. 1. An example of GridWorld.

6.2 Algorithms and Implementation Details

We analyse different optimization techniques to solve the curriculum learning


problem. In particular, we compare five different methods:

– C0 : where no curriculum learning is performed, i.e. c = ∅, but the agent is


trained directly to solve the final task mL with starting point θL = 0K .
– GREEDY Par: Greedy algorithm which constructs the curriculum incre-
mentally by considering at each iteration the n tasks which most improve the final performance [8]. This is used as a benchmark.
– GP: where problem (5) is modeled through a Gaussian Process and new
points are drawn by maximizing an acquisition function, the Expected
Improvement (EI), with a BFGS method (GPyOpt library). Since it is used
without incorporating any a priori knowledge, it searches for the best values
(u, p) on the box [0, 1000]n × [0, 100]n×(n−1) .
– Heuristic: where a good estimate for (u, p) is computed through formulas (6) and (7), with Ū calculated such that min_{(i,j)} pij ≥ 0 and min_i ui ≥ 10 max_{(i,j)} pij.
– TPE: where the surrogate model of problem (5) is defined by a Tree-
structured Parzen Estimator and new points are drawn by maximizing the
EI (hyperopt library). It is used as a local-search method by defining the
distribution of (u, p) as a Gaussian distribution centered at the values (u, p)
returned by the heuristic and with a variance proportional to the mean of
(u, p) respectively.

The proposed framework is implemented in Python 3.6 on an Intel(R) Core(TM) i7-3630QM CPU @ 2.4 GHz by means of the following libraries:

docplex (v 2.8.125): version of Cplex used for solving the ILP (4). We set the
running time to 60 s per iteration and the mipgap to 10−2 .
GPyOpt (v 1.2.5): used as black-box optimization algorithm for solving prob-
lem (5) when no information on good estimates of (u, p) is available. It is
a Sequential Model Based Optimization (SMBO) algorithm where the sur-
rogate function is defined through a Gaussian Process and the new point is
determined by the maximization of the EI [1,12].

hyperopt (v 0.2): used as black-box optimization algorithm for solving prob-


lem (5) when a good estimate of (u, p) is available. It is an SMBO method
where the surrogate function is defined by a Tree-structured Parzen Estima-
tor and the new point is determined as in the previous case by maximizing
the acquisition function [3–5].
Burlap: used for the implementation of the GridWorld domain along with the
Sarsa(λ) code as learning algorithm to update the policy and Tile Coding
as the function approximator (http://burlap.cs.brown.edu).

6.3 Numerical Results


We consider two different experiments on the GridWorld domain. In the first
example, we define n = 12 different tasks and we impose that at most L = 4 of them can be performed, obtaining 13345 potential curricula. For this example we set Nm = 300. In the second case, n = 7 tasks are defined and all of them can be considered in the same curriculum (L = 7), for a total of 13700 possible combinations of tasks. For this example we set Nm = 400. See [8] for further details about these examples.
Algorithm C0 requires 1 curriculum evaluation, i.e. call of Pr , while Heuristic
needs n2 curriculum evaluations. The number of curriculum evaluations granted
to the other algorithms is 300.
In Table 1 for each algorithm we report:
– the best value of the regret found (Pr )
– the ranking of the returned solution with respect to all the possible curricula
(rank)

Table 1. Results obtained on the GridWorld domain (Pr∗ indicates the regret obtained with the optimal policy).

                       n = 12, L = 4            n = 7, L = 7
Algorithm              Pr         Rank          Pr         Rank
C0                     −0.6389    11499         −0.5051    4535
GREEDY Par             −0.7765    144           −0.6113    260
GP                     −0.7882    32            −0.6511    38
Heuristic              −0.7773    121           −0.5966    417
TPE                    −0.8025    4             −0.6697    14
Pr∗: −0.8149, |C| = 13345                       Pr∗: −0.7224, |C| = 13700

From the numerical results, it is evident how all the proposed optimization meth-
ods based on the gray-box are able to improve the performance value Pr obtained
when training the agent directly on the final task (algorithm C0 ). As a proof of
the effectiveness of the proposed heuristic method from (6) and (7), we highlight

how this procedure is always able to find better solutions than C0 and similar
solutions to those returned by GREEDY Par. Moreover, the definition of a sur-
rogate function through a Gaussian Process seems to be a successful choice in
order to further improve the solution found. Finally, the local search performed
by TPE around the tentative point (u, p) leads to a remarkable improvement
of the final performance by finding, in both scenarios, one of the 15 best solutions out of the more than 13000 possible curricula.

References
1. Gpyopt: a bayesian optimization framework in python. http://github.com/
SheffieldML/GPyOpt (2016)
2. Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Luedtke, J., Mahajan, A.: Mixed-
integer nonlinear optimization. Acta Numer. 22, 1–131 (2013)
3. Bergstra, J.: Hyperopt: distributed asynchronous hyperparameter optimization in
python (2013)
4. Bergstra, J., Yamins, D., Cox, D.D.: Making a science of model search: hyperpa-
rameter optimization in hundreds of dimensions for vision architectures (2013)
5. Bergstra, J.S., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper-parameter
optimization. In: Advances in Neural Information Processing Systems, pp. 2546–
2554 (2011)
6. Custódio, A.L., Scheinberg, K., Nunes Vicente, L.: Methodologies and software
for derivative-free optimization. In: Advances and Trends in Optimization with
Engineering Applications, pp. 495–506 (2017)
7. Di Pillo, G., Liuzzi, G., Lucidi, S., Piccialli, V., Rinaldi, F.: A DIRECT-type app-
roach for derivative-free constrained global optimization. Comput. Optim. Appl.
65(2), 361–397 (2016)
8. Foglino, F., Leonetti, M.: An optimization framework for task sequencing in cur-
riculum learning (2019). arXiv preprint arXiv:1901.11478
9. Frazier, P.I.: A tutorial on bayesian optimization (2018). arXiv preprint
arXiv:1807.02811
10. Leonetti, M., Kormushev, P., Sagratella, S.: Combining local and global direct
derivative-free optimization for reinforcement learning. Cybern. Inf. Technol.
12(3), 53–65 (2012)
11. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G.,
Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level
control through deep reinforcement learning. Nature 518(7540), 529 (2015)
12. Rasmussen, C.E.: Gaussian processes in machine learning. In: Advanced Lectures
on Machine Learning, pp. 63–71. Springer (2004)
13. Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., De Freitas, N.: Taking the
human out of the loop: a review of bayesian optimization. Proc. IEEE 104(1),
148–175 (2016)
14. Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of machine
learning algorithms. In: Advances in Neural Information Processing Systems, pp.
2951–2959 (2012)
15. Svetlik, M., Leonetti, M., Sinapov, J., Shah, R., Walker, N., Stone, P.: Automatic
curriculum graph generation for reinforcement learning agents. In: AAAI, pp. 2590–
2596 (2017)
A Study on Graph-Structured Recurrent
Neural Networks and Sparsification with
Application to Epidemic Forecasting

Zhijian Li1 , Xiyang Luo2 , Bao Wang2 , Andrea L. Bertozzi2 , and Jack Xin1(B)
1
UC Irvine, Irvine, CA, USA
zhijil2@uci.edu, jxin@math.uci.edu
2
UCLA, Los Angeles, CA, USA
xylmath@gmail.com, wangbaonj@gmail.com, bertozzi@math.ucla.edu

Abstract. We study epidemic forecasting on real-world health data


by a graph-structured recurrent neural network (GSRNN). We achieve
state-of-the-art forecasting accuracy on the benchmark CDC dataset.
To improve model efficiency, we sparsify the network weights via a
transformed-ℓ1 penalty without losing prediction accuracy in numerical
experiments.

Keywords: Spatio-temporal data · Spatio-temporal graph ·


Graph structured recurrent neural network · Epidemic forecasting ·
Sparsification

1 Introduction

Epidemic forecasting has been studied for decades [8]. Many statistical and
machine learning methods have been successfully used to detect epidemic out-
breaks [5]. In previous works, epidemic forecasting is mainly considered as a time-
series problem. Time-series methods, such as Auto-Regression (AR), Long Short-
term Memory (LSTM) neural networks and their variants have been applied to
this problem. One of the current directions is to use social media data [9]. In
2008, Google launched Google Flu Trend, a digital service to predict influenza
outbreaks using Google search data. The Google algorithm was discontinued
due to flaws, however Yang et al. [13] designed another algorithm ARGO in
2015 also using Google search pattern data. Google Correlate, a collection of
time-series data of Google search trends, plays a vital role in this refined regres-
sion algorithm. Though ARGO succeeded in accuracy as a time series algorithm,
it lacks spatial structure and requires the additional input of external features
(e.g., social media data). The infectious and spreading nature of the epidemics
suggests that forecasting is also a spatial problem. Here we study a model to
take advantage of the spatial information so that the data from the adjacent
regions can introduce regional spatial features. This way, we minimize external

© Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 730–739, 2020.
https://doi.org/10.1007/978-3-030-21803-4_73

data input and the accompanying computational cost. Structured recurrent neu-
ral network (SRNN) is a model for the spatial-temporal problem first adopted
by Jain et al. [4] for motion forecasting in computer vision. Wang et al. [10–12]
successfully adapted SRNN to forecast real-time crime activities. Motivated by
[4,10,11], we present an SRNN model to forecast epidemic activity levels. We test
our model with data provided by the Center for Disease Control (CDC), which
collects data from approximately 100 public and 300 private laboratories in the
US [1]. The CDC data [1] is a well-established authoritative data set widely used
by researchers, which makes it easy for us to compare our model with previous
work. CDC provides the influenza data by the geography of Health and Human
Services regions (HHS regions). We take the geographic structure of ten HHS
regions as our spatial information. The rest of the paper is organized as fol-
lows. In Sect. 2, we overview RNN. In Sects. 3–5, we present a graph-structured
RNN model, graph description of spatial correlations, and sparsity promoting
penalties. Experimental results and concluding remarks are in Sects. 6 and 7.

2 A Short Review of Recurrent Neural Network


Recurrent Neural Network (RNN) is a neural network designed for sequential data. The idea of RNN comes from unfolding a recursive computation over a chain of states. If we have a chain of states in which each state depends on the previous one, sn = f(sn−1) for some function f, then we can unfold this equation to sn = f(f(. . . f(s0))). Suppose we have sequential data x1, x2, . . . , xn; the idea of RNN is to unfold the recursion xn = f(xn−1, θ) into a computational graph. An unfolded RNN is illustrated in Fig. 1 and given by the recursion:

Fig. 1. An unfolded recurrent neural network.

ht = tanh(b + W ht−1 + U xt),   ŷt = tanh(V ht + c),


where tanh is the activation function, (U, V, W) are the weight matrices, and b, c are bias vectors. Given yt the true signal at time t, the popular loss function for the classification task is the cross-entropy loss, which reads in the binary case L(θ) = −∑_t [ yt · ln(ŷt) + (1 − yt) ln(1 − ŷt) ]. The mean-square-error loss is widely used for regression problems: L(θ) = ∑_t (yt − ŷt)². Then, as in most neural

networks, the RNN is trained by stochastic gradient descent. A major issue of RNNs is the problem of exploding and vanishing gradients. Since

∂L/∂W = ∑_t ∂Lt(y^t, ŷ^t)/∂W,    ∂Lt/∂W = ∑_{k=0}^{t} (∂Lt/∂ŷt)(∂ŷt/∂st)(∂st/∂sk)(∂sk/∂W),    st = tanh(U xt + W ht−1),

then

∂Lt/∂W = ∑_{k=0}^{t} (∂Lt/∂ŷt)(∂ŷt/∂st) ( ∏_{j=k+1}^{t} ∂sj/∂sj−1 ) (∂sk/∂W).

It is well known [7] that

‖ (∂ŷt/∂st) ∏_{i=k}^{t−1} ∂si+1/∂si ‖ ≤ η^{t−k} ‖ ∂ŷt/∂st ‖,

where η < 1 under the assumptions that no bias is used and that the spectral norm of W is less than 1. We see that the gradient vanishes exponentially fast for large t. Hence, the RNN learns less and less as time goes by. LSTM [3] is a special kind of RNN that resolves this problem:

ft = σ(W[ht−1, xt]) + bf,   it = σ(W[ht−1, xt]) + bi,   ot = σ(W[ht−1, xt]) + bo,   C̃t = tanh(W[ht−1, xt]) + bC,

Ct = ft ∗ Ct−1 + it ∗ C̃t,   ht = ot ∗ tanh(Ct).
Since one does not directly apply the same recurrent function to ht at every time step in the gradient flow, there is no intrinsic factor η in ∂Lt/∂W. This way the gradient has much less chance to vanish as time goes by. In our model, we use LSTMs for all RNNs.
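For concreteness, here is a minimal numpy sketch of one LSTM cell step following the update equations above, with the conventional per-gate weights and biases folded inside the nonlinearities; the shapes and the stacked weight layout are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps the concatenated [h_{t-1}, x_t] to the four
    stacked gate pre-activations (f, i, o, c~); b is the stacked bias."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = h_prev.shape[0]
    f, i, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    c_tilde = np.tanh(z[3*H:])
    c_t = f * c_prev + i * c_tilde          # C_t = f_t * C_{t-1} + i_t * C~_t
    h_t = o * np.tanh(c_t)                  # h_t = o_t * tanh(C_t)
    return h_t, c_t
```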

3 Graph-Structured RNN Model


Similar to previous work on structured RNNs, we partition the nodes into different classes, and within each class we join the nodes. We compare the activity levels of nodes by summing up the data of each node. Then, we partition the nodes based on their activity level, from the class with the highest activity level to the lowest one. After some experiments, we find that the SRNN works best when we have two classes (see Fig. 2). We denote the class with relatively high activity level by H, and the other class by L. We classify the nodes based on the following criteria: Let G be a weighted graph with nodes
indexed by Z = {1, . . . , N }, and edge weights wij ≥ 0. Let g : Z → {1, . . . , C}
be the function that assigns each node to its corresponding group, and assume
for simplicity that g is a surjection. Let us define:

|v| = sum of the historical activity levels of node v,

M = max_v |v|,   m = min_v |v|,   g(v) = ⌊ (|v| − m) / (M − m + 10^{−6}) ⌋.

In our model, nodes with label 0 are in the relatively inactive class; nodes with label 1 or higher belong to the other class, the relatively active class.
We define an RNN Ei,j for each connected edge wij ≠ 0. We denote Ei,j as the edge RNN since it models the pairwise interaction between two connected nodes. We enforce weight sharing between two edge RNNs, RNN_{Ei′,j′} and RNN_{Ei,j}, if g(i) = g(i′) and g(j) = g(j′), i.e., if the class assignments of the two node pairs are the same. Similarly, we define an RNN Ni for each node in Z, which we denote as a node RNN, and apply weight sharing if g(i′) = g(i). Even though the RNNs share weights, their state vectors are still different, and thus we denote them with distinct indices.
Let {v_i^t, i ∈ 1 . . . N} be the set of node features at time t. The GSRNN
makes a prediction at node i, time t by first feeding neighboring features to its
respective edge RNN, and then feeding the averaged output along with the node
features to the respective node RNN. Namely,

f_i^t = ∑_j wij RNN_{Ei,j}(v_i^t, αij v_j^t),    ŷ_i^t = RNN_{Ni}(v_i^t, f_i^t).    (1)

Let y_i^t be the true signal at time t. We use the mean square loss function below:

L^t(Θ) = (1/N) ∑_i (ŷ_i^t − y_i^t)².    (2)

We back-propagate through time (BPTT), a standard method for training


RNNs, with the understanding that the weights for edge RNNs and node RNNs
are shared according to the description above.
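As an illustration of how Eq. (1) composes edge and node RNNs, here is a minimal sketch of one forecasting step, with the recurrent cells abstracted as callables. The containers edge_rnn and node_rnn, keyed by class pairs as in the weight-sharing scheme, are hypothetical placeholders (their hidden states are assumed to be kept internally), not the paper's implementation.

```python
def gsrnn_step(v, neighbors, w, alpha, g, edge_rnn, node_rnn):
    """One GSRNN prediction step following Eq. (1).
    v[i]          : feature vector of node i at time t (dict of arrays)
    neighbors[i]  : indices j with w[i, j] != 0
    edge_rnn[(gi, gj)], node_rnn[gi] : callables standing in for the shared
    edge/node recurrent cells."""
    y_hat = {}
    for i in v:
        # edge RNN outputs, weighted by w_ij and summed over the neighbors of i
        f_i = sum(w[i, j] * edge_rnn[(g(i), g(j))](v[i], alpha[i, j] * v[j])
                  for j in neighbors[i])
        # node RNN combines the node's own features with the aggregated edge output
        y_hat[i] = node_rnn[g(i)](v[i], f_i)
    return y_hat
```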

Fig. 2. Red edges are of type H-L, green edges are of type L-L, and blue edges are of type H-H.

In our model with C = 2, we have three types of edges: H-H, L-L, and H-L. H-H is the type of edge between two nodes in class H, L-L is the type of edge between two nodes of class L, and H-L is the type of edge between a node of class H and a node of class L. Each type of edge features will be fed into a different RNN. We normalize our edge weights by the maximum degree. Each edge has weight αij = wij = 1/Me, ∀ i and j, where Me is the maximum degree over the ten nodes. We use a look-back window of two to generate training data for the RNN: the node feature v^t contains the information of node v at t − 1 and t − 2. Then, the
edge features of a node v ∈ H with edge set Ev are

e_{v,H}^t = ( v_1^t/Me, v_2^t/Me, · · · )   for all vi ∈ H such that (v, vi) ∈ Ev,

e_{v,L}^t = ( u_1^t/Me, u_2^t/Me, · · · )   for all ui ∈ L such that (v, ui) ∈ Ev.

We feed e_{v,H}^t and e_{v,L}^t into the corresponding edge RNNs:

f^t = (1/Me) edgeRNN_{H−L}(v^t, e_{v,L}^t),    h_v^t = (1/Me) edgeRNN_{H−H}(v^t, e_{v,H}^t).
Each edge RNN jointly trains all the nodes that have an edge belonging to its type:
arg min_θ L_{H−L}(Θ) = (1/|Nw|) ∑_{w∈Nw} ∑_t (y_w^t − ŷ_w^t)²,

where Nw = {w ∈ H ∪ L : Ew contains an element of type L-H};

arg min_θ L_{H−H}(Θ) = (1/|Nv|) ∑_{v∈Nv} ∑_t (y_v^t − ŷ_v^t)²,

where Nv = {v ∈ H : Ev contains an element of type H-H}.


Finally, we have a node RNN that jointly trains all the nodes in this class:

arg min_θ L_H(Θ) = (1/|H|) ∑_{v∈H} ∑_t (y_v^t − ŷ_v^t)²,   ∀ v ∈ H.

We feed the outputs of the two edge RNNs, together with the node feature of v itself, into nodeRNN_H (Fig. 3):

v^{t+1} = nodeRNN_H(v^t, f^t, h^t).

Fig. 3. Edge features of the same type are jointly trained by one edge RNN. Nodes from the same class are jointly trained by one node RNN.

4 Graph Description of Spatial Correlation


The graph is a flexible representation for irregular geographical shapes which is
especially useful for many spatio-temporal forecasting problems. In this work,
we use a weighted directed graph for space description where each node corre-
sponds to a state. There are multiple ways to infer the connectivity and weights
of this weighted directed graph. In the previous work [10], Wang et al. utilized a
multivariate Hawkes process to infer such a graph for crime and traffic forecast-
ing, where the connectivity and weight indicate the mutual influence between
the source and the sink nodes. Alternatively, one can opt for space closeness and
connect the closest few nodes on the graph, and the weight is proportional to
the historical moving average activity levels of the source node. In this work,
we employ the second strategy, where we regard two nodes as connected if the
corresponding two states are geographically adjacent to each other. The graph
in this work is demonstrated in Fig. 4. We will explore the first strategy in the
future.

5 Sparsity Promoting Penalties


The convex sparsity promoting penalty is the ℓ1 norm. In this study, we also employ a Lipschitz continuous non-convex penalty, the so-called transformed-ℓ1.

Definition 1. The transformed ℓ1 (Tℓ1) penalty function on x = (x1, · · · , xd) ∈ Rd is

Pa(x) := ∑_{i=1}^{d} ρa(xi),    ρa(xi) = (a + 1)|xi| / (a + |xi|),    parameter a ∈ (0, +∞).    (3)

Fig. 4. HHS graph: (a) HHS regions on map; (b) graph structure.

Since lim_{a→0+} ρa(xi) = 1_{xi≠0} and lim_{a→+∞} ρa(xi) = |xi|, ∀i, the Tℓ1 penalty interpolates ℓ1 and ℓ0. For its sparsification in compressed sensing and other applications, see [14] and references therein. To sparsify weights in GSRNN training via ℓ1 and Tℓ1, we add them to the loss function of GSRNN with a multiplicative penalty parameter α > 0, and call the stochastic gradient descent optimizer in TensorFlow. Though a relaxed splitting method [2] can enforce sparsity much faster, we shall leave this as part of future work on the ℓ0 penalty.
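A minimal sketch of the Tℓ1 penalty of Eq. (3) and its use as an additive regularizer is given below; the scale alpha, the default a = 1, and the list of weight arrays are illustrative assumptions, and the sketch is written in plain numpy rather than the TensorFlow training setup used in the paper.

```python
import numpy as np

def transformed_l1(x, a=1.0):
    """T-ell_1 penalty from Eq. (3): sum_i (a + 1)|x_i| / (a + |x_i|)."""
    x = np.abs(np.ravel(x))
    return float(np.sum((a + 1.0) * x / (a + x)))

def regularized_loss(data_loss, weights, alpha=1e-8, a=1.0):
    """Add the T-ell_1 penalty of every weight array to a given data loss."""
    return data_loss + alpha * sum(transformed_l1(w, a) for w in weights)
```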

6 Experimental Results
Among the previous works on influenza forecasting, ARGO [13] is the current
state-of-the-art prediction model for the entire U.S. influenza activity. To com-
pare with previous works conveniently, we use the CDC data from 2013 to 2015 as our test data. The accuracy is measured by RMSE = ( (1/n) ∑_{i=1}^{n} (yi − ŷi)² )^{1/2}.

We use a single layer LSTM with 40 hidden units for edge RNNs, and a
three-layer multilayer LSTM with hidden units [10, 40, 10] for node RNNs. We
use the Adam optimizer to train GSRNN. The RMSE of the forecasting from
2013/1/19 to 2015/8/15, 135 weeks in total, is shown in Table 1. We outperform
LSTM and the autoregressive model of order 3 (AR(3)) in all nodes, and ARGO in 8 nodes; see Fig. 5 for activity plots in each region. It is easy to see that in regions
1, 2, 7 and 8, there are some under-predictions, while GSRNN’s prediction is
almost identical to the ground-truth. The general form of an AR(p) model for
time-series data is

Xt = μ + ∑_{i=1}^{p} φi Xt−i + ε,

where φ = (φ1, ..., φp) is computed through the backshift operator. ARGO [13], as a refined autoregressive model, models the flu activity level as

ŷt = μy + ∑_{j=1}^{52} αj yt−j + ∑_{i=1}^{100} βi Xi,t + εt,    [μy, α, β] := arg min_{μy, α, β} ∑_t (yt − ŷt)²,

with εt being i.i.d. Gaussian noise and Xi,t the log-transformed Google search frequency of term i at time t.
We observe that ARGO has inconsistent performance over the nodes. We believe this is because the external feature of ARGO, the Google search pattern data, does not offer useful information here, since the national search pattern does not necessarily apply to a certain HHS region. Meanwhile, we also have much lower computational cost than ARGO, which takes in the top 100 search terms related to influenza as well as their historical activity levels, with a look-back window length of 52 weeks. In the time it takes ARGO to compute one node, our model finishes all ten nodes.
We sparsify the network through ℓ1 and Tℓ1 (Eq. (3), using a = 1 and penalty parameter α = 10⁻⁸ during training). Post training, we hard-threshold small network weights to 0 at threshold 10⁻³, and find that high sparsity under Tℓ1 regularization is achieved while maintaining the accuracy at the same level; see Tables 2 and 3. Hard thresholding improves the predictions for some nodes but not all of them; however, it reduces the inference latency and is thus beneficial for the overall algorithm.

Table 1. The RMSE between the predicted and ground-truth activity levels by differ-
ent methods over 10 different states.

Node 1 2 3 4 5 6 7 8 9 10
AR(3) 0.242 0.383 0.481 0.415 0.345 0.797 0.401 0.305 0.356 0.317
ARGO 0.281 0.379 0.397 0.335 0.285 0.673 0.449 0.244 0.356 0.310
LSTM 0.271 0.364 0.487 0.349 0.328 0.751 0.421 0.333 0.335 0.310
GSRNN 0.223 0.354 0.374 0.320 0.289 0.664 0.361 0.275 0.284 0.303

Table 2. Percentages of weights < 10⁻³ in absolute value in GSRNN w/ and w/o ℓ1, Tℓ1 penalties.

Penalty             1       2       3       4       5
α = 0               51.2%   47.8%   50.3%   50.6%   49.9%
ℓ1 (α = 5·10⁻⁸)     67.7%   51.8%   57.7%   60.7%   61.2%
Tℓ1 (α = 5·10⁻⁸)    82.3%   58.9%   71.9%   64.2%   71.1%

Fig. 5. The exact and predicted flu activity levels by GSRNN and ARGO (HHS Regions 1, 2, 7, and 8).

Table 3. Node-wise RMSE of GSRNNs via post-training hard thresholding at threshold 10⁻³.

Node                1      2      3      4      5      6      7      8      9      10
α = 0               0.230  0.351  0.390  0.334  0.314  0.676  0.380  0.297  0.287  0.316
ℓ1 (α = 5·10⁻⁸)     0.234  0.351  0.388  0.327  0.306  0.685  0.363  0.290  0.281  0.296
Tℓ1 (α = 5·10⁻⁸)    0.225  0.363  0.379  0.328  0.296  0.690  0.365  0.272  0.311  0.305

7 Concluding Remarks
We studied epidemic forecasting based on a graph-structured RNN model to take
into account geo-spatial information. We also sparsified the model and reduced
70% of the network weights to zero while maintaining the same level of pre-
diction accuracy. In future work, we plan to (1) explore wider neighborhood
interactions and more powerful sparsification methods, (2) study additional fac-
tors such as environmental conditions, population distribution, transportation

networks, sanitary conditions among others, (3) train RNNs with the recently
developed Laplacian smoothing gradient descent method [6].

Acknowledgments. This material is based on research sponsored by the Air Force


Research Laboratory and DARPA under agreement number FA8750-18-2-0066; the
U.S. Department of Energy, Office of Science, DOE-SC0013838; the National Science
Foundation DMS-1554564 (STROBE), DMS-1737770, DMS-1522383, IIS-1632935. The
authors thank Profs. M. Hyman, and J. Lega for helpful discussions.

References
1. CDC data: https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html
2. Dinh, T., Xin, J.: Convergence of a relaxed variable splitting method for
learning sparse neural networks via ℓ1, ℓ0, and transformed-ℓ1 penalties (2018).
ArXiv: 1812.05719
3. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8),
1735–1780 (1997)
4. Jain, A., Zamir, A., Savarese, S., Saxena, A.: Structural-RNN: deep learning on
spatio-temporal graphs. In: Conference on Computer Vision and Pattern Recogni-
tion (CVPR 2016) (2016)
5. Nsoesie, E., Brownstein, J., Ramakrishnan, N., Marathe, M.: A systematic review
of studies on forecasting the dynamics of influenza outbreaks. Influenza Other
Respir. Viruses 8(3), 309–316 (2014)
6. Osher, S., Wang, B., Yin, P., Luo, X., Pham, M., Lin, A.: Laplacian smoothing
gradient descent (2018). ArXiv:1806.06317
7. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neu-
ral networks. In: Proceedings of the 30th International Conference on Machine
Learning (2013)
8. Perra, N., Goncalves, B.: Modeling and predicting human infectious disease. In:
Social Phenomena, pp. 59–83. Springer (2015)
9. Volkova, S., Ayton, E., Porterfield, K., Corley, C.: Forecasting influenza-like illness
dynamics for military populations using neural networks and social media. PLOS
one 12(12), e0188941 (2017)
10. Wang, B., Luo, X., Zhang, F., Yuan, B., Bertozzi, A., Brantingham, P.: Graph-
based deep modeling and real time forecasting of sparse spatio-temporal data
(2018). arXiv:1804.00684
11. Wang, B., Yin, P., Bertozzi, A., Brantingham, P., Osher, S., Xin, J.: Deep learning
for real-time crime forecasting and its ternarization (2017). arXiv:1711.08833
12. Wang, B., Zhang, D., Zhang, D., Brantingham, P., Bertozzi, A.: Deep learning for
real-time crime forecasting (2017). arXiv:1707.03340
13. Yang, S., Santillana, M., Kou, S.: Accurate estimation of influenza epidemics using
Google search data via ARGO. Proc. Natl. Acad. Sci. 112(47) (2015)
14. Zhang, S., Xin, J.: Minimization of transformed ℓ1 penalty: theory, difference of
convex function algorithm, and robust application in compressed sensing. Math.
Program. Ser. B 169(1), 307–336 (2018)
Automatic Identification of Intracranial
Hemorrhage on CT/MRI Image Using
Meta-Architectures Improved from
Region-Based CNN

Thi-Hoang-Yen Le1(B) , Anh-Cang Phan1 , Hung-Phi Cao1 ,


and Thuong-Cang Phan2
1
Vinh Long University of Technology Education, Vinh Long, Vietnam
{yenlth,cangpa,caohungphi}@vlute.edu.vn
2
Can Tho University, Can Tho, Vietnam
ptcang@cit.ctu.edu.vn

Abstract. Machine learning algorithms are suggested for detecting and


classifying hemorrhage regions on head CT/MRI images with high accu-
racy. However, most of these algorithms do not exploit the valuable characteristics of CT/MRI images, especially Hounsfield Unit values. Besides, they detect and classify only one type of intracranial hemorrhage on each image. In this paper, we propose a new approach for
brain hemorrhage identification using object detection algorithms like
Faster R-CNN and R-FCN. The proposed approach can detect many
regions of the brain hemorrhage on a CT image. The results show that
the R-FCN algorithm gives better results than the Faster R-CNN algo-
rithm on time and accuracy of identification.

Keywords: Intracranial hemorrhage · Region-based CNN ·


Meta-architectures · Hounsfield unit

1 Introduction
According to WHO statistics, stroke remains the second leading cause of global
human deaths in the last 15 years [1]. Hemorrhagic stroke is known as acute stroke due to its abrupt symptom onset and rapid deterioration. Hypertensive damage leads to the rupture of cerebral arteries. Blood leaks directly into the parenchyma (intracerebral hemorrhage) or the subarachnoid space (subarachnoid hemorrhage). Traumatic brain injury is secondary to accidents with blows to the head or shaking, especially traffic accidents in which victims hit their head. Cranial trauma can lead to the main intracranial hemorrhage (ICH)
types: epidural hematoma, subdural hematoma, intracerebral hemorrhage and
subarachnoid hemorrhage [2–4].
CT and MRI are two popular radiological methods used for the detection and diagnosis of brain bleeding at hospitals [2]. However, both CT and MRI depend
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 740–750, 2020.
https://doi.org/10.1007/978-3-030-21803-4_74

on the expertise of radiologists and neurosurgeons. They usually use the naked eye to pull out the important symptoms of disease from images [5], which is a matter of concern in the diagnosis and treatment of brain hemorrhage.
With the development of image processing techniques and machine learning algorithms, a lot of research has been performed to determine, identify and segment bleeding zones in the brain. However, most typical computer-aided diagnosis (CAD) studies utilize traditional computer vision techniques [6]. In those systems, CT/MRI images are passed through many processing stages such as enhancement, type transformation, segmentation and feature extraction. Therefore, the important characteristics of CT/MRI images can be lost during the processing. For instance, Mahmoud et al. [7] propose an approach to detect and classify brain hemorrhage on CT images automatically with two main parts: image processing and classification. Using Otsu's method in the segmentation stage for detecting brain hemorrhage is the highlight of this study.
On the other hand, deep learning in general and convolutional neural networks (CNNs) in particular present promising results in a broad range of state-of-the-art computer vision tasks such as object detection and image classification [6,8,9]. Correspondingly, medical imaging groups have also turned their research orientation from analysis based on traditional techniques to using deep learning methodology for image analysis in a variety of tasks. According to the review of Bernal et al. [9], diagnosis of brain diseases has also seen many CNN-based proposals working on different tasks such as brain tumor detection and classification of brain hemorrhages. Rezaei et al. [10] propose a CNN architecture with seven convolutional layers and three fully-connected layers for brain abnormality classification on MR images. In the approach of Arbabshirani et al. [6], ICH presence on CT studies is identified with a fully 3-dimensional deep learning architecture consisting of five convolutional and two fully-connected layers (aside from max pooling and local normalization layers). Generally, most studies are not implemented directly on DICOM files. Research groups are only interested in detecting and classifying brain abnormalities. They ignore important characteristics of the anomalies which affect diagnosis and patient monitoring. In medical image analysis, radiologists usually use Hounsfield Units (HU) for determining abnormalities on CT/MRI images. In addition, the implementation of a new CNN architecture takes a lot of time and costly hardware infrastructure. Moreover, to learn effectively, CNNs require a training set big enough with a variety of classification cases. DICOM datasets of brain hemorrhage, nevertheless, are extremely expensive and scarce because of their private information.
Based on these observations, we propose a combination of HU with deep learning for ICH identification. HU is used to detect the hemorrhage regions. Two types of region-based meta-architectures, Faster R-CNN [11] and R-FCN [12], are evaluated for the accuracy of ICH classification. The rest of this paper is structured as follows. Section 2 presents the background of our approach. In the following sections, we illustrate the proposed method and the experiments. Conclusions and perspectives for future work are drawn in the final section.
2 Background
2.1 Meta-Architectures Improved from the Strategy of R-CNN

The combination of region proposals with CNNs (R-CNN) has driven the advances of CNN-based object detection approaches [11–13]. The steps of the original R-CNN are fairly intuitive, with two stages: proposing regions and classifying the region proposals with features extracted from them. However, its performance is very slow because it does not share convolutional computations among regions [11,12]. After the introduction of R-CNN, many approaches were suggested to improve it. By using a region proposal network (RPN) instead of Selective Search for proposing regions, Faster R-CNN and R-FCN stand out among these approaches.
Faster R-CNN is an improvement of Fast R-CNN, which is the immediate descendant of R-CNN. Fast R-CNN runs only one CNN to extract features over the entire image before generating region proposals, i.e. regions are proposed based on the last feature map of the CNN, not from the input image. The SVM classifiers of the original are also replaced with a softmax layer. In other words, instead of creating a new model, Fast R-CNN extends the neural network for predictions [14]. The remaining problem of Fast R-CNN is Selective Search, used for proposing regions. Selective Search is one of the most common region proposal methods, based on greedy merging, but its implementation is slow compared to efficient detection networks. Faster R-CNN solves the bottleneck of Fast R-CNN by using an RPN, which is a deep convolutional network, instead of Selective Search for proposing regions. It can be said that Faster R-CNN is the composition of an RPN and Fast R-CNN in a single, unified network for object detection, in which Fast R-CNN is used to detect the class of the region proposals produced by the RPN module. Moreover, Ren et al. perform 4-step alternating training to share computation between the RPN and Fast R-CNN. They start by training the RPN. In the following step, they train a separate detection network (Fast R-CNN) with the regions proposed by the first-step RPN. Both the RPN and the detection network are initialized with an ImageNet-pre-trained model, and the RPN is fine-tuned end-to-end for the region proposal task. After the second step, the two networks do not yet share convolutional layers. Next, the RPN and the detection network are trained in turn, keeping the shared convolutional layers fixed and only fine-tuning their unique layers [11].
With a comparable idea, improving the speed of R-CNN by sharing computation across region proposals and between networks, R-FCN increases speed by maximizing the shared computation. The design of R-FCN adopts the popular two-stage strategy of CNN-based object detection: generating region proposals (regions of interest, RoIs) with the fully convolutional architecture of the RPN and classifying the candidate regions. All learnable weight layers of R-FCN are convolutional and are computed on the entire image. In addition, Dai et al. are interested in the compromise between translation invariance for image-level classification and translation variance for object detection when convolutional computations are shared across the whole network. Inspired by the development of FCNs for instance-level semantic segmentation, they introduce the concept of position-sensitive score maps, which are convolutional feature maps trained to recognize certain parts of each object. k² score maps represent the relative positions (e.g. k = 3 corresponds to 9 relative positions: top-left, top-middle, top-right, etc.) of one object class. Moreover, a position-sensitive RoI pooling layer is introduced to shepherd the learning based on the score maps for object detection. Like Faster R-CNN, the RPN and R-FCN share features according to the 4-step alternating training [12].

2.2 Transfer Learning

Efficiently training a new CNN architecture requires a dataset of sufficient size, yet it is rarely possible to collect an acceptable amount of data in specialized problems such as ICH diagnosis. Therefore, it is more common to use a CNN pretrained on a very large dataset (e.g. ImageNet, MS COCO, etc.), either as an initialization or with appropriate adjustments for the task of interest. In machine learning, this technique is called Transfer Learning, and it can improve both the progress and the performance of learning [15,16]. There are three major scenarios of Transfer Learning: (1) after training a CNN on a reliable dataset, the last fully-connected layer (i.e. the layer that computes the class scores) is replaced and retrained to suit the new dataset, while the rest of the CNN is used as a fixed feature extractor; (2) aside from replacing and retraining the classifier on top of the network, fine-tuning can be performed on the weights of all layers or of some higher-level portion of the pretrained network; (3) pretrained models released by other people, with their published checkpoints, are adjusted for the task of interest [15]. Considering the size of the new dataset (small or large) and its similarity to the original dataset is necessary to choose an appropriate type of Transfer Learning [15,16].

2.3 Hounsfield Unit (HU) of Brain Hemorrhage

DICOM (Digital Imaging and Communications in Medicine) is the international standard for medical imagery. DICOM integrates the CT/MRI scanning result with information related to the patient, the image-acquisition device, etc. Its implementation is compatible with almost every radiology, cardiology imaging, and radiotherapy device (e.g. X-ray, CT and MRI). It therefore meets the demands of storing, transmitting, retrieving and processing medical imaging data [17].
In CT/MRI scanning, each slice is divided into a matrix of squares (voxels), with 256×256 or 512×512 being popular sizes. The volume of these voxels depends on the slice thickness. The degree of radiation absorbed by the tissues within each voxel is called the CT number or Hounsfield Unit (HU) [2,3]. It is a particularly meaningful value on CT/MRI images. Types of tissue, water and air show different HU values, as illustrated in Table 1. In the head, aside from bone, which is the densest structure and appears white on CT/MRI, blood is hyperdense. However, with time, blood becomes isodense and then hypodense compared to brain parenchyma because of clot resorption [2–4]. CT numbers, therefore, can help segment the hemorrhagic zone and determine the hemorrhage time interval.
Table 1. The absorbed radiation degrees of matter in the brain, in Hounsfield Units [18, 19]

Matter Density (HU)


Water 0
Bone 1,000
Air −1,000
Gray matter 35–40
White matter 20
Hematoma 40–90

Each HU value is assigned a gray-level value on the display monitor and presented as a pixel of the image. The window technique assigns a narrow interval of HU values to the entire gray scale in order to enhance the contrast. Two important parameters affect the displayed result: the window width is the range of CT numbers displayed on the whole gray scale, and their average value is called the window level. Adjusting the former alters the contrast, while changing the latter selects the structure of interest to display on the monitor [2,3].
In a DICOM file, the HU value of each voxel can be computed as follows:

HU value = pixel value ∗ RescaleSlope + RescaleIntercept    (1)

where pixel value (actually the voxel value) is the absorbed radiation degree of the voxel in the image data of the DICOM file, and RescaleSlope and RescaleIntercept are values stored in the corresponding DICOM tags that specify the linear transformation of the data [19,20].
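As an illustration, a minimal sketch of this computation (assuming the pydicom and NumPy libraries, which the paper does not name) could read:

```python
import numpy as np
import pydicom

def dicom_to_hu(path):
    """Read one DICOM slice and convert its pixel data to Hounsfield Units (Eq. 1)."""
    ds = pydicom.dcmread(path)
    pixels = ds.pixel_array.astype(np.float32)
    slope = float(getattr(ds, "RescaleSlope", 1.0))          # DICOM tag (0028,1053)
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))  # DICOM tag (0028,1052)
    return pixels * slope + intercept
```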

3 Proposed Method
3.1 Converting DICOM to PNG Based on Window Technique
To identify ICH zones, the structures of interest, it is necessary to pick them out from brain CT/MRI images. In practice, the Hounsfield Unit is an important quantity that helps specialists detect ICH; as a result, we apply it in our experiments. Besides, in our approach it is necessary to label the data used to retrain the network models. However, the range of CT numbers (exceeding 2000 values) recorded by modern CT/MRI scanners exceeds the usual computer display range, a gray scale from 0 to 255 [2,20]. Therefore, after computing the HU values of the pixels, we convert DICOM to PNG with windowing. According to this technique, HU values above (window level + window width/2) are assigned white and those below (window level − window width/2) are rendered black.
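The windowing step itself can be sketched as follows (a hedged illustration; the Pillow dependency, file names and function names are our assumptions, and the example window values refer to Sect. 4.1):

```python
import numpy as np
from PIL import Image

def window_to_png(hu_slice, window_level, window_width, out_path):
    """Map HU values to an 8-bit gray scale using a window and save as PNG."""
    low = window_level - window_width / 2.0    # everything below is rendered black
    high = window_level + window_width / 2.0   # everything above is rendered white
    scaled = (np.clip(hu_slice, low, high) - low) / (high - low) * 255.0
    Image.fromarray(scaled.astype(np.uint8)).save(out_path)

# e.g. the hemorrhage window chosen in Sect. 4.1 (width 56, level 63):
# window_to_png(dicom_to_hu("slice001.dcm"), 63, 56, "slice001.png")
```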

Fig. 1. Converting DICOM to PNG

3.2 Retraining Faster R-CNN and R-FCN on ICH Dataset

The TensorFlow Object Detection API is an open source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models. To evaluate the speed/accuracy trade-off of object detection systems, Huang et al. [22], the authors of the framework, conducted experiments on three meta-architectures (Faster R-CNN, R-FCN and SSD) in TensorFlow with 6 feature extractors (e.g. VGG16, ResNet-101), different default image resolutions and hyperparameter tuning. Their networks were trained end-to-end using asynchronous gradient updates on a distributed cluster on the Microsoft COCO dataset, a large-scale object detection, segmentation, and captioning dataset. In the Faster R-CNN and R-FCN networks of Huang et al., TensorFlow's "crop and resize" operation replaces the RoI pooling layer and the position-sensitive RoI pooling layer used in the original models.

Fig. 2. Implementation process

After converting images from DICOM to PNG, the ICH regions used for retraining the network models are labeled into the 4 main ICH types. In the following steps, training and evaluation data are generated in the form of TFRecord files. The training result of each model is a frozen inference graph (.pb file) containing the respective ICH detection classifier. Figure 2 presents our implementation process to retrain and test the models. After testing, the better classifier will be integrated into a CAD system supporting ICH diagnosis.
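For illustration, the four ICH classes can be declared in a label map of the kind read by the TensorFlow Object Detection API; the sketch below (class-name strings and file name are our assumptions) writes such a file from Python:

```python
ICH_CLASSES = ["subdural_hematoma", "epidural_hematoma",
               "subarachnoid_hemorrhage", "intracerebral_hemorrhage"]

def write_label_map(path="ich_label_map.pbtxt"):
    """Write a label map in the pbtxt format expected by the TF Object Detection API."""
    with open(path, "w") as f:
        for class_id, name in enumerate(ICH_CLASSES, start=1):
            f.write("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))

write_label_map()
```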

4 Experiments
We implement experiments on detecting and classifying 4 types of ICH with Faster R-CNN and R-FCN using the ResNet-101 architecture. Our dataset was
collected from Can Tho University Hospital and consists of 250 axial head CT slices showing manifestations of brain hemorrhage (365 regions).

4.1 Data Preprocessing

Converting DICOM to PNG


We experimented with many values of the window width and window level for segmenting ICH regions on CT slices. The values 56 and 63, respectively, were chosen because they give the best-resolution brain images with the hemorrhage, the region of interest, clearly visible. They also give upper and lower thresholds within the HU range of hematoma illustrated in Table 1.
Preparing Data for Retraining Networks to Detect and Classify ICH
Because each slice can have more than one hemorrhage region of the same or different ICH types, we randomly selected 200 slices with 288 ICH regions from the dataset, as in Table 2. In the next step, with the support of radiologists, we specified the ground-truth bounding box of each hemorrhagic zone and assigned labels to the boxes with LabelImg [21]. Annotations are written as XML files in the PASCAL VOC format used by ImageNet [21].

Table 2. The number of hemorrhage regions used for retraining the network models

Type of ICH  Train Set  Eval Set
Subdural hematoma 78 14
Epidural hematoma 56 5
Subarachnoid hemorrhage 50 6
Intracerebral hemorrhage 68 11
Total 252 36

Table 3 presents the rest of the dataset (50 images with 77 hemorrhage regions), which is used for evaluating the accuracy of the trained network models. Aside from converting the image data to PNG, some important information of the medical record stored in DICOM tags is saved and integrated with the classification result to support ICH diagnosis. Based on the HU values, the hemorrhage time is also recorded.

4.2 Retraining COCO-Trained Faster R-CNN and R-FCN Models

In our experiment, we retrain two COCO-trained models, rfcn_resnet101_coco and faster_rcnn_resnet101_coco of Huang et al., on our ICH dataset. The reason is that they use ResNet-101 for feature extraction, which is fairly equivalent to the
Table 3. The number of hemorrhage regions used for testing

Type of ICH  Number of hemorrhage regions
Subdural hematoma 28
Epidural hematoma 13
Subarachnoid hemorrhage 12
Intracerebral hemorrhage 24
Total 77

original architectures in [11,12]. In addition, Google Cloud Engine is used to supply the hardware platform for our experiments. The machine is configured as follows: 4 vCPUs Intel Xeon Scalable Processor (Skylake), 1 Nvidia Tesla P100 16 GB GPU card, 16 GB RAM, and 100 GB SSD.

Table 4. The result of classifying ICH based on Faster R-CNN

Type of ICH TP FN FP TN Precision Recall F1-Score


Subdural hematoma 21 7 5 49 0.808 0.75 0.778
Epidural hematoma 13 0 0 64 1.0 1.0 1.0
Subarachnoid hemorrhage 10 2 4 61 0.714 0.833 0.769
Intracerebral hemorrhage 19 5 2 51 0.905 0.792 0.845

Table 5. The result of classifying ICH based on R-FCN

Type of ICH TP FN FP TN Precision Recall F1-Score


Subdural hematoma 18 10 7 45 0.72 0.643 0.679
Epidural hematoma 13 0 0 64 1.0 1.0 1.0
Subarachnoid hemorrhage 9 3 1 65 0.9 0.75 0.812
Intracerebral hemorrhage 22 2 0 53 1.0 0.917 0.957

After generating the respective label CSV files for the Train Set and Eval Set, we create the TFRecord files used for training the networks. After TensorFlow initializes the training, each step reports the loss. This value starts at 2.6 and rapidly decreases throughout the training. The training stops when the Loss (L) and the DetectionBoxes Precision mAP@.50IOU (P) reach saturation, where P, calculated with the COCO detection metrics, is the mean average precision of the detection boxes at 50% IoU. It takes about 30000 steps (2 h 9 min) for Faster R-CNN (L ≤ 0.06 and P ≈ 90.56%) and 77000 steps (5 h 30 min) for R-FCN (L ≤ 0.03 and P ≈ 86.46%).
We test the classifiers on the data presented in Table 3. It takes an average of 0.19 s and 0.13 s for Faster R-CNN and R-FCN, respectively, to identify the ICH regions on one image. Figure 3 shows some identification results for different ICH types obtained with R-FCN.

Fig. 3. Results of ICH identification

Owing to the imbalance of the assessment data, micro-average and macro-average measures, computed from precision and recall, are used for evaluating the accuracy. Our experiment shows that the accuracy of R-FCN is better than that of Faster R-CNN and is quite high for classifying ICH regions, as shown in Tables 4, 5 and 6.

Table 6. The evaluation results with micro-average and macro-average

Evaluation Faster R-CNN R-FCN


Precision Recall F1-Score Precision Recall F1-Score
Micro-average 0.851 0.818 0.834 0.886 0.805 0.844
Macro-average 0.857 0.844 0.85 0.905 0.826 0.864
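For clarity, the sketch below (assuming the standard definitions of micro- and macro-averaging over per-class counts) reproduces the Faster R-CNN row of Table 6 from the counts in Table 4:

```python
def micro_macro(tp, fp, fn):
    """Micro- and macro-averaged precision, recall and F1 from per-class counts."""
    prec = [t / (t + f) for t, f in zip(tp, fp)]
    rec = [t / (t + f) for t, f in zip(tp, fn)]
    f1 = [2 * p * r / (p + r) for p, r in zip(prec, rec)]
    mp, mr = sum(tp) / (sum(tp) + sum(fp)), sum(tp) / (sum(tp) + sum(fn))
    micro = (mp, mr, 2 * mp * mr / (mp + mr))
    macro = (sum(prec) / len(prec), sum(rec) / len(rec), sum(f1) / len(f1))
    return micro, macro

# Faster R-CNN counts from Table 4 (SDH, EDH, SAH, ICH):
micro, macro = micro_macro(tp=[21, 13, 10, 19], fp=[5, 0, 4, 2], fn=[7, 0, 2, 5])
# micro ≈ (0.851, 0.818, 0.834), macro ≈ (0.857, 0.844, 0.85), matching Table 6.
```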

5 Conclusion

Our research aims to detect brain hemorrhage regions on CT/MRI images and classify them into the four main ICH types with the meta-architectures Faster R-CNN and R-FCN, which enhance the R-CNN method. Although training takes more time for R-FCN, it gives better results in both time and accuracy of ICH identification. HU is an important quantity to support the diagnosis of ICH: in practice, radiologists usually use HU to detect ICH regions and determine the time of bleeding. Therefore, our further research will integrate HU into CNN-based ICH classifiers in general, and R-CNN-based ones in particular, to support the diagnosis and treatment of ICH.
References
1. WHO: The top 10 causes of death. http://www.who.int/en/news-room/fact-
sheets/detail/the-top-10-causes-of-death. Last accessed 19 Nov 2018
2. Holmes, E.J., Misra, R.R.: Interpretation of Emergency Head CT: A Practical
Handbook, 2nd edn. Cambridge University Press, United Kingdom (2017)
3. Pham, N.H., Le, V.P.: CT in Head Injuries, 1st edn. Medical Publishing House,
Vietnam (2011)
4. Ly, N.L., Dong, V.H.: Traumatic Brain Injuries, 1st edn. Medical Publishing House,
Vietnam (2013)
5. Fatima, Sridevi, M., Saba, N., Kauser, A.: Diagnosis and classification of brain
hemorrhage using CAD system. In: Proceeding of NCRIET-2015 and Indian J.
Sci. Res. 12(1), 121–125 (2015) (Indian)
6. Arbabshirani, M.R., Fornwalt B.K., Mongelluzzo, G.J., Suever, J.D., Geise, B.D.,
Patel, A.A., Moore, G.J.: Advanced machine learning in action: identification of
intracranial hemorrhage on computed tomography scans of the head with clinical
workflow integration. NPJ Digit. Med. 1(9) (2018)
7. Mahmoud, A-A., Duaa, A., Khaldun Al-D., Inad, A.: Automatic detection and clas-
sification of brain hemorrhages. WSEAS Trans. Comput. 10(12), 395–405 (2013)
8. Greenspan, H., van Ginneken, B., Summers, R.M.: Guest editorial deep learning in
medical imaging: overview and future promise of an exciting new technique. IEEE
Trans. Med. Imaging 35(5), 1153–1159 (2016)
9. Bernal, J., Kushibar, K., Asfaw, D.S., Valverde, S., Oliver, A., Martı́, R.: Deep con-
volutional neural networks for brain image analysis on magnetic resonance imaging:
a review. Artif. Intell. Med. (2018)
10. Rezaei, M., Yang, H., Meinel, C.: Brain abnormality detection by deep convolu-
tional neural network (2016). arXiv preprint arXiv:1708.05206v1
11. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object
detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell.
39(6) (2015)
12. Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully
convolutional networks. In: NIPS, pp. 379–387. Curran Associates Inc (2016)
13. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accu-
rate object detection and semantic segmentation. In: 2014 IEEE Conference on
Computer Vision and Pattern Recognition (2014)
14. Girshick, R.: Fast R-CNN. In: 2015 IEEE International Conference on Computer
Vision. IEEE. Santiago, Chile (2015). https://doi.org/10.1109/ICCV.2015.169
15. Johnson, J., Karpathy, A.: The notes accompany the Stanford CS class CS231n:
convolutional neural networks for visual recognition (transfer learning). http://
cs231n.github.io/transfer-learning/. Last accessed 28 Dec 2018
16. Brownlee, J.: A gentle introduction to transfer learning for deep learning. https://
machinelearningmastery.com/transfer-learning-for-deep-learning/. Last accessed
28 Dec 2018
17. NEMA’s DICOM Homepage. http://www.dicomstandard.org/. Last accessed 31
Dec 2018
18. Hounsfield units-scale of HU, CT numbers. http://radclass.mudr.org/content/
hounsfield-units-scale-hu-ct-numbers. Last accessed 31 Dec 2018
19. Phan, A.-C., Phan, T.-C, Vo, V.-Q., Le, T.-H.-Y.: Automatic detection and clas-
sification of brain hemorrhage on CT/MRI images. In: 2017 National Conference.
Science and Technics Publishing House, Quy Nhon, Vietnam (2017)
20. Converting CT data to Hounsfield Unit. https://www.idlcoyote.com/fileio tips/hounsfield.html. Last accessed 31 Dec 2018
21. LabelImg. https://github.com/tzutalin/labelImg. Last accessed 01 Jan 2019
22. Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I.,
Wojna, Z., Song, Y., Guadarrama, S., Murphy, K.: Speed/accuracy trade-offs for
modern convolutional object detectors. In: CVPR 2017
Bayesian Optimization for Recommender System

Bruno Giovanni Galuzzi1(B), Ilaria Giordani1, A. Candelieri1, Riccardo Perego1, and Francesco Archetti1,2

1 University of Milano-Bicocca, 20125 Milan, MI, Italy
bruno.galuzzi@unimib.it
2 Consorzio Milano-Ricerche, 20126 Milan, MI, Italy

Abstract. Many web services have a Recommender System to help the users in their choices, such as movies to watch or products to buy. The aim is to make accurate predictions on the user preferences depending on his/her past choices. Matrix factorization is one of the most widely adopted methods to build a Recommender System. Like many Machine Learning algorithms, matrix factorization has a set of hyper-parameters to tune, leading to a complex, expensive, black-box optimization problem. The objective function maps any possible hyper-parameter configuration to a numeric score quantifying the quality of predictions. In this work, we show how Bayesian Optimization can efficiently optimize three hyper-parameters of a Recommender System: the number of latent factors, the regularization term and the learning rate. A widely adopted acquisition function, namely Expected Improvement, is compared with a variant of Thompson Sampling. Numerical results for both a benchmark 2-dimensional test function and a Recommender System evaluated on a benchmark dataset prove that Bayesian Optimization is an efficient tool for improving the predictions of a Recommender System, but a clear choice between the two acquisition functions is not evident.

Keywords: Recommender System · Bayesian Optimization · Hyper-parameter optimization

1 Introduction

A Recommender System (RS) is a critical component of B2C online services, aimed at recommending items (e.g., movies, songs, books, etc.) which match the user's preferences. RSs are widely used for recommending movies [15], music [16], books [7], and e-commerce products [17]. RSs can basically be divided into two classes [1]: content-based and collaborative filtering (CF). The first class aims at profiling both users and items depending on some characteristic features, such as demographic data for users and information/descriptions for items. The second class requires less data, basically a list of tuples containing the user ID, the item ID, and the rating given by the user to that item. In this paper we address CF
only. The basic data structure is named the rating matrix, where each entry is the rating of a user on an item. This matrix usually has a small number of known entries: the aim is to predict the remaining values using Machine Learning. A specific CF method is known as model-based [4]: it predicts the unknown ratings by assuming that they can be inferred from a small number of latent factors. The most successful latent factor models are based on matrix factorization (MF) [15,24], where the latent factors are learned by identifying a low-rank approximation of the rating matrix, with the assumption of correlations between rows (or columns) to guarantee the dimensionality reduction of the matrix itself. The identification of this approximation is a minimization problem, where the goal is to determine two low-rank matrices whose product is as close as possible to the rating matrix. The approximation error depends on the number of latent factors (i.e. the rank), which is a hyper-parameter of the algorithm. Two other hyper-parameters are a regularization term, added to reduce the non-linearity effect and over-fitting, and the learning rate of the stochastic gradient descent [3] procedure used to generate the low-rank approximation. The best values of these hyper-parameters are unknown a priori. Recently, Bayesian Optimization (BO) [9,25] has become the most widely adopted strategy for the global optimization of multi-extremal, expensive-to-evaluate and black-box objective functions in robotics [20], sensor networks [10], drug design [18], simulation-optimization problems [5], inversion problems [22] and automated Machine Learning [6]. Examples of BO applied to RSs can be found in [8,27].
In this paper, we propose BO for the optimization of the hyper-parameters of a CF-based RS. The rest of the paper is organized as follows. Section 2 introduces CF, Sect. 3 describes how BO is used to optimize the learning process underlying an RS, and in Sect. 4 BO is initially investigated on a benchmark test function generated through the GKLS software [11], a generator of test functions with known local and global minima for multi-extremal multidimensional box-constrained global optimization. Finally, we report the results of BO for an RS based on the benchmark dataset MovieLens-100k [13].

2 The Problem Definition

CF is based on two sets [26]: the set of users U = {u_1, u_2, ..., u_M} and the set of items I = {i_1, i_2, ..., i_N}. A rating r_ui ∈ X represents the preference of the user u for the item i: it can be a Boolean or an integer value. The ratings given by the users on the items are organized in a matrix R ∈ R^{M×N}, namely the rating matrix. Usually, each user rates only a small number of items, thus the matrix entries are known only for a small number of positions (u, i) ∈ S, with |S| << M × N. The set S is divided into a training set S_Train and a test set S_Test, with S_Train ∩ S_Test = ∅ and S_Train ∪ S_Test = S. The aim of CF is to make predictions for S_Test using only the knowledge on S_Train, where the quality of
predictions is measured, for instance, through root mean square error (RMSE):

RMSE = \sqrt{ \frac{1}{|S_{Test}|} \sum_{(u,i) \in S_{Test}} \left( r_{ui} - \hat{r}_{ui} \right)^{2} }    (1)

where \hat{r}_{ui} denotes the prediction of the actual rating r_{ui}. The idea behind MF techniques is to approximate the matrix R as the product of two matrices, R ≈ P · Q, where P is an M × K matrix and Q is a K × N matrix. The matrix P is called the user-feature matrix, Q is called the item-feature matrix, and K is the number of latent factors (features) in the given factorization. Typically, K << M, N, and both P and Q contain real numbers, even when R contains only integers. The matrix factorization is obtained by minimizing an error function (e.g., the RMSE) on the training set S_Train as a function of the matrices (P, Q). Therefore, the optimization problem becomes

\operatorname{argmin}_{(P,Q)} \; \frac{1}{2} \sum_{(u,i) \in S_{Train}} \left[ \left( r_{ui} - \sum_{k=1}^{K} p_{uk} \, q_{ki} \right)^{2} + \lambda \left( \sum_{k=1}^{K} p_{uk}^{2} + \sum_{k=1}^{K} q_{ki}^{2} \right) \right]    (2)

where p_uk and q_ki denote the elements of P and Q, respectively, and λ ≥ 0 is the regularization factor.
While the minimization of Eq. (2) is performed to learn the matrix factorization, the computation of the RMSE on the test set S_Test allows estimating how good that approximation could be at predicting new ratings. Usually, to obtain a robust estimate, the RMSE is computed with k-fold cross validation.
During the learning of the matrix factorization, both p_uk and q_ki are identified through stochastic gradient descent, where the update is stochastically approximated in terms of the error on an observed entry (u, i), chosen at random from a uniform distribution, as follows:

p_{uk} \leftarrow p_{uk} + \eta \, (e_{ui} \, q_{ki} - \lambda \, p_{uk}), \qquad q_{ki} \leftarrow q_{ki} + \eta \, (e_{ui} \, p_{uk} - \lambda \, q_{ki})    (3)

where e_{ui} = r_{ui} - \sum_{k=1}^{K} p_{uk} \, q_{ki} and η is the learning rate.
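A minimal sketch of one such SGD pass, written with NumPy purely for illustration (array shapes and function names are our assumptions, not the authors' implementation), could look as follows:

```python
import numpy as np

def sgd_mf_epoch(ratings, P, Q, eta, lam):
    """One stochastic gradient descent pass over the observed ratings (Eq. 3).

    ratings: list of (u, i, r_ui) triples from S_Train; P: M x K; Q: K x N.
    """
    for idx in np.random.permutation(len(ratings)):
        u, i, r = ratings[idx]
        e_ui = r - P[u, :] @ Q[:, i]          # prediction error e_ui
        p_old = P[u, :].copy()                # keep old user factors for the item update
        P[u, :] += eta * (e_ui * Q[:, i] - lam * P[u, :])
        Q[:, i] += eta * (e_ui * p_old - lam * Q[:, i])
    return P, Q
```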

3 Bayesian Optimization for Hyper-parameter Optimization

3.1 Hyper-parameter Optimization


Let A be an algorithm with n hyper-parameters to be tuned, where each hyper-parameter θ_i can take a value in the interval [a_i, b_i]. The search space in terms of possible configurations of the hyper-parameters is therefore Θ =
[a_1, b_1] × ... × [a_n, b_n]. Given an error function H : Θ → R+ that maps each configuration θ ∈ Θ to a numeric value, the aim of hyper-parameter optimization is to find the best configuration θ* minimizing H(θ):

θ* = \operatorname{argmin}_{\theta \in \Theta} H(\theta)    (4)

In the case considered, H is the RMSE computed with k-fold cross validation, and Θ consists of the three CF hyper-parameters: the number of latent factors K, the regularization term λ and the learning rate η. The function H is usually (i) expensive to evaluate, in the sense that each evaluation takes a substantial amount of time, and (ii) black-box, as we do not know properties of its structure (e.g., convexity, derivatives, Lipschitz continuity, etc.). Thus, hyper-parameter optimization is hard, even more so when the goal is to obtain a good configuration within a few function evaluations.

3.2 Bayesian Optimization


Bayesian Optimization is a sample-efficient procedure for sequential optimization based on two components: a probabilistic surrogate model approximating the unknown objective function and an acquisition function driving the choice of the next point - that is, the next algorithm configuration in the case of hyper-parameter optimization - where the objective function is evaluated. The most used surrogate model is the Gaussian Process (GP), which is a collection of random variables, any finite number of which have a joint Gaussian distribution. A GP is completely specified by a mean function μ(θ) : Θ → R and a positive definite covariance function, also called the kernel, k(θ, θ′) : Θ² → R,

H(θ) ∼ N(μ(θ), k(θ, θ′))    (5)

The most common choice for the GP kernel is the Squared Exponential kernel.
The BO algorithm starts with an initial set of n configurations θ_{i=1:n} and their associated function values y_{i=1:n}, with y_i = H(θ_i).
At each iteration t = n + 1, ..., N the GP is fitted by conditioning its mean and variance on the set of function evaluations performed so far, D_t = {(θ_i, y_i)}_{i=1:n}. For any configuration θ ∈ Θ, the posterior mean μ_t(θ) and the posterior variance σ_t²(θ) of the GP, conditioned on D_t, are known in closed form:

\mu_t(\theta) = \mathbf{k}(\theta)^{T} \left[ K + \tau^{2} I \right]^{-1} \mathbf{y}    (6)

\sigma_t^{2}(\theta) = k(\theta, \theta) - \mathbf{k}(\theta)^{T} \left[ K + \tau^{2} I \right]^{-1} \mathbf{k}(\theta)    (7)

where K is the t × t matrix whose entries are K_{i,j} = k(θ_i, θ_j), k(θ) is the t × 1 vector of covariance terms between θ and the evaluated configurations θ_{i=1:n}, y is the t × 1 vector whose i-th entry is y_i, and τ² is the noise variance. When a new point θ_{t+1} is selected and evaluated, it provides a new observation y_{t+1} = H(θ_{t+1}), so we can add the new pair (θ_{t+1}, y_{t+1}) to the current set of function evaluations D_t, updating it for the next BO iteration: D_{t+1} = D_t ∪ {(θ_{t+1}, y_{t+1})}.
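As an illustration of Eqs. (6)–(7), a minimal NumPy sketch of the posterior computation with a Squared Exponential kernel (our own example, not the packages used later in this paper) is:

```python
import numpy as np

def se_kernel(A, B, length_scale=1.0):
    """Squared Exponential (RBF) kernel matrix between row-wise points A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(theta_obs, y_obs, theta, tau2=1e-6, kernel=se_kernel):
    """Posterior mean and variance of a zero-mean GP at one point theta (Eqs. 6-7)."""
    X = np.asarray(theta_obs, dtype=float)
    x = np.asarray(theta, dtype=float)[None, :]
    K = kernel(X, X) + tau2 * np.eye(len(X))   # K + tau^2 I
    k_vec = kernel(X, x).ravel()               # k(theta)
    alpha = np.linalg.solve(K, np.asarray(y_obs, dtype=float))
    mean = k_vec @ alpha                                               # Eq. (6)
    var = kernel(x, x)[0, 0] - k_vec @ np.linalg.solve(K, k_vec)       # Eq. (7)
    return mean, var
```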
The next candidate point to evaluate is selected by solving an auxiliary optimization problem, typically of the form:

\theta_{t+1} = \operatorname{argmax}_{\theta \in \Theta} U_t(\theta; D_t)    (8)

where U_t is the acquisition function to maximize. This auxiliary optimization problem is usually cheaper than the original one. One of the most used acquisition functions is the Expected Improvement (EI) [19]. Another acquisition function comes from Thompson Sampling (TS) [2,14], a randomized strategy for sequential decision making under uncertainty. At each step t, TS draws a sample from the GP's posterior and then selects, as the next candidate point, the minimizer of this sample. Thus, U_t in Eq. (8) is a sample from the GP. Since TS can be biased towards exploitation, it can get stuck in local minima. To avoid this undesired behaviour, we adopted the ε-greedy strategy proposed in [2] where, with probability ε, the next candidate point is selected uniformly at random, and otherwise it follows the TS procedure. TS is particularly interesting for the convergence results reported in [2], where the authors proved that, under given assumptions, ε-greedy TS has an exponential rate of convergence to the true optimum. To implement BO with EI we used the Scikit-Optimize package [21], whereas to implement BO with ε-greedy TS we used the R package DiceKriging [23].
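A minimal sketch of one ε-greedy TS iteration, written here on top of scikit-learn's GP purely for illustration (the paper itself used the DiceKriging R package; the candidate-grid approach below is our assumption), could be:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def ts_next_point(theta_obs, y_obs, bounds, eps=0.4, n_candidates=1000, rng=None):
    """One epsilon-greedy Thompson Sampling step: with probability eps pick a random
    configuration, otherwise return the minimizer of one GP posterior sample."""
    rng = np.random.default_rng() if rng is None else rng
    bounds = np.asarray(bounds, dtype=float)
    candidates = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_candidates, len(bounds)))
    if rng.random() < eps:                     # exploration step
        return candidates[0]
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    gp.fit(np.asarray(theta_obs, dtype=float), np.asarray(y_obs, dtype=float))
    sample = gp.sample_y(candidates, n_samples=1, random_state=None).ravel()
    return candidates[int(np.argmin(sample))]  # exploitation: minimizer of the sample
```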

4 BO on a Benchmark Test Function

In this section, we report the results of a preliminary comparison, performed on a 2-dimensional test function, between BO with EI, namely BO-EI, and ε-greedy TS, namely BO-TS (with ε = 0.4). The test function has been generated through the GKLS function generator [12]. Figure 1a and b show the test function as a surface in 3-dimensional space and as a 2-dimensional projection, respectively, with the global minimum at x* = [−0.12, 0.12], where f(x*) = −1.5, and 19 other local minima.

Fig. 1. A GKLS-generated test function, shown (a) as a surface in 3-dimensional space and (b) as a 2-dimensional projection

In Fig. 2a, b we report the best function value observed with respect to the number of function evaluations. More precisely, 5 initial function evaluations were performed at random, to obtain a first set of observations to train the GP; then a further 45 function evaluations were used for the optimization process. Ten different tests were performed for BO with EI and TS, separately, but the same initial designs were used for the two approaches in every test. Although in one case TS was not able to reach an optimal solution close to the global optimum, the number of tests converging to the optimal solution within 30 evaluations is higher than for BO with EI.

Fig. 2. Best function value observed with respect to the number of function evaluations

5 Application

The MovieLens dataset was collected by the GroupLens Research Project at the University of Minnesota. This dataset consists of 100,000 ratings (1–5) from 943 users on 1682 movies; thus, the rating matrix consists of 943 rows and 1682 columns, with around 100,000 known entries. For matrix factorization we used the Python scikit library Surprise (http://surpriselib.com/). The default setting of the hyper-parameters for the matrix factorization is: 0.02 for λ, 0.005 for η, and 100 for K. As performance measure H we used the RMSE computed with 10-fold cross validation, whose value, for the default setting of the hyper-parameters, is 0.9296. We defined the search space of possible configurations Θ by setting the following ranges: the number of latent factors K in [10, 50], and both the learning rate η and the regularization term λ in [0.001, 0.1].
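As an illustrative sketch of how such a BO-EI setup could be wired (the exact configuration used in the experiments is not listed in the paper, and the coupling of Surprise's SVD with Scikit-Optimize below is our assumption):

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer, Real
from surprise import SVD, Dataset
from surprise.model_selection import cross_validate

data = Dataset.load_builtin("ml-100k")  # MovieLens-100k

def objective(params):
    """RMSE on 10-fold cross validation for one hyper-parameter configuration."""
    n_factors, lr, reg = params
    algo = SVD(n_factors=int(n_factors), lr_all=lr, reg_all=reg)
    cv = cross_validate(algo, data, measures=["RMSE"], cv=10, verbose=False)
    return float(np.mean(cv["test_rmse"]))

space = [Integer(10, 50, name="K"), Real(1e-3, 0.1, name="eta"), Real(1e-3, 0.1, name="lambda")]
result = gp_minimize(objective, space, n_calls=30, n_initial_points=5, acq_func="EI")
# result.x -> best (K, eta, lambda); result.fun -> best cross-validated RMSE
```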
In Fig. 3a, b we report the best function value observed with respect to the number of function evaluations. More precisely, 5 initial function evaluations were performed at random, to obtain a first set of observations to train the GP; then a further 25 function evaluations were used for the optimization process. Five different tests were performed for BO with EI and TS, separately, but the same initial designs were used for the two approaches in every test. At the end
of the 30 function evaluations, the best value identified by BO-EI (Fig. 3a) is, on average, 0.9069 (standard deviation = 0.0003), while for BO-TS (Fig. 3b) it is 0.9082 (standard deviation = 0.0011).

Fig. 3. Best RMSE on 10-fold cross validation with respect to the number of function evaluations

In Tables 1 and 2 we report the best configurations identified in each test and the corresponding function value (i.e., the RMSE on 10-fold cross validation).

Table 1. Best function value observed, λ, η, and K for the five tests with BO-EI

Test Best RMSE λ η K


1 0.9060 0.1000 0.0208 50
2 0.9069 0.0852 0.0208 50
3 0.9069 0.0852 0.0208 50
4 0.9065 0.0901 0.0208 50
5 0.9069 0.1000 0.0208 50

To better understand the complexity of the objective function, in Fig. 4 we report its representation as a function of λ and K, where the value of η has been fixed to the best value from the previous tests. The objective function varies between 0.905 and 1.1: more precisely, for λ < 0.05, it decreases with λ and increases with K. In this part of the figure, the objective function values are higher than 0.93. The worst case is when λ ≈ 0 (i.e., no regularization) and K > 20; consequently the function has values greater than 1. For λ > 0.05, the values of the function are lower.
Table 2. Best function value observed, λ, η, and K for the five tests with BO-TS

Test Best RMSE λ η K


1 0.9095 0.1000 0.0307 50
2 0.9071 0.0901 0.0208 46
3 0.9094 0.0703 0.0109 44
4 0.9082 0.1000 0.0208 50
5 0.9068 0.0950 0.0208 32

Fig. 4. RMSE on 10-fold cross validation as a function of λ and K, keeping η fixed

6 Conclusions
The aim of an RS based on CF is to predict the preferences of users on new incoming items according to past ratings, which are stored as user-item-score triples. The core of CF is, usually, a matrix factorization procedure characterized by at least three different hyper-parameters. As with most Machine Learning algorithms, the effectiveness of CF depends on a suitable tuning of its hyper-parameters, leading to the optimization of a black-box and expensive-to-evaluate loss function. In this paper, we showed how hyper-parameter optimization for a CF-based RS can be efficiently performed through BO, considering two different acquisition functions: EI and ε-greedy TS. Results on a 2-dimensional test function generated through the GKLS software proved that BO is able to get close to the global optimum in a limited number of function evaluations. Furthermore, TS proved to converge faster to the optimum than EI. These results were also confirmed when BO was used to optimize the hyper-parameters of a CF-based RS: after 10 function evaluations the best function value identified
by BO-TS was always lower than 0.95. On the other hand, the best function value obtained by BO-EI after 30 function evaluations was better than that of BO-TS in 4 out of 5 tests. Summarizing, BO proved to be a suitable tool for optimizing a CF-based RS, but it is difficult to choose between the two acquisition functions considered. Future work will consider combining ε-greedy TS and EI in order to exploit the convergence property of the former and the good empirical performance of the latter.

References
1. Aggarwal, C.C.: Recommender Systems. Springer International Publishing (2016).
https://doi.org/10.1007/978-3-319-29659-3
2. Basu, K., Ghosh, S.: Analysis of Thompson Sampling for Gaussian Process Opti-
mization in the Bandit Setting (2017). arXiv preprint arXiv:1705.06808
3. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Pro-
ceedings of COMPSTAT 2010-19th International Conference on Computational
Statistics, Keynote, Invited and Contributed Papers, pp. 177–186 (2010). https://
doi.org/10.1007/2F978-3-7908-2604-3 16
4. Cacheda, F., Carneiro, V., Fernández, D., Formoso, V.: Comparison of collabora-
tive filtering algorithms. ACM Trans. Web 5(1), 1–33 (2011). https://doi.org/10.
1145/1921591.1921593
5. Candelieri, A., Perego, R., Archetti, F.: Bayesian optimization of pump operations
in water distribution systems. J. Glob. Optim. 71(1), 213–235 (2018). https://doi.
org/10.1007/s10898-018-0641-2
6. Candelieri, A., Giordani, I., Archetti, F., Barkalov, K., Meyerov, I., Polovinkin,
A., Sysoyev, A., Zolotykh, N.: Tuning hyperparameters of a SVM-based water
demand forecasting system through parallel global optimization. Comput. Oper.
Res. (2018). https://doi.org/10.1016/j.cor.2018.01.013
7. Crespo, R.G., Martı́nez, O.S., Lovelle, J.M.C., Garcı́a-Bustelo, B.C.P., Gayo,
J.E.L., Pablos, P.O.D.: Recommendation system based on user interaction data
applied to intelligent electronic books. Comput. Hum. Behav. 27(4), 1445–1449
(2011). https://doi.org/10.1016/j.chb.2010.09.012
8. Dewancker, I., McCourt, M., Clark, S.: Bayesian Optimization for Machine Learn-
ing : A Practical Guidebook (2016). arXiv preprint arXiv:1612.04858
9. Frazier, P.I.: A Tutorial on Bayesian Optimization (2018). arXiv preprint
arXiv:1807.02811
10. Garnett, R., Osborne, M.A., Roberts, S.J.: Bayesian optimization for sensor set
selection. In: Proceedings of the 9th ACM/IEEE International Conference on Infor-
mation Processing in Sensor Networks-IPSN 2010, Stockholm, pp. 209–219 (2010).
https://doi.org/10.1145/1791212.1791238
11. Gaviano, M., Kvasov, D.E., Lera, D., Sergeyev, Y.D.: Algorithm 829: software for
generation of classes of test functions with known local and global minima for
global optimization. ACM Trans. Math. Softw. 29(4), 469–480 (2003). https://
doi.org/10.1145/962437.962444
12. Gaviano, M., Kvasov, D., Lera, D., Sergeyev, Y.D.: Software for generation of
classes of test functions with known local and global minima for global optimiza-
tion. ACM Trans. Math. Softw. 29(4), 469–480 (2003)
13. Harper, F.M., Konstan, J.A.: The movielens datasets. ACM Trans. Interact. Intell.
Syst. 5(4), 1–19 (2015). https://doi.org/10.1145/2827872
14. Kandasamy, K., Krishnamurthy, A., Schneider, J., Póczos, B.: Parallelised Bayesian
optimisation via Thompson sampling. In: International Conference on Artificial
Intelligence and Statistics, pp. 133–142 (2018)
15. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender
systems. Computer 42(8), 30–37 (2009). https://doi.org/10.1109/MC.2009.263
16. Lee, S.K., Cho, Y.H., Kim, S.H.: Collaborative filtering with ordinal scale-based
implicit ratings for mobile music recommendations. Inf. Sci. 180(11), 2142–2155
(2010). https://doi.org/10.1016/j.ins.2010.02.004
17. McNally, K., O’Mahony, M.P., Coyle, M., Briggs, P., Smyth, B.: A case study
of collaboration and reputation in social web search. ACM Trans. Intell. Syst.
Technol. 3(1), 1–29 (2011). https://doi.org/10.1145/2036264.2036268
18. Meldgaard, S.A., Kolsbjerg, E.L., Hammer, B.: Machine learning enhanced global
optimization by clustering local environments to enable bundled atomic energies.
J. Chem. Phys. 149(13) (2018). https://doi.org/10.1063/1.5048290
19. Mockus, J.: Bayesian Approach to Global Optimization, vol. 37. Springer Nether-
lands (1989). https://doi.org/10.1007/978-94-009-0909-0
20. Olofsson, S., Mehrian, M., Calandra, R., Geris, L., Deisenroth, M., Misener, R.:
Bayesian multi-objective optimisation with mixed analytical and black-box func-
tions: application to tissue engineering (2018). https://doi.org/10.1109/TBME.
2018.2855404
21. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O.,
Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al.: Scikit-learn: machine
learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
22. Perdikaris, P., Karniadakis, G.E.: Model inversion via multi-fidelity Bayesian
optimization: a new paradigm for parameter estimation in haemodynamics, and
beyond. J. R. Soc. Interface 13(118) (2016). https://doi.org/10.1098/rsif.2015.1107
23. Roustant, O., Ginsbourger, D., Deville, Y.: DiceKriging, DiceOptim: two R pack-
ages for the analysis of computer experiments by kriging-based metamodeling
and optimization. J. Stat. Softw. 51(1), 1–55. http://www.jstatsoft.org/v51/i01/
(2012)
24. Salakhutdinov, R., Mnih, A.: Probabilistic matrix factorization. In: Advances in
Neural Information Processing Systems (NIPS), pp. 1257–1264 (2008)
25. Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., De Freitas, N.: Taking the
human out of the loop: a review of Bayesian optimization. Proc. IEEE 104, 148–
175 (2016). https://doi.org/10.1109/JPROC.2015.2494218
26. Takács, G., Pilászy, I., Németh, B., Tikk, D.: Scalable collaborative filtering
approaches for large recommender systems. J. Mach. Learn. Res. 10, 623–656
(2009). https://doi.org/10.1145/1577069.1577091
27. Vanchinathan, H.P., Nikolic, I., De Bona, F., Krause, A.: Explore-exploit in top-N
recommender systems via Gaussian processes. In: Proceedings of the 8th ACM
Conference on Recommender systems-RecSys 2014, No. June 2015, pp. 225–232.
(2014). https://doi.org/10.1145/2645710.2645733
Creation of Data Classification System for Local Administration

Raissa Uskenbayeva, Aiman Moldagulova(B), and Nurzhan K. Mukazhanov

International Information Technology University, Almaty, Kazakhstan
uskenbaevar@gmail.com, mukazhanovn@gmail.com, a.moldagulova@iitu.kz

Abstract. This paper deals with the classification of the flow of messages coming to the local government from various data sources, such as social networks, the government website, emails, etc. The primary data, extracted from these various sources, is stored in a NoSQL database. Further, using special methods and developed applications, the data is classified and sent to the relevant departments. The article focuses on a review of methods and on the construction of the architecture of the data classification system for data retrieved from social networks, the local government website, emails, etc.

Keywords: Data · Database · Structured data · Unstructured data · Classification

1 Introduction

The sharp growth in the quantity of unstructured data is driven by the fast development of the Web. According to the reports of many consulting companies, about 70% of the digital data which is gathered, stored and utilized by society is now in unstructured (text) and semistructured form, and only 30% consists of other types of data. Therefore, the problem of developing models and methods and of constructing systems that allow efficient processing of large data streams is essential. Textual data today have an expanding variety of forms and, because of the involvement of computers in both analysis and production, may be encountered in many formats.
The purpose of this article is to compare modern methods for solving the task of classifying texts, to detect trends in the development of this direction, and to consider the development of the architecture of a system that helps classify data flows efficiently.
Nowadays a large number of methods, and many variations of them, have been developed for the classification of texts. Each group of methods has its advantages and disadvantages, areas of application, features and limitations.
Recently, interest in document classification and text mining has been renewed and intensified by the accelerated increase in unstructured and semistructured data due to the spread of the Internet. Many domains of human activity are related to text classification research. For example, the process of classifying scientific and popular articles from on-line journals using a constraint satisfaction method was described by
Tran et al. [1]; various approaches, in particular explicit rules, machine learning, and linear discriminant analysis based methods, for the classification of a real-time data set of online employment offers gathered from heterogeneous sources against a standard job classification system were applied and compared by Amato et al. [2].
Text classification is used in many spheres of human activity, from automatic indexing of documents to document filtering, automatic metadata generation, expanding the hierarchical directory of web resources, and structuring documents. Office employees spend their working hours on routine, non-optimized work related to arranging electronic mails, messages, newsletters, chats, releases, marketing statements, presentations, reviews and other documents that do not fit properly in relational databases; they can be stored as text files in various formats, and such files may have an internal structure.
Moreover, text classification methods can be applied to the analysis of text documents against certain criteria, with the further goal of detecting identical documents, documents delaying the implementation of governmental assignments, and documents containing requests for background information and reports that require a large amount of working hours, thereby distracting employees from essential work with the need to respond to letters with unimportant content.

2 Text Classification Methods

Classification is a standard task in the field of Data Mining. The purpose of data classification is to assign each object to one or more predefined categories to which this object belongs. A feature of the classification problem is the assumption that the set of classified data does not contain "garbage", that is, each object corresponds to some given category [17].
A particular case of the classification problem is the separation of a set of messages into categories. Classification of unstructured text data, as in the case of classifying objects by a certain property, consists in assigning letters to one of the previously known classes. Classification applied to text data is often called text categorization or rubrication. This name obviously comes from the task of systematizing text data into catalogs, categories and rubrics [17].
Let M be the set of possible messages:

M = {m_1, ..., m_i, ..., m_n}.    (1)

Let C be the set of possible categories (classes):

C = {c_r},    (2)

where r = 1, ..., m.
The hierarchy of categories can be represented in the form of a set of pairs reflecting the nesting relation between rubrics:

H ¼ \cj ; cp [ ; cj ; cp  C : ð3Þ

In the classification problem, it is required to construct a procedure that finds the most probable category from the set C for the investigated message m_i.
Most of the methods for classifying texts are based on the assumption that documents belonging to the same category contain the same characteristics (words or phrases), and the presence or absence of such features in a document indicates whether or not it belongs to a particular topic.
Thus, for each category there should be a set of characteristics:

F(C) = ∪ F(c_r),    (4)

where F(c_r) = {f_1, ..., f_k, ..., f_z}.


This set of attributes is often called a dictionary, because it consists of tokens that
include words and/or phrases that characterize the category. And also each message can
have signs, by which it can be attributed with some degree of probability to one or
several categories:

F ðmi Þ f1; i ; . . .f1; i ; . . .fy; i : ð5Þ

The set of attributes of all letters must coincide with the set of attributes of
categories:

FðCÞ ¼ FðMÞ ¼ UF ðmi Þ: ð6Þ

It should be noted that these feature sets distinguish the classification of text data
from the classification of objects in Data Mining, which is characterized by a set of
attributes. The decision to assign mi messages to the category cr is taken on the basis of
the intersection

F ðmi Þ [ F ðcr Þ: ð7Þ
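As an illustration, a minimal sketch of one simple decision rule consistent with Eq. (7), assigning a message to the category with the largest feature overlap (the actual rule depends on the classifier chosen; category dictionaries and the message below are invented):

```python
def classify_by_features(message_features, category_dictionaries):
    """Assign a message to the category whose dictionary shares the most features with it."""
    overlap = {c: len(message_features & feats) for c, feats in category_dictionaries.items()}
    return max(overlap, key=overlap.get)

# Illustrative dictionaries F(c_r) and one message feature set F(m_i):
dictionaries = {"education": {"school", "teacher", "textbook"},
                "medicine": {"clinic", "vaccine", "doctor"}}
print(classify_by_features({"school", "textbook", "repair"}, dictionaries))  # -> education
```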

K-nearest neighbors (KNN), Naïve Bayes and the Term Graph Model are text classification methods studied in the works of Bijalwan et al. [3]. In a comparative study of the above-mentioned text classification methods, KNN showed the highest accuracy [1]. Despite the fact that the performance of the KNN algorithm is low, it is broadly used in text classification, because KNN depends entirely on every sample of the training set [5].
The KNN algorithm uses the Euclidean distance between the document vector and the query vector to measure a document's relevancy to a given query, which is quite accurate. Elden [4] described a way to improve the performance of KNN by replacing the term-document matrix with a low-rank approximation in order to capture the relevant information and remove the unimportant details in the documents. As mentioned before, KNN shows the best accuracy results among text classification methods. However, when comparing the Naive Bayes, K-Nearest Neighbors and Support
Vector Machine classification methods to predict a user's personality, Naïve Bayes shows better results on data taken from Twitter [8]. This is because Naive Bayes is a probabilistic classifier that uses pure probability calculations on the existing features. In specific cases, due to the high dimensionality of text vectors, the KNN algorithm shows lower speed and applicability to text categorization [9]. This indicates that, with a large amount of training data, KNN requires more time to classify the documents.
Naïve Bayes. The Naïve Bayes Classifier (NBC) uses methods of vector analysis. It follows the concept of the conditional probability of relevance of a document d to a class c. Implementation and testing of the NBC classifier are relatively simple; for this reason it is used very frequently in automated text classification systems. At the same time, NBC has shown rather satisfactory results in comparison with other, more sophisticated classifiers [10].

P(A|B) = P(B|A)P(A)/P(B).    (8)

In the given model a document is a vector d = {w_1, w_2, ..., w_n}, where w_i is the weight of the i-th term and n is the size of the dictionary of the sample set. Thus, according to the Bayes theorem, the probability of a class c for a document d is:

P(c|d) = P(c)P(d|c)/P(d).    (9)

In this way the conditional probability is computed for all classes.

Term Graph. The Term Graph model uses the well-known vector space model to represent a text document as a relational tuple.
The basic steps are as follows [11]:
1. Preprocessing
2. Graph Building
Preliminary preprocessing, involving parsing, cleaning and stemming of the text document, precedes the text classification process. Subsequently, a vector of terms with corresponding frequencies is produced, and a corresponding vector representing the document can be constructed. As a result, a collection of documents can be represented by a term-document matrix, which can subsequently be interpreted as a relational table. However, the vector space model keeps only the key features of the document and does not take into consideration the relationships among terms [11].
KNN algorithm. K-nearest neighbors (KNN) is a broadly used classification method. It is easy to interpret and its calculation time is very low. However, the choice of the value of the parameter k is very critical in this technique. The KNN algorithm is applied in many domains of text classification. For instance, in order to achieve good governance and democracy, a system was proposed that addresses the classification of complaints of citizens involved in the process of city development [12]. The KNN algorithm is also found as the theoretical background of applications used to predict economic events such as the stock market, currency exchange rates, bank bankruptcies, financial risk, trading futures, credit rating, and loan management, as well as bank customer profiling [9].
The KNN algorithm can be applied to email classification [13]. An approach for building a machine learning system in R that uses the K-Nearest Neighbors (KNN) method for the classification of textual documents from two sources, http://egov.kz and http://www.government.kz, was presented in [19].
There are numerous modifications of the KNN algorithm. One of the issues of the KNN algorithm is reducing the sparsity of the term-document matrix [18]. A flexible KNN algorithm combining a weighting algorithm and a K-variable algorithm enhances the efficiency of text classification [14]. A combination of eager learning with KNN classification [6] increased the accuracy and the efficiency of classification. A novel KNN classification algorithm combining evidence theory and a model helps deal with the problem of time consumption [7].
In order to increase the classification accuracy and reduce time consumption, many researchers have attempted to combine the KNN algorithm with various classification techniques. In spite of the ease of using KNN and its efficiency in general, the performance of the KNN algorithm depends mostly on the distribution of the training set. A modified KNN algorithm was proposed based on integrating the density of the test sample and the density of its nearest neighbors, taking into account the unevenness of the textual data distribution [10, 15]. In order to reduce the effect of the uneven data distribution on the classification, the distance between the test sample and samples in the sparse area was increased and the distance between the test sample and samples in the dense area was reduced. An algorithm based on clustering the training samples, producing a relatively uniform distribution of training samples, which helps solve the problem of the uneven distribution of training samples, was presented in [15].
In order to improve the performance of the KNN classifier, the TFKNN (Tree-Fast-K-Nearest-Neighbor) method based on a similarity search tree was proposed [16]. This approach shows how to search for the exact k nearest neighbors and mitigate the time-consumption drawback.

3 The Architecture of the Data Classification System for the Local Administration

As mentioned above, this paper considers the system for the classification of letters (messages)
and questions coming to the local government from various data sources.
The system consists of several structural parts, such as primary data, obtaining data
from primary data sources, processing unstructured data (classification of data), storing
classified data, and visualization of the data in various forms. The scheme of the system
operation is shown in Fig. 1.
Primary sources of data contain questions from the section “Questions to the mayor
of the city” on the website of the local government of Almaty, letters received by e-mail,
posts in social networks, etc. Data is downloaded from the primary sources to a
NoSQL database through Java applications. After the upload of data to the
data warehouse (DW) is completed, the data is classified according to different topics (relevant areas)
and the classified data are stored in a structured form.

Fig. 1. System operation scheme

At the next stage, the data is presented in various forms. Moreover, we can build a variety of
analytical reports according to user needs; for example, business analytics tools
can be used. The architecture of the system is presented in Fig. 2.

Fig. 2. Architecture of the data classification system

Let us discuss the components of the data classification system in detail.


Search for data. In the first step, it is necessary to identify which data should be
classified and ensure their availability. Typically, users can determine the set of

analyzed sources independently, i.e. manually, but with a large amount of data it is
necessary to use automated selection options according to specified criteria.
Extracting and loading data. Extraction of information from the selected sources involves
selecting the necessary data, over which the classification will subsequently be carried out.
After this step, the data loading is performed.
Classification of the data coming into local government. Next, we will consider the
classification of data by category. The main task of classification is the grouping of text
data by subject (education, medicine, etc.). Methods for classifying unstructured text
data lie at the junction of several areas: retrieval of data from various sources,
extraction and loading of data, and Data Mining. The classification task was discussed earlier in
Sect. 2 of this paper.
The task of classification methods is to select the most suitable characteristics and to
formulate rules on the basis of which a decision will be made to assign a message to a given
rubric [17].
After completing the classification of text messages, data visualization is performed.
In our case, the data is presented in text form. If necessary, the data can be represented in
various forms, such as tabular, graphical, etc. Ready-made business intelligence tools
can also be used for various visualizations of the classified data.

4 Conclusion

In this paper, we considered the management of data flows coming to local government
from various data sources, in the form of messages (letters), questions, etc. The main task
was the classification of messages, questions, etc. by category. The scheme of the system
operation for local government administrations and its architecture were presented.
The system can be used to provide communication of citizens with government
services and institutions. The system will carry out professional processing of incoming
calls to administrative institutions and include functions to ensure the reception of calls
through all types of communication channels (call centers, email, a web portal, social
networks), high-quality and fast processing of all calls, taking into account the priority
of requests and established procedural deadlines.
The article provides an overview of data classification methods, a scheme and
description of the system architecture for classifying data flows, as well as a formal
description of the classification of text messages.

Acknowledgment. This work has been done in the framework of the grant given by Ministry of
Education and Science of the Republic of Kazakhstan (Grant No. 0218PК01178).

References
1. Tran, L.Q., Moon, C.W., Le, D.X., Thoma, G.R.: Web page downloading and classification.
In: Proceedings 14th IEEE Symposium on Computer-Based Medical Systems. CBMS 2001,
pp. 321–326 (2001). https://doi.org/10.1109/cbms.2001.941739
2. Amato, F., Boselli, R., Cesarini, M., Mercorio, F., Mezzanzanica, M., Moscato, V.,
Picariello, A.: Challenge: processing web texts for classifying job offers. In: Proceedings of
the 2015 IEEE 9th International Conference on Semantic Computing, IEEE ICSC 2015,
pp. 460–463 (2015). https://doi.org/10.1109/icosc.2015.7050852
3. Bijalwan, V., Kumar, V., Kumari, P., Pascual, J.: KNN based machine learning approach for
text and document mining 7(1), 61–70 (2014)
4. Elden, L.: Matrix Methods in Data Mining and Pattern Recognition. SIAM, Philadelphia,
PA, 224 pp. (2007). ISBN 978-0-898716-26-9
5. Hassanat, A.B., Abbadi, M.A., Alhasanat, A.A.: Solving the Problem of the K Parameter in
the KNN Classifier Using an Ensemble Learning Approach. Int. J. Comput. Sci. Inf. Secur.
(IJCSIS) 12(8), 33–39 (2014). https://doi.org/10.1007/s00500-005-0503-y
6. Dong, T., Cheng, W.: The research of kNN text categorization algorithm based on eager
learning, (d), pp. 1120–1123 (2012). https://doi.org/10.1109/icicee.2012.297
7. Guo, G., Ping, X., Chen, G.: A fast document classification algorithm based on improved
KNN, pp. 3–6 (2006)
8. Pratama, B.Y., Sarno, R.: Personality classification based on Twitter text using Naive Bayes,
KNN and SVM. In: 2015 International Conference on Data and Software Engineering
(ICoDSE), pp. 170–174 (2015). https://doi.org/10.1109/icodse.2015.7436992
9. Yan, Z.: Combining KNN algorithm and other classifiers, (1), 1–6 (2010)
10. Shimodaira, H.: Text classification using Naive Bayes, (4) (2015)
11. Wang, L., Zhao, X.: Improved KNN classification algorithms research in text categorization,
i, pp. 1848–1852 (2012)
12. Tjandra, S., Alexandra, A., Warsito, P.: Determining citizen complaints to the appropriate
government departments using KNN algorithm, pp. 2–5 (2015)
13. Nikhath, A.K., Subrahmanyam, K., Vasavi, R.: Building a K-nearest neighbor classifier for
text categorization 7(1), 254–256 (2016)
14. Yunliang, Z., Lijun, Z., Xiaodong, Q., Quan, Z.: Flexible KNN algorithm for text
categorization by authorship based on features of lingual conceptual expression, pp. 601–
605 (2009). https://doi.org/10.1109/csie.2009.363
15. Zhou, L., Wang, L.: A Clustering-based KNN improved algorithm CLKNN for text
classification, pp. 4–7 (2010)
16. Wang, Y.U., Wang, Z.: A fast KNN algorithm for text categorization, 19–22 Aug 2007
17. Barsegiyan, A.A.: Tekhnologii analiza dannykh: Data Mining, Visual Mining, Text Mining,
OLAP / A.A. Barsegyan, M.S. Kupriyanov, V.V. Stepanenko, I.I. Kholod – 2-ye izd.,
pererab. i dop. – SPb.: BKHV-Peterburg, 384 p. (2007)
18. Moldagulova, A.N., Sulaiman, R.B.: Document classification based on KNN algorithm by
term vector space reduction. In: Proceedings of 18th International Conference on Control,
Automation and Systems (ICCAS) (2018)
19. Moldagulova, A.N., Sulaiman, R.B.: Using KNN algorithm for classification of textual
documents. In: Proceedings of 8th International Conference on Information Technology
(ICIT) (2017)
Face Recognition Using Gabor Wavelet
in MapReduce and Spark

Anh-Cang Phan1(B) , Hung-Phi Cao1 , Ho-Dat Tran1 , and Thuong-Cang Phan2


1 Vinh Long University of Technology Education, Vinhlong, Vietnam
{cangpa,caohungphi,datth}@vlute.edu.vn
2 Can Tho University, Can Tho, Vietnam
ptcang@cit.ctu.edu.vn

Abstract. Face recognition has become one of the important research


areas and is used in a wide range of applications. In addition to accuracy,
traditional face recognition methods face challenges in identification time and
in their application to distributed systems in a large-data environment.
To solve these problems, we proposed a facial recognition method using
the Gabor wavelet technique and the MapReduce parallel processing
model. We performed parallel processing at the extraction and recogni-
tion stage with the MapReduce model in the Spark environment. Experi-
mental results show that the proposed method significantly improves the
computing time and the accuracy of face recognition.

Keywords: Face recognition · Gabor wavelet · MapReduce · Spark

1 Introduction
Nowadays, with the development of society, we see that the field of information
technology is present in all aspects of life such as economics, politics, culture,
entertainment, etc. The issue of user privacy is an indispensable requirement and
this field currently receives a lot of attention. Data security and privacy are always a top concern
for computer users and information authentication systems. Without appropriate
protection solutions, the risk of losing control over the data while interacting with
a global network becomes higher. Therefore, information security plays a very important
role in authentication systems. In order to authenticate someone, we usually use
magnetic cards, passwords, passports, etc. However, these methods are at risk of
information theft. Currently, identification systems are being studied, and their increasing
reliability contributes to solving information security issues. The introduction of facial
recognition systems has brought many benefits. Face recognition is considered one of the
most common and important methods of biometric identification. This method is capable
of identifying someone through their facial features. Automatic face recognition systems


have been extensively studied in recent years because of their role in access control
systems or real-time monitoring systems [3]. There have been many important
accomplishments in research. Turk and Pentland [4] presented a near real-time
facial recognition system by introducing the eigenface technique for the extraction of
facial features. Wang et al. [5] proposed an effective facial recognition technique
using the Principal Component Analysis (PCA) method and the Support Vector
Machine (SVM) machine learning algorithm. In general, many methods have been
proposed to solve facial recognition problems.
Typically, there are two approaches to face image recognition: holistic (whole-face)
methods and methods based on the geometrical characteristics of facial details.
In the first approach, identification is based on the face as a whole; algorithms such
as PCA, LDA, ICA, wavelet transforms and so on are used in order to extract
the principal features of the face. In the second approach, identification is made through
the geometrical characteristics of facial details, such as the location, size and shape of
the eyes, nose and mouth, and the relationships between these details, such as the
distance between the two eyes or between the eyebrows. In this paper, we propose a
facial recognition method using the Gabor wavelet transform and the MapReduce
parallel processing model at the training and identification stages to improve the
response time of the system.

2 Related Works

Bellakhdhar et al. [1] have proposed a methodological improvement to raise the face
recognition rate by fusing the phase and magnitude of Gabor representations
of the face into a new representation. The performance of the proposed algorithm
is tested on the public and widely used FRGCv2 and ORL face databases.
Experimental results on these databases show that combining the magnitude
with the phase of Gabor features can achieve promising results.
Gervei et al. [2] have presented an approach for 3D face recognition based
on extracting principal components of range images by utilizing modified PCA
methods, namely 2DPCA and bidirectional 2DPCA. The 2DPCA method is quite
simple: supposing A = [A1, A2, ..., An], 2DPCA finds the eigenvalues and
eigenvectors of the corresponding matrix, and the eigenvectors corresponding to
the largest eigenvalues form the basis of the new space. Maillo et al. [6] have proposed a
MapReduce-based approach for k-Nearest neighbor classification. The proposed
approach allows us to simultaneously classify large amounts of unseen cases. To
do so, the map phase will determine the k-nearest neighbors in different splits of
the data. Afterwards, the reduce stage will compute the definitive neighbors from
the list obtained in the map phase. Bagwe et al. [11] have proposed a method
for face detection using the Hadoop MapReduce framework. In order to apply the face
detection algorithm to each image, the map function has to get the whole image
contents as a single input record.

3 Background
3.1 Feature Extraction

1. Principal Component Analysis


Principal Component Analysis (PCA) is an algorithm used to create a new
image from the original image. PCA is an image recognition algorithm based
on the overall features of the face; we apply this algorithm for two purposes:
the first is to find a face similar to a given face, and the second is to locate
human faces in a photo.
2. Representing Images Using Gabor Wavelets
Let I(z) be the grayscale value of the pixel z. The convolution of I(z) with a Gabor
filter Ψμν(z) is defined by formula (1).

Oμν(z) = I(z) ∗ Ψμν(z)   (1)

The value Oμν(z) is the Gabor feature at position z with orientation μ and scale ν.
A Gabor feature vector representing the entire image is obtained by concatenating
all the values Oμν(z), ∀ν ∈ {0, 1, 2, 3, 4}, μ ∈ {0, ..., 7}. We denote by G(I) the
Gabor feature vector of the image I, given by formula (2).

G(I) = (O00ᵀ, O01ᵀ, ..., O10ᵀ, ..., O47ᵀ)   (2)

3.2 K-Nearest Neighbors (KNN)

KNN [7] is an algorithm that classifies objects based on the closest distance between
the objects to be classified and all objects in the training set. It is one of the simplest
algorithms in machine learning. The KNN steps are described as follows (a minimal
illustration is given after the list):

– Step 1: Define the value of the parameter K (the number of nearest neighbors).
– Step 2: Compute the distance between the objects to be classified and those in
the training set (based on the Euclidean distance).
– Step 3: Arrange the distances in ascending order and identify the K nearest
neighbors of the objects to be classified.
– Step 4: Rely on the classes of the nearest neighbors to identify the class of
the objects to be classified.
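The sketch below illustrates these four steps with Euclidean distances and majority voting. It is an illustrative toy, not the system's implementation: the feature vectors and the labels "face_A"/"face_B" are invented for the example.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=1):
    # Step 2: Euclidean distances between the query and all training samples
    dists = np.linalg.norm(train_X - query, axis=1)
    # Step 3: indices of the k nearest neighbors, in ascending order of distance
    nearest = np.argsort(dists)[:k]
    # Step 4: majority vote among the neighbors' class labels
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage with 2-dimensional feature vectors and string labels
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
train_y = ["face_A", "face_A", "face_B", "face_B"]
print(knn_predict(train_X, train_y, np.array([4.8, 5.0]), k=3))  # -> "face_B"
```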

3.3 Spark and Hadoop MapReduce

Spark provides a general execution model that allows optimizing arbitrary
operator graphs, and it supports in-memory computing, which lets it query data
faster than disk-based engines like MapReduce. Spark provides a set of APIs for the
Scala, Java, and Python languages. In some experimental results, Spark can be
as much as 10 to 100 times faster than Hadoop [8–10]. Apache Spark offers several
deployment modes, such as Standalone, Hadoop YARN, Apache Mesos, etc. Spark
Core is a component of Spark: it provides the most basic functions of Spark,
such as task scheduling, memory management, fault recovery, and interaction with
storage systems. In particular, Spark Core provides an API to define RDDs (Resilient
Distributed Datasets), which are sets of items distributed among the nodes of a
cluster on which parallel processing can be performed (Fig. 1).

Fig. 1. The Spark stack. [12]
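As a minimal illustration of the Spark Core API mentioned above, the sketch below creates an RDD from a local collection and applies a map transformation followed by a reduce action. The local master setting and the toy data are assumptions used only to show generic RDD usage, not the configuration used in this work.

```python
# Minimal PySpark sketch: creating an RDD and processing it in parallel (illustrative only).
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("rdd-demo").setMaster("local[*]")
sc = SparkContext(conf=conf)

rdd = sc.parallelize(range(1, 11))           # distribute a small dataset over the workers
squares = rdd.map(lambda x: x * x)           # transformation, evaluated lazily
total = squares.reduce(lambda a, b: a + b)   # action, triggers the parallel computation
print(total)                                 # 385

sc.stop()
```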

Hadoop is a support platform enabling the distributed processing of large
data sets on clusters of computers. Hadoop offers the Hadoop Distributed File System (HDFS)
and supports the MapReduce model, enabling applications to work on multiple nodes with
petabytes of data.
The HDFS system has two types of nodes: the NameNode, or master node, and the
DataNodes, or worker nodes. The NameNode manages the file system namespace. It
maintains the file system tree and the metadata for all files [13] and folders in the
tree. The NameNode identifies the DataNodes on which all blocks of a file are
distributed. The DataNodes store and retrieve these blocks when they are requested (by
the user or by the master node). To handle large data, the data of large files is split into
small blocks distributed over the storage nodes. Compared with other file systems,
HDFS is not optimal for storing billions of small files, each just a few KB in size.
The advantage of the large-file design is to reduce the load on the file space management
system and the time spent on folder traversal or file search.

4 Our Proposed Method

4.1 An Overview of Our Approach

In this paper we develop a face recognition system using Gabor wavelet trans-
form and MapReduce parallel computing model. The proposed method consists
of two phases: training and recognition. Figure 2 describes the face recognition
system using Gabor wavelet transform and MapReduce. First, at the training
stage, we used the Gabor filter to extract the facial features. Extracted features
will be stored on the Hadoop distributed file system (HDFS). At the identification
stage, we use the KNN algorithm to predict labels and give identification results.
This process is performed under the MapReduce mechanism to improve computational
speed. The basic problem in facial recognition is feature extraction, for which Gabor
filters are used here. Instead of using a facial feature schema, high-energy points are
used to compare faces, which not only reduces the volume of calculations but also
increases the accuracy of the algorithm, because features do not need to be identified
manually.

Fig. 2. The proposed model for face recognition.

4.2 Feature Extraction Model of Facial Image

At the training stage, from the set of training photos, we perform facial feature
extraction by the Gabor wavelet transform and store the features in the database. The
feature extraction algorithm consists of three main steps: Step 1: detect and extract
the face in the image. Step 2: determine the positions of the feature points on the face
image by using the Gabor wavelet filters. Step 3: generate the feature vectors and save
them onto the HDFS.

Fig. 3. Training model using MapReduce in Spark environment.



Figure 3 describes the training module. The input data of the map process is a list of
face images stored on HDFS. From this dataset, the Tasktracker generates sets of
records consisting of a key (the label of the image) and a value (the content of the image).
With this set of records, the Tasktracker loops to retrieve each record as input for
the map function, which returns intermediate key and value results.
The output of the map function is sorted in main memory and then
written to the local disk. After the mapping process is completed, its output (a set
of intermediate key (image label) and value (feature vector) pairs) is the input
to the reduce process. In the training module, the reduce process only performs
merging of the data from the mapping process. The final data is a collection of records
(each record being a pair (image label, feature vector)) that is stored on HDFS.
From the responses of the face to the Gabor wavelet filters, we find the feature
points according to the following method.
A point (x0, y0) is a feature point if the following conditions are satisfied:

– Rj(x0, y0) = max(x,y)∈W0 Rj(x, y)
– Rj(x0, y0) > (1/(N1·N2)) Σx=1..N1 Σy=1..N2 Rj(x, y),   j = 1, ..., 40

where Rj is the response of the face image to Gabor filter j, N1 × N2 is the size of
the facial image, and W0 is a square window of W × W pixels centered at (x0, y0).
In this algorithm, the window size W is a key parameter. It must be chosen small
enough to capture the important characteristics, and large enough to avoid
redundancy. In this paper, we chose W = 9 to find the feature points of the face through
the responses to the Gabor filters.
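A possible sketch of this feature-point selection is given below, using 40 Gabor filters (5 frequencies × 8 orientations) and the two conditions above. The specific filter frequencies, the use of scikit-image's gabor filter and SciPy's maximum_filter, the function name gabor_feature_points, and the choice to keep points satisfying the conditions for at least one filter are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import maximum_filter
from skimage.filters import gabor

def gabor_feature_points(face, frequencies=(0.1, 0.2, 0.3, 0.4, 0.5),
                         n_orientations=8, W=9):
    """Return (x, y) points satisfying both conditions for at least one of the 40 filters."""
    points = set()
    for f in frequencies:                                  # 5 scales
        for k in range(n_orientations):                    # 8 orientations -> 40 filters
            theta = k * np.pi / n_orientations
            real, imag = gabor(face, frequency=f, theta=theta)
            R = np.hypot(real, imag)                       # magnitude response R_j
            local_max = R == maximum_filter(R, size=W)     # condition 1: max in W x W window
            strong = R > R.mean()                          # condition 2: above the image mean
            ys, xs = np.nonzero(local_max & strong)
            points.update(zip(xs.tolist(), ys.tolist()))
    return points

# Usage: pts = gabor_feature_points(np.asarray(gray_face_image, dtype=float))
```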

Save Feature Vectors: Feature vectors are created at the feature points as combinations
of the coefficients of the Gabor wavelet filters. The k-th feature vector of the i-th face
image is defined as follows: vi,k = {xk, yk, Ri,j(xk, yk), j = 1, ..., 40}. With 40 Gabor
wavelet filters, a feature vector has 42 components. The first two components of the
feature vector represent the location of the feature point by its coordinates (x, y). The
next 40 components are the response values of the face image at the corresponding
position for the Gabor wavelet filters. The feature vector expresses the structure, spatial
frequency and spatial relationships of the local image region surrounding the respective
feature point.

4.3 Face Recognition Model with MapReduce in the Spark Environment

From the database of feature vectors of the training images, we perform face recognition
using the KNN machine learning method. For face recognition, we perform
the following operations: extract the face in the image; extract the features of the face;
perform face recognition using the KNN method.
In this paper, we have parallelized the KNN algorithm based on
MapReduce. Figure 4 describes the face recognition module using the KNN

Fig. 4. Recognition model using MapReduce in Spark environment.

algorithm on MapReduce. The mapping phase calculates the distance of each
test image to its k nearest neighbors in the different splits of the training data. The
reduce phase merges the distances of the k nearest neighbors from all maps and creates
a list of nearest neighbors by taking the neighbors with the smallest distances.
Then, it performs majority voting in order to predict the classification results.

* Map Phase: Let FTR be a training set consisting of m splits (FTR1, FTR2, ..., FTRm)
and FT a test set consisting of n feature vectors (v1, v2, ..., vn). The pairs (FTRi, FT)
are the inputs of the map functions. Each map function calculates the distance of each
vj to the samples of FTRi. The class label and the distance of the k (determined by the
user) nearest neighbors for each sample are saved. As a result we obtain a matrix EDi of
“class, distance” pairs with dimension n × k, so that each row of the matrix contains the
distances of the corresponding test sample to its k nearest neighbors. It is noteworthy
that each row is sorted in ascending order of distance (Distneigh1 < Distneigh2 < ... <
Distneighk).

* Reduce Phase: Since the matrices of all the map processes are of the same size, we
merge these matrices by browsing through each element of all matrices. At each
position considered, we keep the class label and the distance of the element with the
smallest distance over all matrices. After merging all the EDi matrices we obtain a
single matrix EDReducer. The EDReducer matrix contains the definitive list of neighbors
(class and distance) for all the examples of FT. We then perform the majority voting of
the KNN model and determine the predicted classes for FT. As a result, the predicted
classes for the whole FT set are provided as the final output of the reduce phase.
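A minimal sketch of this map/reduce scheme, expressed with Spark RDD operations, could look as follows. The function and variable names (knn_mapreduce, train_rdd, test_vectors) and the use of mapPartitions/reduceByKey are illustrative assumptions; the sketch only mirrors the described phases and is not the authors' code.

```python
import heapq
import numpy as np
from collections import Counter
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

def knn_mapreduce(train_rdd, test_vectors, k=1):
    """train_rdd: RDD of (label, feature_vector) pairs; test_vectors: list of vectors."""
    test_bc = sc.broadcast([np.asarray(v, dtype=float) for v in test_vectors])

    def map_split(samples):
        # Map phase: within one training split, emit for every test vector j
        # its k best (distance, label) candidates.
        samples = [(lab, np.asarray(vec, dtype=float)) for lab, vec in samples]
        for j, q in enumerate(test_bc.value):
            cand = [(float(np.linalg.norm(q - vec)), lab) for lab, vec in samples]
            yield j, heapq.nsmallest(k, cand)

    # Reduce phase: merge the per-split candidate lists, keeping the k smallest distances.
    merged = (train_rdd
              .mapPartitions(map_split)
              .reduceByKey(lambda a, b: heapq.nsmallest(k, a + b)))

    # Majority voting over the final list of k neighbors of each test vector.
    predictions = merged.mapValues(
        lambda neigh: Counter(lab for _, lab in neigh).most_common(1)[0][0])
    return dict(predictions.collect())
```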

5 Experiments and Results


5.1 Testing and Results

In this paper, we conducted experiments on two image data sets: AT&T (of AT&T
Laboratories Cambridge) and Yale (of UCSD Computer Vision). For each data set, we
divided the images into two subsets: a training set (about 70%) and a testing set
(about 30%) (Table 1).

Table 1. Number of training and testing files.

Face database   Total   Training files   Testing files
Yale            165     105              60
AT&T            400     280              120

For the KNN algorithm, the critical parameter k is determined as shown in Table 2.

Table 2. The accuracy corresponding to k values when using the KNN algorithm.

                  Accuracy (%)
k parameter   Eigenfaces         Gabor wavelet
              AT&T    YALE       AT&T    YALE
1             87.50   80.00      95.83   93.33
2             82.50   73.33      86.67   93.33
3             80.83   75.00      85.83   86.67
4             79.17   78.33      81.67   86.67
5             78.33   71.67      80.00   85.00
6             73.33   81.67      78.33   83.33
7             72.50   80.00      75.00   83.33

From Table 2, the optimal k value equals 1, because the accuracy of identification is
highest in all cases. To assess the system’s execution time, we compared the non-parallel
facial recognition method with the MapReduce method in the Spark environment. We
used the Gabor wavelet with the KNN method for facial recognition. Table 3 shows the
execution time using the non-parallel recognition method and the MapReduce method
in the Spark environment. The results show that using the MapReduce method
significantly improves recognition time.
Face Recognition Using Gabor Wavelet in MapReduce and Spark 777

Table 3. Comparison of execution times for recognition.

Dataset   Method                                Execution time (seconds)
AT&T      Non-parallel                          163
          MapReduce in the Spark environment    60
Yale      Non-parallel                          63
          MapReduce in the Spark environment    30

6 Conclusion
Face recognition is an attractive area for the study of the nervous system and for
visual research on computers. Humans have the ability to recognize a familiar face
with ease thanks to the visual cortex of the cerebral hemispheres, but human memory
capacity has its limits. Computer-based identification has been studied to exploit the
advantage of large memory capacity. MapReduce is a framework for writing applications
that process large amounts of data in parallel, with high fault tolerance, across thousands
of computing nodes. Apache Spark is an open source system that enables the
construction of rapid prediction models, with calculations performed on a set of
computers to provide fast data analysis. Identification methods based on feature
extraction are designed to reduce the problem of storing overly large data, and the
Gabor wavelet transform is suitable for extracting features. In this paper, we have
developed a face recognition system which uses Gabor filters to extract features of the
face. We performed parallel processing at the extraction and facial recognition stages
with the MapReduce model in the Spark environment. Therefore, the operations of
processing and storing intermediate data are performed in main memory (RAM). In
further research, we will work on larger data sets with a larger number of system nodes,
which can enable real-time recognition.

References
1. Bellakhdhar, F., Loukil, K., Abid, M.: Face recognition approach using Gabor
Wavelets, PCA and SVM. Int. J. Comput. Sci. Issues (IJCSI), 10(2) (2013)
2. Gervei, O., Ayatollahi, A., Gervei, N.: 3D face recognition using modified PCA
methods. World Acad. Sci. Eng. Technol. 39 (2010)
3. Jain, A.K., Klare, B., Park, U.: Face recognition: some challenges in forensics. In:
2011 IEEE International Conference on Automatic Face & Gesture Recognition
and Workshops (FG 2011). IEEE (2011)
4. Turk, M.A., Pentland, A.P.: Face recognition using eigenfaces. In: IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, Proceedings
CVPR 1991. IEEE (1991)
5. Wang, C., et al.: Face recognition based on principle component analysis and sup-
port vector machine. In: 3rd International Workshop on Intelligent Systems and
Applications (ISA). IEEE (2011)

6. Maillo, J., Triguero, I., Herrera, F.: A mapreduce-based k-nearest neighbor app-
roach for big data classification. In: Trustcom/BigDataSE/ISPA, 2015, vol. 2. IEEE
(2015)
7. Cover, T., Hart, P.: Nearest neighbor pattern classification. IEEE Trans. Inf. The-
ory 13(1), 21–27 (1967)
8. Liu, L.: Performance Comparison by Running Benchmarks on Hadoop, Spark, and
HAMR. University of Delaware, Diss (2015)
9. Ranjani Priya, A.C., Sridhar, M.: Spark – an efficient framework for large scale
data analytics. Int. J. Sci. Eng. Res. (2016)
10. Zaharia, M., et al.: Spark: cluster computing with working sets. HotCloud 10(10-
10) (2010)
11. Bagwe, T., Darji, N., Gunjal, J., Vanjari, N.: Face Detection Using hadoop map-
reduce framework. Int. J. Res. Advent Technol. (2015)
12. Karau, H., et al.: Learning spark: lightning-fast big data analysis. O’Reilly Media,
Inc. (2015)
13. Divakar, M.A., Arakeri, M.P.: User authentication system using multimodal bio-
metrics and MapReduce. In: Information and Communication Technology for Sus-
tainable Development. Springer, Singapore, pp. 71–82 (2018)
Globally Optimal Parsimoniously Lifting
a Fuzzy Query Set Over a Taxonomy Tree

Dmitry Frolov1(B) , Boris Mirkin1,2 , Susana Nascimento3 , and Trevor Fenner2


1 Department of Data Analysis and Artificial Intelligence, National Research University Higher School of Economics, Moscow, Russian Federation
dfrolov@hse.ru
2 Department of Computer Science and Information Systems, Birkbeck University of London, London, UK
3 Department of Computer Science and NOVA LINCS, Universidade Nova de Lisboa, Caparica, Portugal

Abstract. This paper presents a relatively rare case of an optimiza-


tion problem in data analysis to admit a globally optimal solution by a
recursive algorithm. We are concerned with finding a most specific gen-
eralization of a fuzzy set of topics assigned to leaves of a domain taxonomy
represented by a rooted tree. The idea is to “lift” the set to its “head
subject” in the higher ranks of the taxonomy tree. The head subject
is supposed to “tightly” cover the query set, possibly bringing in some
errors, either “gaps” or “offshoots” or both. Our method globally min-
imizes a penalty function combining the numbers of head subjects and
gaps and offshoots, differently weighted. We apply this to a collection of
17645 research papers on Data Science published in 17 Springer journals
for the past 20 years. We extract a taxonomy of Data Science (TDS)
from the international Association for Computing Machinery Comput-
ing Classification System 2012. We find fuzzy clusters of leaf topics over
the text collection, optimally lift them to head subjects in TDS, and
comment on the tendencies of current research following from the lifting
results.

Keywords: Hierarchical taxonomy · Parsimony · Generalization ·


Additive fuzzy cluster · Spectral clustering · Annotated suffix tree

1 Introduction

The issue of automation of structurization and interpretation of digital text


collections is of ever-growing importance because of both practical needs and
theoretical necessity. This paper concerns an aspect of this, modeling general-
ization as a unique feature of human cognitive abilities. The existing approaches
to computational analysis of structure of text collections usually involve no gen-
eralization as a specific aim. The most popular tools for structuring text collec-
tions are cluster analysis and topic modelling. Both involve items of the same

level of granularity as individual words or short phrases in the texts, thus no


generalization as an explicitly stated goal.
Nevertheless, the hierarchical nature of the universe of meanings is reflected
in the flow of publications on text analysis. We can distinguish between at least
three directions at which the matter of generalization is addressed.
First of all, there are activities related to developing taxonomies, especially
those involving hyponymic/hypernymic relations (see, for example, [11,14], and
references therein). A recent paper [12] is devoted to supplementing a taxonomy
with newly emerging research topics.
Another direction is part of conventional activities in text summarization.
Usually, summaries are created using a rather mechanistic approach of sen-
tence extraction. There is, however, also an approach for building summaries as
abstractions of texts by combining some templates such as subject-verb-object
(SVO) triplets (see, for example, [5]).
One more direction is what can be referred to as “operational” generaliza-
tion: the authors use generalized case descriptions involving taxonomic relations
between generalized states and their parts to achieve a tangible goal such as
improving characteristics of text retrieval (see, for example, [8,13].)
This paper falls in neither of these directions, as we do not try to change
any taxonomy. We rather use a taxonomy for straightforwardly implementing
the idea of generalization. According to the Merriam-Webster dictionary, the
term “generalization” refers to deriving a general conception from particulars.
We assume that a most straightforward medium for such a derivation, a domain
taxonomy, is given as a rooted tree whose nodes are labeled by topics of the
domain. The situation of our concern is a case at which we are to generalize
a fuzzy set of taxonomy leaves representing the essence of some empirically
observed phenomenon. The most popular Computer Science taxonomy is manu-
ally developed by the world-wide Association for Computing Machinery, a most
representative body in the domain; the latest release of the taxonomy has been
published in 2012 as the ACM Computing Classification System (ACM-CCS)
[1]. We take its part related to Data Science, as presented in a slightly modified
form by adding a few leaves in [4].
The rest of the paper is organized accordingly. Section 2 presents a mathe-
matical formalization of the generalization problem as of parsimoniously lifting
of a given fuzzy leaf set to higher ranks of the taxonomy and provides a recur-
sive algorithm leading to a globally optimal solution to the problem. Section 3
describes an application of this approach to deriving tendencies in development
of the data science, that are discerned from a set of about 18,000 research papers
published by the Springer Publishers in 17 journals related to Data Science for
the past 20 years. Its subsections describe our approach to finding and gener-
alizing fuzzy clusters of research topics. In the end, we point to tendencies in
the development of the corresponding parts of Data Science, as drawn from the
lifting results.

2 Parsimoniously Lifting a Fuzzy Thematic Subset in Taxonomy: Model and Method
Mathematically, a taxonomy is a rooted tree whose nodes are annotated by
taxonomy topics. We consider the following problem. Given a fuzzy set S of tax-
onomy leaves, find a node t(S) of higher rank in the taxonomy, that covers the
set S in a most specific way. Such a “lifting” problem is a mathematical expli-
cation of the human facility for generalization, that is, “the process of forming
a conceptual form” of a phenomenon represented, in this case, by a fuzzy leaf
subset.
The problem is not as simple as it may seem to be. Consider, for the sake of simplicity,
a hard set S shown with five black leaf boxes on a fragment of a tree in Fig. 1.
Figure 2 illustrates the situation at which the set of black boxes is lifted to the root,
which is shown by blackening the root box, and its offspring, too. If we accept
that set S may be generalized by the root, this would lead to a number, four, of white
boxes being covered by the root and, thus, falling in the same concept as S even though
they do not belong to S. Such a situation will be referred to as a gap. Lifting with gaps
should be penalized. Altogether, the number of conceptual elements introduced to
generalize S here is 1 head subject, that is, the root to which we have assigned S, and
the 4 gaps that occurred just because of the topology of the tree, which imposes this
penalty. Another lifting decision is illustrated in Fig. 3: here the set is lifted just to the
root of the left branch of the tree. We can see that the number of gaps has drastically
decreased, to just 1. However, another oddity emerged: a black box on the right,
belonging to S but not covered by the root of the left branch to which the set S is
mapped. This type of error will be referred to as an offshoot. At this lifting, three new
items emerge: one head subject, one offshoot, and one gap. This is less than the number
of items emerging when lifting the set to the root (one head subject and four gaps, that
is, five), which makes it more preferable. Of course, this conclusion holds only if the
relative weight of an offshoot is less than the total relative weight of three gaps.

Fig. 1. A crisp query set, shown by black boxes, to be conceptualized in the taxonomy.

Fig. 2. Generalization of the query set from Fig. 1 by mapping it to the root, with the price of four gaps emerged at the lift.

Fig. 3. Generalization of the query set from Fig. 1 by mapping it to the root of the left branch, with the price of one gap and one offshoot emerged at this lift.

We are interested to see whether a fuzzy set S can be generalized by a node t


from higher ranks of the taxonomy, so that S can be thought of as falling within
the subtree rooted at the node t. The goal of finding an interpretable pigeon-
hole for S within the taxonomy can be formalized according to the Maximum
Parsimony (MP) principle: find one or more “head subjects” t to cover S with
the minimum number of the elements introduced at the generalization: head
subjects and gaps and offshoots.
Consider a rooted tree T representing a hierarchical taxonomy so that its
nodes are annotated with key phrases signifying various concepts. We denote
the set of all its leaves by I. The relationship between nodes in the hierarchy is
conventionally expressed using genealogical terms: each node t ∈ T is said to be
the parent of the nodes immediately descending from t in T , its children. We use
χ(t) to denote the set of children of t. Each interior node t ∈ T − I is assumed to
correspond to a concept that generalizes the topics corresponding to the leaves
I(t) descending from t, viz. the leaves of the subtree T (t) rooted at t, which is
conventionally referred to as the leaf cluster of t.
A fuzzy set on I is a mapping u of I to the non-negative real numbers that
assigns a membership value, or support, u(i) ≥ 0 to each i ∈ I. We refer to
the set Su ⊂ I, where Su = {i ∈ I : u(i) > 0}, as the base of u. In general, no
other assumptions are made about the function u, other than, for convenience,
commonly limiting it to not exceed unity. Conventional, or crisp, sets correspond
to binary membership functions u such that u(i) = 1 if i ∈ Su and u(i) = 0
otherwise.
Given a fuzzy set u defined on the set of leaves I of the tree T , one may
consider u to be a (possibly noisy) projection of a higher rank concept, u’s
“head subject”, onto the corresponding leaf cluster. Under this assumption, there
should exist a head subject node h among the interior nodes of T such that its
leaf cluster I(h) more or less coincides (up to small errors) with Su . This head
subject is the generalization of u to be found. The two types of possible errors
associated with the head subject, if it does not cover the base of u precisely, are
false positives and false negatives, referred to in this paper, as gaps and offshoots,
respectively. They are illustrated in Figs. 2 and 3. Given a head subject node h,
a gap is a node t covered by h but not belonging to the base of u, so that
u(t) = 0. In contrast, an offshoot is a node t such that u(t) > 0 but not covered
by h. Altogether, the total number of head subjects, gaps, and offshoots has to
be as small as possible. To this end, we introduce a penalty for each of these
elements. Assuming for the sake of simplicity, that the black box leaves on Fig. 1
have membership function values equal to unity, one can easily see that the total
penalty at the head subject raised to the root (Fig. 2) is equal to 1 + 4λ where
1 is the penalty for a head subject and λ, the penalty for a gap, since the lift on
Fig. 2 involves one head subject, the root, and four gaps, the blank box leaves.
Similarly, the penalty for the lift on Fig. 3 to the root of the left-side subtree is
equal to 1 + γ + λ where γ is the penalty for an offshoot, as there is one copy
of each, head subject, gap, and offshoot, in Fig. 3. Therefore, depending on the

relationship between γ and λ either lift on Fig. 2 or lift on Fig. 3 is to be chosen.


That will be the former, if 3λ < γ, or the latter, if otherwise.
A node t ∈ T is referred to as u-irrelevant if its leaf-cluster I(t) is disjoint
from the base Su . Obviously, if a node is u-irrelevant, all of its descendants are
also u-irrelevant. Consider a candidate node h in T and its meaning relative to
fuzzy set u. An h-gap is a node g of T (h), other than h, at which a loss of the
meaning has occurred, that is, g is a maximal u-irrelevant node in the sense that
its parent is not u-irrelevant. Conversely, establishing a node h as a head subject
can be considered as a gain of the meaning of u at the node. The set of all h-gaps
will be denoted by G(h).
A gap is less significant if its parent’s membership value is smaller. Therefore,
a measure v(g) of “gap importance” should also be defined, to be reflected in the
penalty function. We suggest defining the gap importance as v(g) = u(par(g)),
where par(g) is the parent of g. An alternative definition would be to scale these
values by dividing them by the number of children of par(g). However, we note
that the algorithm ParGenFS below works for any definition of gap importance.
Also, we define a summary gap importance: V(t) = Σg∈G(t) v(g).
An h-offshoot is a leaf i ∈ Su which is not covered by h, i.e., i ∉ I(h). The
set of all h-offshoots is Su − I(h). Given a fuzzy topic set u over I, a set of nodes
H will be referred to as a u-cover if: (a) H covers Su, that is, Su ⊆ ∪h∈H I(h),
and (b) the nodes in H are unrelated, i.e. I(h) ∩ I(h′) = ∅ for all h, h′ ∈ H such
that h ≠ h′. The interior nodes of H will be referred to as head subjects and the
leaf nodes as offshoots, so the set of offshoots in H is H ∩ I. The set of gaps in
H is the union of G(h) over all head subjects h ∈ H − I.
We define the penalty function p(H) for a u-cover H as:
   
p(H) = Σh∈H−I u(h) + λ Σh∈H−I Σg∈G(h) v(g) + γ Σh∈H∩I u(h).    (1)

The problem we address is to find a u-cover H that globally minimizes the


penalty p(H). Such a u-cover is the parsimonious generalization of the set u.
Before applying an algorithm to minimize the total penalty, one needs to
execute a preliminary transformation of the tree by pruning it from all the non-
maximal u-irrelevant nodes, i.e. descendants of gaps. Simultaneously, the sets of
gaps G(t) and the internal summary gap importance V(t) = Σg∈G(t) v(g) in Eq.
(1) can be computed for each interior node t. We note that the elements of Su
are in the leaf set of the pruned tree, and the other leaves of the pruned tree are
precisely the gaps. After this, our lifting algorithm ParGenFS applies. For each
node t, the algorithm ParGenFS computes two sets, H(t) and L(t), containing
those nodes in T (t) at which respectively gains and losses of head subjects occur
(including offshoots). The associated penalty p(t) is computed too.
An assumption of the algorithm is that no gain can happen after a loss.
Therefore, H(t) and L(t) are defined assuming that the head subject has not
been gained (nor therefore lost) at any of t’s ancestors. The algorithm ParGenFS
recursively computes H(t), L(t) and p(t) from the corresponding values for the
child nodes in χ(t).

Specifically, for each leaf node that is not in Su , we set both L(·) and H(·)
to be empty and the penalty to be zero. For each leaf node that is in Su , L(·) is
set to be empty, whereas H(·), to contain just the leaf node, and the penalty is
defined as its membership value multiplied by the offshoot penalty weight γ. To
compute L(t) and H(t) for any interior node t, we analyze two possible cases:
(a) when the head subject has been gained at t and (b) when the head subject
has not been gained at t. In case (a), the sets H(·) and L(·) at its children are
not needed. In this case, H(t), L(t) and p(t) are defined by:

H(t) = {t}, L(t) = G(t), p(t) = u(t) + λV (t). (2)

In case (b), the sets H(t) and L(t) are just the unions of those of its children,
and p(t) is the sum of their penalties:
  
H(t) = ∪w∈χ(t) H(w),   L(t) = ∪w∈χ(t) L(w),   p(t) = Σw∈χ(t) p(w).    (3)

To obtain a parsimonious lift, whichever case gives the smaller value of p(t)
is chosen.
When both cases give the same values for p(t), we may choose arbitrarily –
in the formulation of the algorithm below, we have chosen (a). The output of the
algorithm consists of the sets defined at the root, namely, H – the set of head
subjects and offshoots, L – the set of gaps, and p – the associated penalty.
ParGenFS Algorithm

– INPUT: u, T
– OUTPUT: H = H(root), L = L(root), p = p(root)

I. Base Case
for each leaf i ∈ I
    if u(i) > 0
        H(i) = {i}, L(i) = ∅, p(i) = γu(i)
    else
        H(i) = ∅, L(i) = ∅, p(i) = 0

II. Recursion
if u(t) + λV(t) ≤ Σw∈χ(t) p(w)
    H(t) = {t}, L(t) = G(t), p(t) = u(t) + λV(t)
else
    H(t) = ∪w∈χ(t) H(w), L(t) = ∪w∈χ(t) L(w), p(t) = Σw∈χ(t) p(w)
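For illustration, a direct Python transcription of this recursion could look as follows. The Node class, the way membership values u(t) are supplied at every node (the paper defines u on the leaves), and the default penalties λ = 0.1 and γ = 0.9 are assumptions, so this is a sketch of the recursion rather than the authors' implementation.

```python
class Node:
    def __init__(self, name, u=0.0, children=None):
        self.name, self.u, self.children = name, u, children or []

def irrelevant(t):
    # u-irrelevant: membership assumed zero at the node and throughout its subtree
    return t.u == 0 and all(irrelevant(w) for w in t.children)

def gaps_with_importance(t):
    # Maximal u-irrelevant nodes under t, paired with v(g) = u(parent(g))
    out = []
    for w in t.children:
        if irrelevant(w):
            out.append((w, t.u))
        else:
            out.extend(gaps_with_importance(w))
    return out

def pargenfs(t, lam=0.1, gamma=0.9):
    # Returns (H(t), L(t), p(t)) for the subtree rooted at t
    if not t.children:                       # Base Case
        return ([t], [], gamma * t.u) if t.u > 0 else ([], [], 0.0)
    Hb, Lb, pb = [], [], 0.0                 # case (b): head subject not gained at t
    for w in t.children:
        Hw, Lw, pw = pargenfs(w, lam, gamma)
        Hb, Lb, pb = Hb + Hw, Lb + Lw, pb + pw
    gi = gaps_with_importance(t)             # case (a): head subject gained at t
    pa = t.u + lam * sum(v for _, v in gi)
    if pa <= pb:
        return [t], [g for g, _ in gi], pa
    return Hb, Lb, pb

# Toy usage: with a small gap penalty the root absorbs the two positive leaves.
root = Node("root", u=0.5, children=[Node("a", u=0.8), Node("b", u=0.0), Node("c", u=0.7)])
H, L, p = pargenfs(root)
print([n.name for n in H], [n.name for n in L], round(p, 3))   # ['root'] ['b'] 0.55
```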

The algorithm ParGenFS leads to an optimal lifting indeed:


Theorem 1. Any u-cover H found by the algorithm ParGenFS is a (global)
minimizer of the penalty p.

Proof. We prove this result by induction over the number of nodes n in the
tree. If n = 1, there is only one node i and, in the Base Case of ParGenFS, the
definition of the sets H(i) and L(i) is such that the only possible non-empty set

is H(i) = {i}, when i ∈ Su . The penalty in this case is γu(i), which is clearly
the correct, and minimum, penalty. When i ∉ Su, the penalty is obviously zero.
Let us now assume that the statement is true for all rooted trees with fewer
than n nodes. Consider a rooted tree T (t) with n nodes, where n > 1. Each child
w of the root t is itself the root of a subtree T (w) with fewer than n nodes.
If the head subject is not gained at t, then the optimal H- and L-sets at
t are clearly the unions of the corresponding sets for the subtrees T (w); this
follows from the additive structure of the penalty function in (1). Clearly, the
minimum penalty for the subtree T(t) must be the smaller of the penalty values
p(t) = u(t) + λV(t) and p(t) = Σw∈χ(t) p(w), as it is in the algorithm. The result
now follows by induction on n.

3 Structuring and Generalizing a Collection of Research Papers

To apply the ParGenFS algorithm, we follow the steps described below.

3.1 Scholarly Text Collection

We downloaded a collection of 17685 research papers together with their abstracts,
published in 17 journals related to Data Science over the 20 years from 1998 to 2017.
We take the abstracts of these papers as a representative collection.

3.2 DST Taxonomy

We consider a taxonomy of Data Science, comprising such areas as machine


learning, data mining, data analysis, etc. We take that part of the ACM-CCS
2012 taxonomy, which is related to Data Science, and add a few leaves related
to more recent Data Science developments. The taxonomy under consideration
is presented, for example, in [4].

3.3 Scoring the Relevance Between Texts and Key Phrases

Most popular and well established approaches to scoring keyphrase-to-document


relevance include the so-called vector-space approach [10] and probabilistic text
model approach [2]. These, however, rely on individual words and text pre-
processing. We utilize a method, first developed by Pampapathi et al. [9] and
further advanced in [3], the AST method for evaluating keyphrase-to-text rel-
evance score using purely string frequency information. An advantage of the
method is that it requires no manual work, but works rather reliably, as claimed
by these authors.

3.4 Deriving Fuzzy Clusters of Taxonomy Topics

Clusters of topics should reflect co-occurrence of topics: the greater the number
of texts to which both topics t and t′ are relevant, the greater the interrelation
between t and t′, and the greater the chance for topics t and t′ to fall in the
same cluster. We have tried several popular clustering algorithms on our data.
Unfortunately, no satisfactory results have been found. Therefore, we present
here results obtained with the FADDIS algorithm developed in [7] specifically
for finding thematic clusters. This algorithm implements assumptions that are
relevant to the task:

LN Laplacian Normalization: Similarity data transformation, modeling – to an


extent – heat distribution and, in this way, making the cluster structure
sharper.
AA Additivity: Thematic clusters behind the texts are additive, so that simi-
larity values are sums of contributions by different hidden themes.
AN Non-Completeness: Clusters do not necessarily cover all the key phrases
available, as the text collection under consideration may be irrelevant to
some of them.

Co-Relevance Topic-to-Topic Similarity Score. The keyphrase-to-document matrix R
of relevance scores is converted to a keyphrase-to-keyphrase similarity matrix A for
scoring the “co-relevance” of keyphrases according to the text collection structure.
The similarity score att′ between topics t and t′ can be computed as the inner product
of the vectors of scores rt = (rtv) and rt′ = (rt′v), where v = 1, 2, ..., V = 17685. The
inner product is moderated by a natural weighting factor assigned to the texts in the
collection. The weight of text v is defined as the ratio of the number nv of topics
relevant to it and nmax, the maximum nv over all v = 1, 2, ..., V. A topic is considered
relevant to v if its relevance score is greater than 0.2 (a threshold found
experimentally, see [3]).
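As an illustration of this scoring, the sketch below computes the weighted inner products from a topics × documents relevance matrix R. The matrix itself, the function name co_relevance and the use of NumPy are assumptions made for illustration, while the 0.2 relevance threshold follows the text.

```python
import numpy as np

def co_relevance(R, threshold=0.2):
    """R: topics x documents relevance matrix; returns the topic-to-topic similarity A."""
    n_v = (R > threshold).sum(axis=0)     # number of topics relevant to each text
    w = n_v / max(n_v.max(), 1)           # text weights n_v / n_max
    A = (R * w) @ R.T                     # weighted inner products a_{tt'}
    return A

# Usage sketch: R = np.random.rand(317, 17685); A = co_relevance(R)  # 317 x 317 matrix
```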

FADDIS Thematic Clusters After computing the 317 × 317 topic-to-topic


co-relevance matrix, converting it into a topic-to-topic Lapin-transformed similar-
ity matrix, and applying FADDIS clustering, we sequentially obtained 6 clusters,
of which three clusters are obviously homogeneous. They relate to ‘Learning’,
‘Retrieval’, and ‘Clustering’. These clusters, L, R, and C, respectively, are pre-
sented in Table 1.

3.5 Results of Lifting Clusters L, R, and C

The clusters above are lifted in the DST taxonomy using the ParGenFS algorithm
with the gap penalty λ = 0.1 and offshoot penalty γ = 0.9, chosen to correspond to the
specifics of the DST tree.
The results of lifting of Cluster L are shown in Fig. 4. There are three head
subjects: Machine Learning, Machine Learning Theory, and Learning to Rank.

Table 1. Clusters L, R, C: topics with largest membership values.

Cluster L
  u(t)   Code        Topic
  0.300  5.2.3.8     Rule learning
  0.282  5.2.2.1     Batch learning
  0.276  5.2.1.1.2   Learning to rank
  0.217  1.1.1.11    Query learning
  0.216  5.2.1.3.3   Apprenticeship learning
  ...

Cluster R
  u(t)   Code        Topic
  0.211  3.4.2.1     Query representation
  0.207  5.1.3.2.1   Image representations
  0.194  5.1.3.2.2   Shape representations
  ...

Cluster C
  u(t)   Code        Topic
  0.327  3.2.1.4.7   Biclustering
  0.327  3.2.1.4.7   Biclustering
  0.286  3.2.1.4.3   Fuzzy clustering
  0.248  3.2.1.4.2   Consensus clustering
  0.220  3.2.1.4.6   Conceptual clustering
  ...

These represent the structure of the general concept “Learning” according to the
text collection under consideration. One can see from these head subjects that
the main work here still concentrates on theory and methods rather than applications.

Fig. 4. Lifting results for Cluster L: Learning. Gaps are numbered.

Similar comments can be made with respect to results of lifting of Cluster


R: Retrieval. The obtained head subjects: Information Systems and Computer
Vision, show the structure of “Retrieval” in the set of publications under
consideration. We can clearly see the tendencies of the contemporary stage of the
process. Rather than relating the term “information” to texts only, as it was in
the previous stages of the process of digitalization, visuals are becoming parts
of the concept of information.

For the results of lifting of Cluster C the corresponding taxonomy fragment


is too large, whereas the lifting results are too fragmentary. 16 (!) head subjects
was obtained: clustering, graph based conceptual clustering, trajectory cluster-
ing, clustering and classification, unsupervised learning and clustering, spec-
tral methods, document filtering, language models, music retrieval, collabora-
tive search, database views, stream management, database recovery, mapreduce
languages, logic and databases, language resources. As one can see, the core clus-
tering subjects are supplemented by methods and environments in the cluster –
this shows that the ever increasing role of clustering activities perhaps should
be better reflected in the taxonomy. At the beginning of the Data Science era, a
few decades ago, clustering was usually considered a more-or-less auxiliary part
of machine learning, the unsupervised learning. Perhaps, soon we are going to
see a new taxonomy of Data Science, in which clustering is not just an auxil-
iary instrument but rather a model of empirical classification, a big part of the
knowledge engineering. When discussing the role of classification as a knowledge
engineering phenomenon, one encounters three conventional aspects of classifi-
cation: structuring the phenomena; relating different aspects of phenomena to
each other; and shaping and keeping knowledge of phenomena. Each of them
can make a separate direction of research in knowledge engineering.

References
1. The 2012 ACM Computing Classification System. http://www.acm.org/about/
class/2012. Accessed 30 Apr 2018
2. Blei, D.: Probabilistic topic models. Commun. ACM 55(4), 77–84 (2012)
3. Chernyak, E.: An approach to the problem of annotation of research publications.
In: Proceedings of the Eighth ACM International Conference on Web Search and
Data Mining, pp. 429–434. ACM (2015)
4. Frolov, D., Mirkin, B., Nascimento, S., Fenner, T.: Finding an appropriate gen-
eralization for a fuzzy thematic set in taxonomy. Working paper WP7/2018/04,
Moscow, Higher School of Economics Publ. House, 58 p. (2018)
5. Lloret, E., Boldrini, E., Vodolazova, T., Martínez-Barco, P., Munoz, R., Palo-
mar, M.: A novel concept-level approach for ultra-concise opinion summarization.
Expert. Syst. Appl. 42(20), 7148–7156 (2015)
6. Mei, J.P., Wang, Y., Chen, L., Miao, C.: Large scale document categorization with
fuzzy clustering. IEEE Trans. Fuzzy Syst. 25(5), 1239–1251 (2017)
7. Mirkin, B., Nascimento, S.: Additive spectral method for fuzzy cluster analysis
of similarity data including community structure and affinity matrices. Inf. Sci.
183(1), 16–34 (2012)
8. Mueller, G., Bergmann, R.: Generalization of workflows in process-oriented case-
based reasoning. In: FLAIRS Conference, pp. 391–396 (2015)
9. Pampapathi, R., Mirkin, B., Levene, M.: A suffix tree approach to anti-spam email
filtering. Mach. Learn. 65(1), 309–338 (2006)
10. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval.
Inf. Process. Manag. 25(5), 513–523 (1998)
11. Song, Y., Liu, S., Wang, H., Wang, Z., Li, H.: Automatic taxonomy construction
from keywords. US Patent No. 9,501,569. Washington, DC, US Patent and Trade-
mark Office (2016)

12. Vedula, N., Nicholson, P.K., Ajwani, D., Dutta, S., Sala, A., Parthasarathy, S.:
Enriching taxonomies with functional domain knowledge. In: The 41st Inter-
national ACM SIGIR Conference on Research & Development in Information
Retrieval, pp. 745–754. ACM (2018)
13. Waitelonis, J., Exeler, C., Sack, H.: Linked data enabled generalized vector space
model to improve document retrieval. In: Proceedings of NLP & DBpedia 2015
Workshop in Conjunction with 14th International Semantic Web Conference
(ISWC), vol. 1486. CEUR-WS (2015)
14. Wang, C., He, X., Zhou, A.: A Short survey on taxonomy learning from text cor-
pora: issues, resources and recent advances. In: Proceedings of the 2017 Conference
on Empirical Methods in Natural Language Processing, pp. 1190–1203 (2017)
K-Medoids Clustering Is Solvable in
Polynomial Time for a 2d Pareto Front

Nicolas Dupin1(B) , Frank Nielsen2 , and El-Ghazali Talbi3


1 LRI, Université Paris-Sud, Université Paris-Saclay, Paris, France
dupin@lri.fr
2 Sony Computer Science Laboratories Inc., Tokyo, Japan
Frank.Nielsen@acm.org
3 Univ. Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, 59000 Lille, France
el-ghazali.talbi@univ-lille.fr

Abstract. The k-medoids problem is a discrete sum-of-squares clustering problem, which is known to be more robust to outliers than k-means clustering. As an optimization problem, k-medoids is NP-hard. This paper examines k-medoids clustering in the case of a two-dimensional Pareto front, as generated by bi-objective optimization approaches. A characterization of optimal clusters is provided in this case. This allows k-medoids to be solved to optimality in polynomial time using a dynamic programming algorithm. More precisely, for N points to cluster, the complexity of the algorithm is proven to be O(N^3) in time and O(N^2) in memory space. This algorithm can also be used to minimize jointly the number of clusters and the dissimilarity of clusters. This bi-objective extension is also solvable to optimality in O(N^3) time and O(N^2) memory space, which is useful for choosing the appropriate number of clusters in real-life applications. Parallelization issues are also discussed, to speed up the algorithm in practice.

Keywords: Bi-objective optimization · Clustering algorithms · K-medoids · Euclidean sum-of-squares clustering · Pareto front · Dynamic programming · Bi-objective clustering

1 Introduction

This paper is motivated by real-life applications of bi-objective optimization. Some optimization problems can be driven by more than one objective function, with conflicts among objectives. For example, one may minimize financial costs while maximizing the robustness to uncertainties [4,16]. In such cases, higher levels of robustness are likely to induce financial over-costs. Pareto dominance, preferring one solution to another if it is better for all the objectives, is a weak dominance rule. With conflicting objectives, several non-dominated solutions can be generated; these efficient solutions are the best compromises.

A Pareto front is the projection in the objective space of the non-dominated


solutions [7].
Bi-objective optimization approaches may generate large Pareto fronts, for a trade-off evaluation by a decision maker. The problem here is to select K good compromise solutions from N ≫ K non-dominated solutions while maximizing the representativity of these K solutions. This problem can be seen as an application of clustering algorithms, partitioning the N elements into K subsets with maximal similarity and giving a representative element of the optimal clusters. When selecting best compromise solutions for human decision makers, one deals with small values K < 10. We note that partial Pareto fronts are used inside population meta-heuristics [21]. Clustering a Pareto front is also useful in this context to archive representative solutions of the partial Pareto front [23]. For such applications, the values of K are larger than the previous ones.

k-means clustering is one of the most famous unsupervised learning problems, and has been widely studied in the literature since the seminal algorithm provided by Lloyd in [13]. The k-medoids problem, the discrete variant of the k-means problem, fits our application better, with the dissimilarity measured around a representative solution [11]. While k-medoids clustering is more combinatorial than k-means clustering, it is known to be more robust to noise and outliers [10]. Both the k-medoids and k-means problems are NP-hard in the general and planar cases; we refer to [1] for k-means and [9] for k-medoids.
Lloyd's algorithm can be extended to heuristically solve k-medoids problems. PAM (Partitioning Around Medoids), CLARA (Clustering LARge Applications) and CLARANS (Clustering Large Applications based upon RANdomized Search) are such heuristics for k-medoids clustering [19]. Hybrid and genetic algorithms were also investigated in [20]. The previous heuristics converge only to local minima, without any guarantee of reaching a global minimum.
This paper proves that the special case of k-medoids clustering in a two
dimensional (2-d) Pareto front is solvable to optimality in polynomial time, using
a dynamic programming (DP) algorithm. We note similarities with other works.
k-center and k-median problems are also solvable in polynomial time thanks
to a DP algorithm in [5]. k-means clustering in one dimension, which can be
projected as an affine Pareto front in dimension 2, is also polynomial with DP
[8]. A DP heuristic applies k-means in 2-d Pareto fronts [6]. Lastly, there are
similarities to maximize the quality of discrete representations of Pareto sets
with the hypervolume measure, giving rise to the Hypervolume Subset Selection
(HSS). HSS is solvable in polynomial time for 2-d Pareto fronts thanks to a DP
algorithm presented in [2].
In Sect. 2, we define formally the problem and fix the notation. In Sect. 3,
intermediate results and a characterization of optimal clusters are presented.
In Sect. 4, it is described how to compute efficiently the costs of the previous
clusters. In Sect. 5, a DP algorithm is presented with a proven polynomial com-
plexity. In Sect. 6, it is discussed how to choose the value K using properties of
the DP resolution. In Sect. 7, our contributions are summarized, discussing also
future directions of research.

2 Problem Statement and Notation


A 2-d discrete Pareto front can be defined as a set E = {x_1, ..., x_N} of N elements of R^2 such that for all i ≠ j, x_i I x_j, defining the binary relations I, ≺, ≼ for all y = (y^1, y^2), z = (z^1, z^2) ∈ R^2 with:

  y ≺ z ⟺ y^1 < z^1 and y^2 > z^2   (1)
  y ≼ z ⟺ y ≺ z or y = z            (2)
  y I z ⟺ y ≺ z or z ≺ y            (3)

We note that the convention leading to the definitions of I, ≺ assumes the minimization of two objectives, which is not a loss of generality. Such a set E can be extracted from any subset of R^2 using an output-sensitive algorithm [14], or generated by bi-objective optimization approaches, exact methods [7] and also meta-heuristics [21]. We consider in this paper the Euclidean distance:

  d(y, z) = ||y − z|| = sqrt((y^1 − z^1)^2 + (y^2 − z^2)^2), ∀y = (y^1, y^2), z = (z^1, z^2) ∈ R^2   (4)

Π_K(E) denotes the set of the possible partitions of E into K subsets:

  Π_K(E) = { P ⊂ P(E) | ∀p ≠ p' ∈ P, p ∩ p' = ∅, ∪_{p∈P} p = E and |P| = K }   (5)

k-medoids clustering is a combinatorial optimization problem indexed by Π_K(E). It minimizes the sum over the K clusters of the dissimilarity measure f_mdd(P) = min_{c∈P} Σ_{x∈P} ||x − c||^2, i.e. f_mdd is the minimal sum of the squared distances from one chosen point of P, the medoid, to the other points of P:

  min_{π∈Π_K(E)} Σ_{P∈π} f_mdd(P)   (6)
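For concreteness, the dissimilarity of a single cluster can be evaluated directly as in the following minimal Python helper; this is an illustrative sketch (names and data layout are ours), not part of the original formulation.

def fmdd(cluster):
    """Dissimilarity f_mdd of one cluster: minimal (over candidate medoids c)
    sum of squared Euclidean distances from c to every point of the cluster."""
    def sq_dist(y, z):
        return (y[0] - z[0]) ** 2 + (y[1] - z[1]) ** 2
    return min(sum(sq_dist(x, c) for x in cluster) for c in cluster)

# Example: a small cluster of three points of a 2-d Pareto front.
print(fmdd([(1.0, 5.0), (2.0, 3.0), (4.0, 2.0)]))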

3 Optimal Property of Interval Clustering


In this section, optimal clusters are characterized for k-medoids clustering in a 2d
Pareto Front. Intermediate lemmas with elementary proofs are first mentioned.
Lemmas 1, 2 and 3 are proven in [5].
Lemma 1. ≼ is an order relation, and ≺ is a transitive relation:

  ∀x, y, z ∈ R^2, x ≺ y and y ≺ z ⟹ x ≺ z   (7)

Lemma 2 (Total order). Points (x_i) can be indexed such that:

  ∀(i_1, i_2) ∈ [[1; N]]^2, i_1 < i_2 ⟹ x_{i_1} ≺ x_{i_2}   (8)
  ∀(i_1, i_2) ∈ [[1; N]]^2, i_1 ≤ i_2 ⟹ x_{i_1} ≼ x_{i_2}   (9)

This property is stronger than the property that ≼ induces a total order in E. Furthermore, the complexity of the sorting reindexation is in O(N log N).

[Fig. 1 here: scatter plot of the 15 points x_1, ..., x_15 of a 2-d Pareto front in the (Obj1, Obj2) plane.]

Fig. 1. Illustration of a 2-dimensional Pareto front with 15 points, minimizing two objectives and indexing the points as in Lemma 2

Lemma 3. Let x1 , x2 , x3 ∈ R2 .

x1 ≺ x2 ≺ x3 =⇒ d(x1 , x2 ) < d(x1 , x3 ) (10)

Lemma 4. Let C_1, ..., C_K be an optimal partition for k-medoids clustering; we denote by c_1, ..., c_K the medoids. Let x_n ∈ E, and denote by C_j the cluster to which it is assigned in the optimal clustering. Then, we have:

  ∀k ∈ [[1, K]], d(x_n, c_j) ≤ d(x_n, c_k)   (11)

Proof. Ad absurdum, an elementary proof easily constructs a better partition if an inequality (11) is not fulfilled. This result is well known: Lloyd's algorithm applied to k-medoids is also a hill-climbing heuristic. Hence, local minima fulfill (11), which is a more general property. □

Proposition 1 (Optimal interval clustering). We suppose that points (x_i) are sorted following Lemma 2. The optimal solutions of the minimization problem (6) use only clusters C_{i,i'} = {x_j}_{j∈[[i,i']]} = {x ∈ E | ∃j ∈ [[i, i']], x = x_j}.

Proof. We prove the result by induction on K ∈ N. For K = 1, the optimal cluster is E = {x_j}_{j∈[[1,N]]}. Suppose K > 1 and the Induction Hypothesis (IH) that Proposition 1 is true for (K−1)-medoids clustering. We suppose we have an optimal clustering partition. We denote by C the cluster of x_N, and by x_c the medoid of C. Let A = {i ∈ [[1, N]] | ∀k ∈ [[i, N]], x_k ∈ C}. A is a subset of N, non-empty as N ∈ A, so it has a minimum. Let j = min{i ∈ [[1, N]] | ∀k ∈ [[i, N]], x_k ∈ C}. If j = 1, E = C = {x_j}_{j∈[[1,N]]} and the result is proven. We suppose now j > 1. j − 1 ∉ A, as j − 1 ∈ A would be in contradiction with j = min A. We denote by C' ≠ C the cluster of x_{j−1}, and by x_{c'} the medoid of C'. Necessarily c' < j.

We prove by contradiction that j − 1 < c, supposing c < j. It implies c < j − 1 as x_{j−1} ∉ C, and x_c ≺ x_j with Lemma 2. Lemma 4 applied to x_{j−1} and C' implies d(x_{c'}, x_{j−1}) ≤ d(x_c, x_{j−1}). With Lemma 3, this is possible only if c < c'. We would thus have, with Lemma 2, x_c ≺ x_{c'} ≼ x_{j−1} ≺ x_j. Applying Lemma 3, d(x_{c'}, x_j) < d(x_c, x_j), which is a contradiction with Lemma 4 applied to x_j and C. This proves j − 1 < c, and Lemma 2 ensures that x_{c'} ≼ x_{j−1} ≺ x_j ≼ x_c.

We now prove that for all j' ≤ j − 1, x_{j'} ∉ C. Let j' < j − 1. If j' ≤ c', Lemma 3 with x_{j'} ≼ x_{c'} ≺ x_c implies directly d(x_{c'}, x_{j'}) < d(x_c, x_{j'}), and Lemma 4 implies x_{j'} ∉ C. If j' > c', we have x_{c'} ≺ x_{j'} ≺ x_{j−1} ≺ x_c, and d(x_{c'}, x_{j−1}) ≤ d(x_c, x_{j−1}) with Lemma 4 applied to x_{j−1} and C'. Lemma 3 implies d(x_{c'}, x_{j'}) < d(x_{c'}, x_{j−1}) and d(x_c, x_{j−1}) < d(x_c, x_{j'}). Hence d(x_{c'}, x_{j'}) < d(x_{c'}, x_{j−1}) ≤ d(x_c, x_{j−1}) < d(x_c, x_{j'}), so d(x_{c'}, x_{j'}) < d(x_c, x_{j'}) and x_{j'} ∉ C with Lemma 4.

This implies that for all j' < j, x_{j'} ∉ C. On one hand, it follows that C = {x_l}_{l∈[[j,N]]}. On the other hand, the other clusters are optimal for E' = E − C with (K−1)-medoids clustering. Applying the IH to the (K−1)-medoids clustering of E' proves that the optimal clusters are of the shape C_{i,i'} = {x_j}_{j∈[[i,i']]}. □

4 Computing the Costs of Interval Clustering


We define c_{i,i'} as the cost of cluster C_{i,i'} for the k-medoids clustering. This section aims to compute efficiently the costs c_{i,i'} for all i < i'. By definition:

  ∀i < i', c_{i,i'} = f_mdd(C_{i,i'}) = min_{j∈[[i,i']]} Σ_{k∈[[i,i']]} ||x_j − x_k||^2   (12)

The naive computation of c_{i,i'} has a time complexity in O((i' − i)^2). Computing c_{i,i'} independently for all i < i', this induces a complexity in O(N^4) time and in O(N^2) memory space. To improve the complexity, Algorithm 1 computes for all i ≤ c ≤ i' the values d_{i,c,i'}, defined as the costs of k-medoids for cluster C_{i,i'} with medoid c, using the induction relation (14) to compute each element d_{i,c,i'} in O(1). The matrix c_{i,i'} is then computed from d_{i,c,i'} using (15):

  ∀i ≤ c ≤ i', d_{i,c,i'} = Σ_{k=i}^{i'} ||x_k − x_c||^2   (13)
  ∀i ≤ c ≤ i' < N, d_{i,c,i'+1} = d_{i,c,i'} + ||x_{i'+1} − x_c||^2   (14)
  ∀i ≤ i', c_{i,i'} = min_{l∈[[i,i']]} d_{i,l,i'}   (15)

Proposition 2. Using Algorithm 1, computing the matrix of costs c_{i,i'} for all i < i' has a complexity in O(N^3) time and in O(N^2) memory space.

Algorithm 1: Computation of matrix c_{i,i'} for the k-medoids problem

  define matrix c with c_{i,i'} = 0 for all (i, i') ∈ [[1; N]]^2 with i ≤ i'
  define matrix d with d_{i,l,i'} = 0 for all (i, l, i') ∈ [[1; N]]^3 with i ≤ l ≤ i'
  for l = 1 to N            // consider subsets of cardinality l
    for i = 1 to N − l
      for k = i to i + l
        d_{i,k,i+l} = d_{i,k,i+l−1} + ||x_{i+l} − x_k||^2
      end for
      compute c_{i,i+l} = min_{k∈[[i,i+l]]} d_{i,k,i+l}
    end for
  end for
  return matrix c_{i,i'}

Proof. The induction formula (14) uses only values d_{i,k,i+l'} with l' < l. In Algorithm 1, it is easy to show by induction that d_{i,k,i+l}, and also c_{i,i+l}, has its final value for all l ∈ [[1, N]] at the end of the for loops from k = i to i + l.

Let us analyze the complexity. The space complexity is defined by the sizes of the matrices c_{i,i'} and d_{i,c,i'}. A priori, the space complexity is in O(N^3), given by d_{i,c,i'}. Actually, for a given i, i + l, the computation of c_{i,i+l} in the inner loop requires only d_{i,c,i+l} and d_{i,c,i+l−1} in memory. Values d_{j,c,j+m} for all j and m < l − 1 can be deleted. The order of computations, increasing l in the first loop, allows keeping only a few variables d_{j,c,j+m}, in the order of 2 × l ≤ 2N, to compute c_{i,i+l}. Hence, the space complexity is in O(N^2) memory space.

Let α be the time to compute d_{i,k,i+l} = d_{i,k,i+l−1} + ||x_{i+l} − x_k||^2. Defining β as the time to compute an operation like min(d_{i,k,i'}, d_{i,k+1,i'}) and to store the result, the time to compute c_{i,i+l} = min_{k∈[[i,i+l]]} d_{i,k,i+l} is β·l. Hence:

  T_N = Σ_{l=1}^{N} Σ_{i=1}^{N−l} ( β·l + Σ_{k=i}^{i+l} α ) = Σ_{l=1}^{N} Σ_{i=1}^{N−l} ( β·l + (l+1)·α ) = O(N^3)   □
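As an illustration of how the interval costs can be computed within the stated O(N^3) time and O(N^2) memory bounds, here is a minimal Python sketch (our transcription, not the authors' code). It replaces the in-place induction (14) by per-medoid prefix sums of squared distances, which attains the same bounds; the names are ours, points are assumed sorted as in Lemma 2 and stored as (obj1, obj2) tuples with 0-based indexing.

def interval_costs(points):
    """c[i][j]: cost of the interval cluster {x_i, ..., x_j} (0-based, i <= j)."""
    n = len(points)
    sq = lambda y, z: (y[0] - z[0]) ** 2 + (y[1] - z[1]) ** 2
    # D[k][j] = sum_{m <= j} ||x_m - x_k||^2: prefix sums of squared distances
    # to the candidate medoid x_k (O(N^2) time and memory).
    D = [[0.0] * n for _ in range(n)]
    for k in range(n):
        acc = 0.0
        for j in range(n):
            acc += sq(points[j], points[k])
            D[k][j] = acc
    # c[i][j] = min over medoids k in [i, j] of the within-interval cost (O(N^3)).
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            c[i][j] = min(D[k][j] - (D[k][i - 1] if i > 0 else 0.0)
                          for k in range(i, j + 1))
    return c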

To speed up the computation of Algorithm 1, a parallel implementation is possible. Once the value l = i' − i is fixed, the second loop consists of independent computations using results with lower values of l. This loop can be parallelized using OpenMP in a shared-memory environment, synchronizing only when l is increased. The third loop can also be parallelized; a common parallelization with the second loop induces a better load balancing. A parallel implementation can also be designed in a distributed environment with many processors using the Message Passing Interface (MPI, [15]). Partitioning [[1, N]] into different subsets [[i, i']] allows each processor to compute locally the values c_{j,j'} for i ≤ j ≤ j' ≤ i'. With neighbor processes holding neighboring sub-intervals, communications to merge sub-problems are easy. In these parallel schemes, load balancing issues are crucial, and appropriate decomposition strategies should be used [18].

5 Dynamic Programming Algorithm

Let c_{i,i'} for i ≤ i' be the elementary costs computed with Algorithm 1. We define C_{i,k} as the optimal cost of the k-medoids clustering with k clusters among the points [[1, i]], for all i ∈ [[1, N]] and k ∈ [[1, K]]. We have the following induction relation:

  ∀i ∈ [[1, N]], ∀k ∈ [[2, K]], C_{i,k} = min_{j∈[[1,i]]} ( C_{j−1,k−1} + c_{j,i} )   (16)

with the convention C_{0,k} = 0 for all k ≥ 0. The case k = 1 is directly given by:

  ∀i ∈ [[1, N]], C_{i,1} = c_{1,i}   (17)

These relations allow computing the optimal values of C_{i,k} by dynamic programming in Algorithm 2. C_{N,K} is the optimal solution of the k-medoids problem; backtracking on the matrix (C_{i,k})_{i,k} yields the optimal partitioning clusters.

Algorithm 2: k-medoids clustering in a 2d Pareto front

  initialize matrix c with c_{i,j} = 0 for all (i, j) ∈ [[1; N]]^2
  initialize matrix C with C_{i,k} = 0 for all i ∈ [[0; N]], k ∈ [[1; K]]
  initialize P = nil, a set of sub-intervals of [[1; N]]
  sort E following the order of Lemma 2
  compute c_{i,j} for all (i, j) ∈ [[1; N]]^2 with Algorithm 1

  for i = 1 to N            // construction of the matrix C
    set C_{i,1} = c_{1,i}
    for k = 2 to K
      set C_{i,k} = min_{j∈[[1,i]]} ( C_{j−1,k−1} + c_{j,i} )
    end for
  end for

  i = N                     // backtrack phase
  for k = K to 1 with increment k ← k − 1
    find j ∈ [[1, i]] such that C_{i,k} = C_{j−1,k−1} + c_{j,i}
    add [[j, i]] to P
    i = j − 1
  end for
  return C_{N,K}, the optimal cost, and the partition P giving the cost C_{N,K}
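A corresponding Python sketch of the dynamic program and backtracking phase is given below; it reuses the interval_costs helper from the previous sketch and uses 0-based indices. It is an illustrative transcription under these assumptions, not the authors' implementation.

def kmedoids_pareto_front(points, K):
    """Optimal k-medoids clustering of a sorted 2-d Pareto front into at most K
    interval clusters; returns (optimal cost, list of (first, last) index pairs)."""
    n = len(points)
    c = interval_costs(points)                     # interval-cluster costs (previous sketch)
    INF = float("inf")
    C = [[INF] * (K + 1) for _ in range(n)]        # C[i][k]: best cost for x_0..x_i, k clusters
    arg = [[0] * (K + 1) for _ in range(n)]        # start index of the last cluster
    for i in range(n):
        C[i][1], arg[i][1] = c[0][i], 0
        for k in range(2, K + 1):
            for j in range(i + 1):
                prev = C[j - 1][k - 1] if j >= 1 else 0.0   # convention C_{0,k} = 0
                if prev + c[j][i] < C[i][k]:
                    C[i][k], arg[i][k] = prev + c[j][i], j
    clusters, i, k = [], n - 1, K                  # backtrack on the stored arguments
    while i >= 0 and k >= 1:
        j = arg[i][k]
        clusters.append((j, i))                    # cluster {x_j, ..., x_i}
        i, k = j - 1, k - 1
    return C[n - 1][K], clusters[::-1]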

Theorem 1. Let E = {x_1, ..., x_N} be a subset of N points of R^2 such that for all i ≠ j, x_i I x_j. Clustering E with k-medoids is solvable to optimality in polynomial time with Algorithm 2, with a complexity in O(N^3) time and in O(N^2) memory space.

Proof. The formula (16) uses only values C_{i,k'} with k' < k in Algorithm 2. Induction proves that C_{i,k} has its final value for all i ∈ [[1, N]] at the end of the for loops from k = 2 to K. C_{N,K} is thus, at the end of these loops, the optimal value of the k-medoids clustering among the N points of E. The backtracking phase searches for the equalities C_{i,k} = C_{j'−1,k−1} + c_{j',i} to return the optimal clusters C_{j',i}. Let us analyze the complexity. Sorting and indexing the elements of E following Lemma 2 has a complexity in O(N log N). The computation of the matrix c_{i,i'} has a complexity in O(N^3) thanks to Algorithm 1 and Proposition 2. The construction of the matrix C_{i,k} requires N × K computations of min_{j∈[[1,i]]} ( C_{j−1,k−1} + c_{j,i} ), each of which is in O(N); the complexity of this phase is in O(K·N^2). The backtracking phase requires K computations having a complexity in O(N); its complexity is in O(K·N). The bottleneck is the computation of the matrix c_{i,i'} as K < N. Hence, the complexity of Algorithm 2 is in O(N^3) time and in O(N^2) memory space because of this initialization phase. □
We note that a similar DP algorithm solves HSS and 1-d k-means, with a similar complexity in O(K·N^2) time and O(K·N) memory space in [2,22]. In both cases, the time complexity of the DP was improved: time complexity in O(K·N) using O(N) memory space for 1-d k-means in [8], and time complexity for HSS in O(K·N + N log N) since [3] and in O(K·(N − K) + N log N) since [12]. Similarly, a perspective is to speed up the construction of the matrix C_{i,k}. However, the complexity of Algorithm 2 is mainly due to the initialization, which reduces the impact of such a speed-up.
The practical efficiency can also be improved with a parallel implementation. The parallelization of Algorithm 1, as previously described, is crucial: the initialization phase is indeed the bottleneck for the complexity. The backtracking phase is sequential, but it has the lowest complexity. The second phase, constructing the matrix C, can also be parallelized. Indeed, once the C_{i,k} are computed for all i ≤ N and a given k, all the C_{i,k+1} for i ≤ N can be computed independently. A construction of the matrix C, line by line with k increasing, gives K iterations with N independent computations that can be distributed over several processors. This parallel scheme is straightforward to implement in a shared-memory environment with OpenMP. For a distributed implementation with MPI, the C_{i,k} computations for all i ≤ N require only the C_{i,k−1} in memory for all i ≤ N. At most one N-dimensional array must be stored on each processor, and MPI AllGather operations at each iteration k distribute the results that are required for the next iteration.

6 Bi-Objective Clustering, How to Choose K?


A crucial point for the real-life application of clustering is to select an appropriate value of K. A too small value of K may miss that an instance is well captured with K + 1 representative clusters, whereas a too high value of K gives redundant representatives. The real-life application seeks the best compromise between the minimization of K and the minimization of the dissimilarity among the clusters. While for a fixed value of K, k-medoids clustering is solvable in polynomial time, the question here is to analyze the complexity of the bi-objective extension. Actually, the bi-objective extension has the same complexity as the k-medoids clustering problem for a fixed value of K:

Theorem 2. Let E = {x_1, ..., x_N} be a subset of N points of R^2 such that for all i ≠ j, x_i I x_j. The bi-objective clustering extension, minimizing jointly the number of clusters k and the k-medoids measure, is also solvable to optimality in polynomial time, with a complexity in O(N^3) time and in O(N^2) memory space.

Proof. Using Algorithm 2 with K = N, the computation has a complexity of O(N^3), the complexity of both the initialization of matrix c and the construction of matrix C. The Pareto front of the bi-objective optimization is included in the union of the points {(k, C_{N,k})}_k, similarly to the computation of Pareto fronts with the ε-constraint method. Removing the dominated points has a complexity in O(N), negligible compared to O(N^3). This computes the Pareto front, solution of the bi-objective problem minimizing jointly the number of clusters and the dissimilarity of clusters with k-medoids. Furthermore, the optimal clusters related to the points {(k, C_{N,k})}_k can be computed with backtracking operations from C_{N,k} as in Algorithm 2. Each point computation requiring O(N^2) operations, all the solutions can be computed in O(N^3). □

Theorem 2 allows computing the whole Pareto front {(k, C_{N,k})}_k with the same complexity as only one point of this Pareto front. Searching for good values of k, the elbow technique, graph test or gap test, as described in [17], apply to select a good value of k from the Pareto front of couples {(k, C_{N,k})}_k.
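As a hedged illustration of this procedure, the following sketch runs the DP once with K = N, extracts the non-dominated couples (k, C_{N,k}), and picks k with a simple elbow rule based on second differences of the cost curve; the specific elbow rule is our illustrative choice, since the text only points to elbow, graph and gap tests in general.

def choose_k(points):
    """One DP run with K = N gives the whole curve {(k, C_{N,k})}; dominated
    points (no strict cost decrease) are removed and k is picked at the elbow."""
    n = len(points)
    c = interval_costs(points)
    INF = float("inf")
    C = [[INF] * (n + 1) for _ in range(n)]
    for i in range(n):
        C[i][1] = c[0][i]
        for k in range(2, n + 1):
            C[i][k] = min((C[j - 1][k - 1] if j >= 1 else 0.0) + c[j][i]
                          for j in range(i + 1))
    curve = [(k, C[n - 1][k]) for k in range(1, n + 1)]
    front = [p for t, p in enumerate(curve) if t == 0 or p[1] < curve[t - 1][1]]
    if len(front) < 3:
        return front, front[-1][0]
    # elbow: point of the front with the largest second difference of the cost
    t = max(range(1, len(front) - 1),
            key=lambda t: front[t - 1][1] - 2 * front[t][1] + front[t + 1][1])
    return front, front[t][0]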

7 Conclusion and Perspectives


This paper examined properties of the k-medoids problem in the special case of a discrete set of non-dominated points in a two-dimensional Euclidean space. A characterization of optimal clusters as interval clusters is proven. This allows solving k-medoids to optimality with a dynamic programming algorithm in this special case. A polynomial complexity is proven, in O(N^3) time and O(N^2) memory space. The bi-objective extension, minimizing jointly the number of clusters and the dissimilarity of clusters, is also solvable to optimality in O(N^3) time and O(N^2) memory space, which is useful to define how many clusters to use for the real-life application. Parallelization issues are discussed, to speed up the algorithms in practice.

The complexity in O(N^3) may be a bottleneck for very large Pareto fronts, which opens new perspectives. A first perspective is to accelerate the convergence of the exact algorithm, similarly to [8,12]. A second perspective is to derive specific heuristics seeking good-quality solutions in short resolution time. For both issues, parallelization is a complementary perspective to speed up the resolution of k-medoids clustering for large Pareto fronts.

References
1. Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-
of-squares clustering. Mach. Learn. 75(2), 245–248 (2009)
2. Auger, A., Bader, J., Brockhoff, D., Zitzler, E.: Investigating and exploiting the
bias of the weighted hypervolume to articulate user preferences. In: Proceedings of
GECCO 2009, pp. 563–570. ACM (2009)
3. Bringmann, K., Friedrich, T., Klitzke, P.: Two-dimensional subset selection for
hypervolume and epsilon-indicator. In: Annual Conference on Genetic and Evolu-
tionary Computation, pp. 589–596. ACM (2014)

4. Dupin, N.: Modélisation et résolution de grands problèmes stochastiques combina-


toires: application à la gestion de production d’électricité. Ph.D. thesis, University
Lille 1 (2015)
5. Dupin, N., Nielsen, F., Talbi, E.: Clustering in a 2d pareto front: p-median and
p-center are solvable in polynomial time, pp. 1–24 (2018). arXiv:1806.02098
6. Dupin, N., Nielsen, F., Talbi, E.: Dynamic programming heuristic for k-means
clustering among a 2-dimensional pareto frontier. In: 7th International Conference
on Metaheuristics and Nature Inspired Computing, pp. 1–8 (2018)
7. Ehrgott, M., Gandibleux, X.: Multiobjective combinatorial optimization-theory,
methodology, and applications. In: Multiple Criteria Optimization: State of the
Art Annotated Bibliographic Surveys, pp. 369–444. Springer (2003)
8. Grønlund, A., Larsen, K.G., Mathiasen, A., Nielsen, J.S., Schneider, S., Song, M.:
Fast exact k-means, k-medians and bregman divergence clustering in 1d (2017).
arXiv preprint arXiv:1701.07204
9. Hsu, W., Nemhauser, G.: Easy and hard bottleneck location problems. Discret.
Appl. Math. 1(3), 209–215 (1979)
10. Jain, A.: Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 31(8),
651–666 (2010)
11. Kaufman, L., Rousseeuw, P.: Clustering by Means of Medoids (1987)
12. Kuhn, T., Fonseca, C.M., Paquete, L., Ruzika, S., Duarte, M.M., Figueira, J.R.:
Hypervolume subset selection in two dimensions: formulations and algorithms.
Evol. Comput. 24(3), 411–425 (2016)
13. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2),
129–137 (1982)
14. Nielsen, F.: Output-sensitive peeling of convex and maximal layers. Inf. Process.
Lett. 59(5), 255–259 (1996)
15. Nielsen, F.: Introduction to HPC with MPI for Data Science. Springer (2016)
16. Peugeot, T., Dupin, N., Sembely, M.J., Dubecq, C.: MBSE, PLM, MIP and robust
optimization for system of systems management, application to SCCOA French air
defense program. In: Complex Systems Design & Management, pp. 29–40. Springer
(2017)
17. Rasson, J.P., Kubushishi, T.: The gap test: an optimal method for determining the
number of natural classes in cluster analysis. In: New Approaches in Classification
and Data Analysis, pp. 186–193. Springer (1994)
18. Saule, E., Baş, E., Çatalyürek, Ü.: Load-balancing spatially located computations
using rectangular partitions. J. Parallel Distrib. Comput. 72(10), 1201–1214 (2012)
19. Schubert, E., Rousseeuw, P.: Faster k-Medoids clustering: improving the PAM,
CLARA, and CLARANS algorithms (2018). arXiv preprint arXiv:1810.05691
20. Sheng, W., Liu, X.: A genetic k-medoids clustering algorithm. J. Heuristics 12(6),
447–466 (2006)
21. Talbi, E.: Metaheuristics: From Design to Implementation. Wiley (2009)
22. Wang, H., Song, M.: Ckmeans. 1d. dp: optimal k-means clustering in one dimension
by dynamic programming. The R J. 3(2), 29 (2011)
23. Zio, E., Bazzo, R.: A clustering procedure for reducing the number of representative
solutions in the Pareto Front of multiobjective optimization problems. Eur. J. Oper.
Res. 210(3), 624–634 (2011)
Learning Sparse Neural Networks via ℓ0 and Tℓ1 by a Relaxed Variable Splitting Method with Application to Multi-scale Curve Classification

Fanghui Xue and Jack Xin(B)

Department of Mathematics, UC Irvine, Irvine, CA 92697, USA


{fanghuix,jack.xin}@uci.edu

Abstract. We study sparsification of convolutional neural networks (CNN) by a relaxed variable splitting method with ℓ0 and transformed-ℓ1 (Tℓ1) penalties, with application to complex curves such as texts written in different fonts, and words written with trembling hands simulating those of Parkinson's disease patients. The CNN contains 3 convolutional layers, each followed by a maximum pooling, and finally a fully connected layer which contains the largest number of network weights. With the ℓ0 penalty, we achieved over 99% test accuracy in distinguishing shaky vs. regular fonts or handwritings, with above 86% of the weights in the fully connected layer being zero. Comparable sparsity and test accuracy are also reached with a proper choice of the Tℓ1 penalty.

Keywords: Convolutional neural network · Sparsification · Multi-scale curves · Classification

1 Introduction
Sparsification of neural networks is one of the effective complexity reduction methods to improve efficiency and generalizability [3,4]. In this paper, we sparsify convolutional neural networks (CNN) for classifying curves with multi-scale structures. Such curves arise in the handwriting of people with neurological disorders, e.g. Parkinson's disease (PD) patients, and in neuropsychological exams. Distinguishing the handwriting of normal and PD subjects computationally will greatly help diagnosis and reduce physicians' workload in evaluations.

People with PD tend to lose control of their hands, and their writing or drawing shows oscillatory behavior, as shown in Fig. 2, a century-old image available online. Such oscillatory features can be learned during CNN training. Since we do not have a large amount of PD handwriting, we generate on the computer a large number of oscillatory shapes that mimic the shaky writing of PD subjects. Indeed, we found that a CNN is quite successful for this task and can reach accuracy as high as 99% on our synthetic data set with three convolution layers and one fully connected layer, as shown in Fig. 1. However, we also found

that there is a lot of redundancy in the weights of the trained CNNs, especially in the fully connected layer, where we aim to significantly sparsify the network weights with minimal loss of accuracy.

Since the natural sparsity-promoting penalty ℓ0 is discontinuous, we adopt the relaxed variable splitting method (RVSM, [3]) for network sparsification. Even though Lipschitz continuous penalties such as ℓ1 and transformed-ℓ1 [2,8] are almost everywhere differentiable, the splitting approach [3] is more effective for enforcing sparsity than directly placing a penalty function inside the stochastic gradient descent (SGD) algorithm. The RVSM is also much simpler than the statistical ℓ0 regularization approach in [4]. A systematic comparison with [4] will be conducted elsewhere.

The rest of the paper is organized as follows. In Sect. 2, we review RVSM for the ℓ0, transformed-ℓ1, and ℓ1 penalties and present a convergence theorem. A new critical point condition is introduced for the limit. We apply RVSM to CNNs for multi-scale curve classification. In Sect. 3, we describe our data set, CNN architecture and training, and the CNN performance in terms of network accuracy and sparsity. We compare weight distributions of sparse and non-sparse networks. Concluding remarks are in Sect. 4.

2 Sparse Neural Network Training Algorithm


When training neural networks, one minimizes a penalized objective function of the form:

  l(w) := f(w) + λ P(w),

where f(w) is a standard loss function in neural network models, such as cross entropy [7], and P(w) is a penalty function. In SGD, the expected loss f is replaced by an empirical loss over batches of training samples [7]. In this section, we consider the expected loss function f, which has better regularity than the empirical loss functions [6] and is more conducive to analysis. In the actual training, SGD and the sample-averaged empirical loss function are implemented. The standard penalty is the ℓ2 norm, also known as weight decay. However, the ℓ2 penalty cannot reduce the number of redundant parameters, resulting in a network with on the order of millions of nonzero weights. Thus we turn to the ℓ0 penalty, which produces zero weights during training [4], but leads to a non-convex discontinuous optimization problem. In [4], a statistical approach is proposed to regularize ℓ0. In this paper, we utilize the Relaxed Variable Splitting Method (RVSM) studied in [3] for a neural network regression problem. RVSM is much simpler to state and implement than [4]. To this end, let us consider the following objective function for a parameter β > 0:

  L_β(u, w) = f(w) + λ P(u) + (β/2) ||w − u||_2^2.
Let η be the learning rate. We minimize Lβ (u, w) with the RVSM algorithm
below where the u step is thresholding and the w step is gradient descent followed
by a normalization:

Algorithm 1. RVSM
  Initialize u^0, w^0 randomly.
  while not converged do
    u^{t+1} ← argmin_u L_β(u, w^t)
    ŵ^{t+1} ← w^t − η∇f(w^t) − ηβ(w^t − u^{t+1})
    w^{t+1} ← ŵ^{t+1} / ||ŵ^{t+1}||
  end
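A minimal Python/NumPy sketch of this iteration is given below; it is our illustration, not the code of [3]. The gradient of the loss and the thresholding map (the u-update, e.g. the hard, soft or TL1 thresholding operators given later in this section, with parameter λ/β) are supplied by the caller, and the parameter names are ours.

import numpy as np

def rvsm(w0, grad_f, threshold, eta=0.01, beta=1.0, n_iter=1000):
    """RVSM iteration: u-update by thresholding, w-update by a gradient step
    plus the coupling term, then renormalization of w (cf. Algorithm 1)."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(n_iter):
        u = threshold(w)                               # u^{t+1} = argmin_u L_beta(u, w^t)
        w_hat = w - eta * grad_f(w) - eta * beta * (w - u)
        w = w_hat / np.linalg.norm(w_hat)              # normalization step
    return threshold(w), w                             # approximation of (u-bar, w-bar)

With, for instance, threshold = lambda w: hard_threshold(w, lam / beta) (see the operators sketched later in this section), this corresponds to the ℓ0 variant of the scheme.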

The main theorem of [3] guarantees the convergence of the RVSM algorithm under some conditions on the parameters (λ, β, η) and initial weights, in the case of a one-convolution-layer network and Gaussian input data. The latter conditions are used to prove that the loss function f has a Lipschitz gradient away from the origin. Assuming that the Lipschitz gradient condition holds for f, we adapt the main result of [3] into:

Theorem 1. Suppose that f is bounded from below and satisfies the Lipschitz gradient inequalities ||∇f(x) − ∇f(y)|| ≤ L_1 ||x − y|| and |f(x) − f(y) − ⟨∇f(x), x − y⟩| ≤ L_2 ||x − y||^2, for all (x, y) with ||x|| ≥ δ_0, ||y|| ≥ δ_0, for some positive constants δ_0, L_1, and L_2. Then there exists a positive constant η_0 = η_0(δ_0, L_1, L_2, β) ∈ (0, 1) so that if η < η_0, the Lagrangian function L_β(u^t, w^t) is descending and converging in t, with (u^t, w^t) of the RVSM algorithm satisfying ||(u^{t+1}, w^{t+1}) − (u^t, w^t)|| → 0 as t → +∞, and subsequentially approaching a limit point (ū, w̄).
For the ℓ0 penalty, our objective function (the Lagrangian) becomes

  L_β(u, w) = f(w) + λ||u||_0 + (β/2) ||w − u||_2^2.

In this case, we simply obtain

  u^{t+1} = argmin_u L_β(u, w^t) = H_{λ/β}(w^t),

where H_γ is the hard-thresholding operator [1], acting on each component as

  H_γ(w_i) = 0 if |w_i| ≤ √(2γ),  w_i if |w_i| > √(2γ).   (1)

For the ℓ1 case, it is also clear that

  u^{t+1} = S_{λ/β}(w^t),

where S_γ is the soft-thresholding operator [2]:

  S_γ(w_i) = w_i + γ if w_i ≤ −γ,  0 if |w_i| < γ,  w_i − γ if w_i ≥ γ.   (2)

We also consider the transformed-ℓ1 (TL1) penalty [8], which nicely interpolates the ℓ0 and ℓ1 penalties. It applies

  ρ_a(x) = (a + 1)|x| / (a + |x|)

to each component of a vector, where a is a positive parameter. It is clear that

  lim_{a→0+} ρ_a(x) = I_{x≠0},   lim_{a→+∞} ρ_a(x) = |x|.

By solving the problem with the TL1 penalty, we also get a thresholding operator T_{a,γ} in closed form [8]:

  T_{a,γ}(w_i) = 0 if |w_i| ≤ t,  g_{a,γ}(w_i) if |w_i| > t,   (3)

where

  g_{a,γ}(x) = sgn(x) ( (2/3)(a + |x|) cos(φ(x)/3) − 2a/3 + |x|/3 )

and φ(x) = arccos( 1 − 27γa(a+1) / (2(a+|x|)^3) ). Here the parameter t depends on γ as:

  t = γ(a+1)/a if γ ≤ a^2/(2(a+1)),   t = √(2γ(a+1)) − a/2 if γ > a^2/(2(a+1)).   (4)
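For concreteness, the three thresholding operators can be sketched in NumPy as follows; this is our transcription of formulas (1)-(4), with the TL1 branch following the threshold t as reconstructed in (4) and the arccos argument clipped for numerical safety.

import numpy as np

def hard_threshold(w, gamma):
    """H_gamma, Eq. (1): zero out entries with |w_i| <= sqrt(2*gamma)."""
    return np.where(np.abs(w) > np.sqrt(2.0 * gamma), w, 0.0)

def soft_threshold(w, gamma):
    """S_gamma, Eq. (2): shrink every entry toward zero by gamma."""
    return np.sign(w) * np.maximum(np.abs(w) - gamma, 0.0)

def tl1_threshold(w, gamma, a):
    """T_{a,gamma}, Eqs. (3)-(4): closed-form thresholding for the TL1 penalty."""
    if gamma <= a ** 2 / (2.0 * (a + 1.0)):
        t = gamma * (a + 1.0) / a
    else:
        t = np.sqrt(2.0 * gamma * (a + 1.0)) - a / 2.0
    absw = np.abs(w)
    phi = np.arccos(np.clip(1.0 - 27.0 * gamma * a * (a + 1.0)
                            / (2.0 * (a + absw) ** 3), -1.0, 1.0))
    g = np.sign(w) * (2.0 / 3.0 * (a + absw) * np.cos(phi / 3.0)
                      - 2.0 * a / 3.0 + absw / 3.0)
    return np.where(absw > t, g, 0.0)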

Remark 1. It follows from the theorem above that the limit point (ū, w̄) satisfies the equilibrium equations for the ℓ0, ℓ1 and transformed-ℓ1 penalties, respectively, as:

  ū = H_{λ/β}(w̄), or S_{λ/β}(w̄), or T_{a,λ/β}(w̄);
  ∇f(w̄) = −β (w̄ − ū).   (5)

The system (5) serves as a novel "critical point condition". This is particularly useful in the ℓ0 case, where the Lagrangian function L_β(u, w) is discontinuous in u.

3 Experimental Results
We apply the RVSM algorithm to convolutional neural networks to see how it produces a sparse network. After training, w̄ is sparse, with small components removed, and it serves as the network weight for inference. In the following experiment, we consider a convolutional neural network of 3 layers and a data set of 100 × 100 binary images. What we care about is the percentage of the weights which are zero after training the sparse network. Many of the algorithms can result in a sparsity of over 90%, which means that less than 10% of the parameters contribute to the model. This makes our model far more efficient than the original one without regularization.

Fig. 1. CNN architecture in this study.
In order to find out how the weights are distributed in each layer, we go through the structure of the network. Figure 1 shows the number of nodes in each layer, from which we can simply calculate the number of weights needed to connect the nodes.¹ We apply 32 filters of size 3 × 3 to the initial image to get the first convolutional layer, which results in 32 × 3 × 3 = 288 weights. Similarly, each of the second and third convolutional layers contains 32 × 32 × 3 × 3 = 9216 weights, since we apply 32 filters of size 3 × 3 again. After each convolutional layer, we add one max pooling layer with a 2 × 2 filter and a stride of 2. The dimension of each image is not changed by each convolution, since we apply padding, but it is halved in both width and height after max pooling because of the stride of 2. Thus the dimension of the image is reduced from 100 × 100 to 50 × 50, to 25 × 25 and finally to 13 × 13. This produces 13 × 13 × 32 × 128 = 692224 weights when constructing a dense layer of 128 nodes. Finally, 128 × 2 = 256 weights are used to connect the dense layer to the output layer of 2 nodes, since our goal is to classify the images into two categories. From the above discussion, we notice that 97.3% of the weights are concentrated in the dense layer. We will see that most of them contribute nothing to the model after we train the sparse network.
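A minimal Keras sketch of such an architecture is given below. It reflects our reading of Fig. 1 and of the weight counts above; the framework choice and layer options such as 'same' padding are our assumptions, not specified in the text, and the weight counts in the comments exclude bias terms.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), padding="same", activation="relu",
                  input_shape=(100, 100, 1)),                   # 32 * 3 * 3 = 288 weights
    layers.MaxPooling2D((2, 2), padding="same"),                 # 100 -> 50
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),  # 9216 weights
    layers.MaxPooling2D((2, 2), padding="same"),                 # 50 -> 25
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),  # 9216 weights
    layers.MaxPooling2D((2, 2), padding="same"),                 # 25 -> 13
    layers.Flatten(),                                            # 13 * 13 * 32 = 5408 features
    layers.Dense(128, activation="relu"),                        # 5408 * 128 = 692224 weights
    layers.Dense(2, activation="softmax"),                       # 128 * 2 = 256 weights
])
model.summary()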
The first data set we use consists of images of the handwritten alphabet by Parkinson's disease (PD) patients and of the normal handwritten alphabet. We know that many PD patients suffer from tremors in their daily life and work. One remarkable feature is that the words they write can be much shakier than normal, which can be used to distinguish a PD patient during diagnosis. Figure 2² shows one real example of a handwritten sentence by a PD patient.

¹ When generating the figure, we used a tool by Alex Lenail available at http://alexlenail.me/NN-SVG/LeNet.html.
² https://en.wikipedia.org/wiki/Micrographia_(handwriting).

Fig. 2. Handwritten sentence by a PD patient.

From our point of view, these two writing styles – normal vs. shaky – can
be treated as two fonts. There is one Parkinson’s font available on the internet,3
which contains the whole alphabet of the 52 uppercase and lowercase letters. We
simulate a training set of 5,000 observations and a test set of 1,000 observations
by adding some rotations, affine transformations and elastic distortions [5]. As
we have mentioned, this is a data set of 100 × 100 binary images, of which some
samples are shown in Fig. 3. Though our model is used to distinguish the letters
written by a Parkinson’s disease patient in this single experiment, it can be
simply applied to classify any other fonts.

Fig. 3. Sample images of PD patients’ handwriting.

As most of the redundancy appears in the dense layer, we apply the thresholding step of the algorithm to the weights in the dense layer only. This is because if we used the same λ and β in all the layers, the proportion of zero weights in the convolutional layers might be high, where zero weights can indeed degrade the model. Compared to the dense layer of about 700,000 weights, there is not much freedom to modify the convolutional layers, which have on the order of 10,000 weights; too much sparsity there leads to a sizable loss of accuracy.
In our models, we have the freedom to set the thresholding parameters, namely β, λ and a. A higher threshold usually means more sparsity, since more weights are forced to zero by the threshold. From formulas (1) and (2) for the ℓ0 and ℓ1 penalties, it is clear that the larger λ is and the smaller β is, the higher the threshold parameter γ = λ/β will be. Given the same thresholding parameter γ, the ℓ0 model may result in a sparser model than ℓ1, since its threshold is on the order of the square root of γ, which is higher for small γ. From formulas (3) and (4) for the TL1 penalty, the smaller a is, the higher the threshold is. As discussed in the previous section, when a goes to infinity, TL1 becomes ℓ1; when a goes to 0, it becomes ℓ0. So as to achieve more sparsity, we may choose a small a.

³ https://www.dafont.com/parkinsons.font.
Our algorithm converges quickly after a few iterations. In most of the cases, it
obtains an accuracy of 95% and a sparsity of 60% after 10 epochs. The accuracy
soon goes up to 98% within 20 epochs, while some models achieve a sparsity of
around 90% eventually. Figure 4 shows the convergence of the training algorithm.

Fig. 4. Training and testing loss functions vs. epochs.

Table 1 shows our results on sparsity and testing accuracy. It confirms what we discussed about the thresholding parameter: when the threshold grows higher, the sparsity grows correspondingly. When a is less than 0.1, we achieve a sparsity of at least 86%, while the accuracy remains high. The key point to notice is that these sparse networks achieve almost the same, or even better, accuracy than the non-sparse model. Thus we affirm that around 90% of the parameters are redundant, as they hardly contribute to the accuracy of the model.

Another data set we consider consists of images of normal vs. shaky planar shapes like triangles and quadrangles (not necessarily convex). It can be viewed as another demonstration of PD patients' handwriting, as what they draw is somewhat shaky, like the letters they write. This data set of 100 × 100 binary images is simulated by adding random noise to the normal planar shapes. Figure 5 shows some sample images of our shapes. The results on this data set are similar to those on the first data set, as shown in Table 2. So RVSM also achieves high accuracy and sparsity on multi-scale planar curve data.
More properties of our sparse networks are as follows. First, there is a remarkable difference in the distributions of the weights between the sparse and non-sparse models. For the sparse model, most of the weights are zero, while the rest are very close to zero, so its distribution looks like a vertical line plus some noise on an interval close to zero. Our example of a non-sparse model also has a peak at zero; however, very few weights are exactly zero. Many of them are merely close to zero, while a large proportion are far away from zero. What's more, the distribution of this non-sparse model seems to be bell-shaped. The distributions are shown in Fig. 6, where the weights are normalized for better viewing.

Fig. 5. Normal vs. shaky shapes.

Fig. 6. Distribution of weights: Sparse vs. Non-sparse networks

We also notice that RVSM performs much better than applying SGD directly to the TL1-penalized loss function. As shown in Table 3, most of the normalized weights in the SGD model are distributed between 10^-5 and 10^-3. There is no apparent criterion to judge whether a weight of order 10^-4 should be set to zero or does contribute to the network. However, for the RVSM method with a = 0.01, it is clear that 8.7% of the weights are greater than 10^-4 and 84.9% of the weights are less than 10^-10. There is a significant gap between the two scales 10^-4 and 10^-10, which makes it reasonable to set all the weights less than 10^-10 to zero. This leads to a network of 84.9% sparsity. Another point worth mentioning is that applying SGD directly to the penalized loss function may hurt the accuracy a lot at a = 0.01, resulting in 96.7% accuracy for the model. This is because when a is small, the penalty term behaves like ℓ0, which renders the objective function nearly singular. RVSM resolves this issue by making the penalty implicit in a thresholding process, which gives an accuracy of 99.5%.
Table 1. Testing sparsity and accuracy for the data of alphabets.

  λ       β    a     Penalty  Sparsity (%)  Accuracy (%)
  0.0005  0.1  0     ℓ0       86.1          99.4
               0.01  TL1      87.6          99.0
               0.1   TL1      85.8          99.7
               1     TL1      78.1          99.3
               100   TL1      82.0          99.3
               ∞     ℓ1       76.5          99.0

Table 2. Testing sparsity and accuracy for the data of planar shapes.

  λ       β    a     Penalty  Sparsity (%)  Accuracy (%)
  0.0005  0.1  0     ℓ0       90.2          99.9
               0.01  TL1      83.5          99.1
               0.1   TL1      87.6          99.8
               1     TL1      74.9          99.9
               100   TL1      75.0          99.9
               ∞     ℓ1       74.6          99.6

Table 3. Sparsity and accuracy: RVSM vs. direct SGD for the TL1 penalty.

  a     Algorithm  Sparsity (%) at the 10^-n scale                Accuracy (%)
                   10^-2   10^-3   10^-4   10^-5   10^-10
  0.01  RVSM       99.7    96.0    91.3    88.6    84.9           99.5
  0.01  SGD        99.9    99.9    45.9    5.44    10^-5          96.7
  100   RVSM       99.9    97.5    92.7    88.5    80.3           99.3
  100   SGD        99.9    99.7    48.1    6.68    10^-5          99.0

Table 4. Number of sign changes and relative % in kernels of convolutional layers.

  a     Layer 1     Layer 2       Layer 3
  0.01  72 (25.0%)  1120 (12.2%)  769 (8.34%)
  1     45 (15.6%)  1133 (12.3%)  784 (8.51%)
  100   35 (12.2%)  1001 (10.9%)  995 (10.8%)

Table 4 shows another interesting phenomenon. Since the weights are randomly initialized with mean zero, there is a roughly even split of plus/minus signs in all layers. At the end of training, we counted the number of sign changes in the kernel of each convolutional layer, and found that more weights changed signs in the first convolutional layer than in the next two layers. This is consistent with the network filters being structured towards low-pass in depth after training.

4 Conclusions
In this paper, we have applied the RVSM algorithm to learn sparse neural networks. We have achieved an accuracy of 99% and a sparsity of 87% when training CNNs on a data set consisting of synthetic handwritten letters and planar curves simulating PD patients, and normal handwriting. We have also discussed the tuning of the thresholding parameters and verified that a higher threshold produces higher sparsity. Moreover, our experiments show that RVSM outperforms the direct application of SGD to the penalized loss function, in both sparsity and accuracy. RVSM generates a significant gap between the weights of large scale and small scale, which acts as an indicator of sparsity. In future work, we plan to explore a wider variety of PD patient data and more refined multi-class classification tasks.

Acknowledgements. The work was partially supported by NSF grant IIS-1632935. The authors would like to thank Profs. Xiang Gao and Wenrui Hao at Penn State University for helpful discussions of handwriting and drawings in neuropsychological exams and diagnosis.

References
1. Blumensath, T., Davies, M.: Iterative thresholding for sparse approximations. J.
Fourier Anal. Appl. 14(5–6), 629–654 (2008)
2. Daubechies, I., Defrise, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57(11), 1413–1457 (2004)
3. Dinh, T., Xin, J.: Convergence of a relaxed variable splitting method for learning sparse neural networks via ℓ1, ℓ0, and transformed-ℓ1 penalties (2018). arXiv:1812.05719
4. Louizos, C., Welling, M., Kingma, D.: Learning sparse neural networks through ℓ0 regularization. In: ICLR (2018). arXiv:1712.01312v2
5. Simard, P., Steinkraus, D., Platt, J.: Best practices for convolutional neural net-
works applied to visual document analysis. In: Seventh International Conference on
Document Analysis and Recognition, pp. 958–963. IEEE (2003)
6. Yin, P., Zhang, S., Lyu, J., Osher, S., Qi, Y-Y., Xin, J.: Blended coarse gradient
descent for full quantization of deep neural networks. Res. Math. Sci. 6(1), 14 (2019).
arXiv:1808.05240
7. Yu, D., Deng, L.: Automatic Speech Recognition: A Deep Learning Approach. Sig-
nals and Communication Technology. Springer, New York (2015)
8. Zhang, S., Xin, J.: Minimization of transformed l1 penalty: closed form representa-
tion and iterative thresholding algorithms. Comm. Math. Sci. 15(2), 511–537 (2017)
Pattern Recognition with Using Effective
Algorithms and Methods of Computer Vision
Library

S. B. Mukhanov(✉) and Raissa Uskenbayeva

International Information Technology University, Almaty, Republic of Kazakhstan
kvant.sam@gmail.com, ruskenbayeva@iitu.kz

Abstract. This article discusses the use of effective recognition algorithms from the OpenCV computer vision library. It describes the Harris corner detection algorithm and the SURF (Speeded-Up Robust Features), SIFT (Scale-Invariant Feature Transform) and FAST (Features from Accelerated Segment Test) feature search algorithms, as well as a comparative analysis of their performance and recognition quality. Contour analysis is also used for pattern recognition, together with mathematical models and formulas for image binarization.

Keywords: Computer vision · Pattern recognition · SURF · SIFT · FAST · Contour analysis

1 Introduction

Nowadays pattern recognition has become an important topic in the science of computer vision. There are many algorithms, technologies and ready-made templates that are actively used in all areas related to computer and information technology. Machine learning and neural networks are the basic and most important elements of computer vision: without these technologies it would be impossible to teach a machine to learn, understand and effectively use mathematical models translated into machine code.

The main difference between a machine and a human is that a program cannot recognize images abstractly, since it uses only mathematical calculations, while the human eye, as the most powerful graphics processor, can recognize objects in the whole environment without fail.

Machine vision has mainly been used in industry, for example in the creation of robots and in special technologies for obtaining images of the features of real-world objects. It particularly focuses on manufacturing, where visual inspection and measurement systems are widely used [1]. The technology of image sensors and control theory is related to the processing of video data for controlling a robot, and the processing of the obtained data is carried out by the controlling software or hardware in real time. Image processing and image analysis mainly focus on working with a two-dimensional image, which means converting one image into

another: for example, pixel-by-pixel operations of contrast enhancement, edge detection, noise removal, or geometric transformations such as image rotation. These operations show that image processing and analysis act independently of the content of the images themselves.
Basic Concepts
The main tasks of computer (machine) vision are listed below:
• Recognition
• Identification
• Detection
• Text recognizing
• Restoration of three-dimensional forms of two-dimensional images
• Motion Evaluation
• Restore Scene
• Image Recovery
• Selection on images of structures of a certain type, segmentation of images
• Optical flow analysis
The classical issue in the science of computer vision, image processing and machine vision lies in determining whether the image or video data contains some characteristic object, feature or activity. This task can be reliably and easily solved by a human, but it has not yet been solved satisfactorily in computer vision in the general case: arbitrary objects in arbitrary situations [2]. One or several previously defined or learned objects or classes of objects can be recognized (usually together with their two-dimensional position in the image or three-dimensional position in the scene). In identification, an individual instance of an object belonging to a class is recognized, for example a specific human face, fingerprint, or car.

Text Recognition
Search for images by content: finding all the images in a large set of images that have content defined in various ways. Position evaluation: determining the position or orientation of a particular object relative to the camera.

Image Recovery
Optical flow analysis: finding the movement of pixels between two images. Several tasks are related to motion estimation, in which a sequence of images (video data) is analyzed in order to estimate the speed of each point of the image or 3D scene; for example, determining the three-dimensional camera movement, or tracking, i.e. following the movement of an object (for example, cars or people).
Image Processing Methods
• Pixel count
• Binarization
• Segmentation
• Barcode reading
• Optical character recognition

• Measurement
• Edge detection
• Pattern matching
Pixel count - counting the number of light or dark pixels. With the help of a pixel counter, the user can select a rectangular area on the screen in a place of interest, for example where he expects to see the faces of passing people [3]. The camera will immediately respond by providing data about the number of pixels along the sides of the rectangle. The pixel counter allows you to quickly check whether the mounted camera meets regulatory or customer requirements regarding pixel resolution, for example for the recognition of faces of people entering doors monitored by the camera, or for license plate recognition.

2 Binarization, Segmentation and Machine Learning

Binarization converts a grayscale image to a binary one (white and black pixels). The value of each pixel is conventionally encoded as "0" or "1": the "0" value is conventionally referred to as background, while the "1" value is foreground. Often, when storing digital binary images, a bitmap is used, where one bit of information represents one pixel. Also, especially during the early stages of the technology's development, the two available colors were black and white, although this is not mandatory.

A simple example of image binarization is the following. The cvAdaptiveThreshold function converts a grayscale image into a monochrome image according to the formulas below [7]:

  CV_THRESH_BINARY:     dst(x, y) = max_value if src(x, y) > T(x, y), 0 otherwise   (1)
  CV_THRESH_BINARY_INV: dst(x, y) = 0 if src(x, y) > T(x, y), max_value otherwise   (2)

where T(x, y) is a threshold calculated individually for each pixel.
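As a hedged illustration with OpenCV's Python API (our example, not the code used in this work), adaptive binarization corresponding to formulas (1) and (2) can be performed as follows; the file names are placeholders.

import cv2

img = cv2.imread("letters.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder input file
# THRESH_BINARY corresponds to formula (1); THRESH_BINARY_INV to formula (2).
binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 11, 2)
binary_inv = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 11, 2)
cv2.imwrite("letters_binary.png", binary)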


Segmentation - used for searching and (or) counting of details.
The goal of segmentation is to simplify and/or change the representation of the
image so that it is easier to analyze. Image segmentation is commonly used to highlight
objects and borders (lines, curves, etc.) on images. To be more precise, image seg-
mentation is the process of assigning specific labels to each pixel of an image, so that
pixels with the same labels have common visual characteristics. The result of image
segmentation is a set of segments that together cover the entire image, or a set of
contours, extracted from the image [6].
Machine learning continues to penetrate industries outside the internet sector. During the Data & Science conference "The World through the Eyes of Robots", Alexander Belugin from the "Tsifra" company spoke about the achievements, challenges and urgent tasks in this area. The introduction of technologies such as computer vision requires serialization and a product-based approach, which allows reducing the cost of individual implementations [3].

Today we mostly speak about computer vision. There is also the term "machine vision", which refers to the hardware side. There are video cameras similar to those used for video surveillance, webcams used for communications, and special cameras used in industry. The latter differ in features such as the absence of a normal Ethernet port, the use of special protocols, and the ability to transmit, for example, 750 frames per second, not in burst mode but continuously and without compression. There are also special cameras whose sensitivity lies in a different range than that optically visible to the eye.

2.1 Harris Corner Detection


Harris Corner Detection is a corner detection operator that is commonly used in
computer vision algorithms to highlight angles and determine image characteristics.
Harris’s multiscale operator describes matrix calculations:

1 ðx2 þ y2 Þ=2t
gðx; y; tÞ ¼ e ð3Þ
2pt

The following formula is used to determine Harris’ multi-scale corner


measurement:

Mc ðx; y; t; sÞ ¼ detðlðx; y; t; sÞÞ  k trace2 ðlðx; y; t; sÞÞ ð4Þ

According to the source below, we use the differential equation formula to change
the brightness and matrix value (Fig. 1):

Fig. 1. The model of Harris corner detector [7]



Below, we present experimental corner detection data obtained with the Harris corner recognition algorithm (Fig. 2):

Fig. 2. Image loading of a JPEG picture, conversion to grayscale (binarization) and Harris corner detection

In the developed program, the same principle of corner determination is used, except that the maximum is computed over the eigenvalues. The maximum element and the values close to it are the corner points. It was experimentally determined that the threshold value at which the best result is achieved is 70% of the maximum value. The developed function has two formal parameters: the first one is the name of the jpg file, the second is the threshold value [1]. The result of the execution is the coordinates of the corners. As already mentioned, the Harris algorithm is one of the earliest detectors, but it has a significant drawback: a great deal of computational complexity. To increase the speed, the original image is compressed, the coordinates of the corners are calculated, and then the image is restored to its original size.
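A minimal OpenCV Python sketch of Harris corner detection with the 70% threshold mentioned above is given below; it is our illustration (the program described in the text is not reproduced here), and the file name and detector parameters are assumptions.

import cv2
import numpy as np

img = cv2.imread("sample.jpg")                        # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
response = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)   # blockSize=2, ksize=3, k=0.04
threshold = 0.7 * response.max()                      # 70% of the maximum response
corners = np.argwhere(response > threshold)           # (row, col) corner coordinates
img[response > threshold] = (0, 0, 255)               # mark detected corners in red
cv2.imwrite("sample_corners.jpg", img)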

3 Experimental Results

Exploring the properties of these algorithms, we obtained experimental data and carried out a comparative analysis. We concluded that the three algorithms for detecting local features in an image produce different results. The fastest algorithm is the FAST method, while the SURF algorithm proved to be the most effective method for processing local features. By running the program, we obtain data showing clearly how many features were found in the image, as well as how many key points we received as a result. Below you will see the test results of all three algorithms (Fig. 3).

Fig. 3. The results of testing the recognition methods SURF, FAST and SIFT
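A sketch of how such a comparison can be reproduced is shown below: it simply counts the keypoints returned by each detector. The exact constructors depend on the installed OpenCV build (SURF requires the contrib module), so they are assumptions about the environment rather than the authors' code.

```python
# Sketch: count the local features found by SIFT, FAST and SURF on one image.
# Assumes an OpenCV build where SIFT_create is available and SURF is provided
# by the contrib module (xfeatures2d); adjust to the installed version.
import cv2

def count_keypoints(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    detectors = {
        "SIFT": cv2.SIFT_create(),
        "FAST": cv2.FastFeatureDetector_create(),
        "SURF": cv2.xfeatures2d.SURF_create(hessianThreshold=400),
    }
    return {name: len(det.detect(gray, None)) for name, det in detectors.items()}

if __name__ == "__main__":
    for name, n in count_keypoints("test.jpg").items():
        print(f"{name}: {n} keypoints found")
```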

3.1 Algorithms SIFT, FAST and SURF


SIFT is one of the modern methods for detecting and locating particular points in
images. When using this approach, a number of recognition problems become apparent.
The first is scale: each image has a different scale, which means that objects that look
the same to us in fact have different sizes in different pictures.
The second problem is location: the object of interest may be located in completely
different places.
The third problem is interference and background. Objects that we visually perceive
as separate are not highlighted in the pictures, that is, they have to be distinguished from
other objects. Moreover, an image is not perfect and may be subject to all kinds of
interference and distortion.
The fourth problem is projection, rotation and the geometric plane. An image is
only a two-dimensional projection of the three-dimensional world. Therefore, the
angle of rotation of the object and the change in viewing angle ultimately affect the
two-dimensional projection, i.e. the image. The same object can produce a completely
different picture depending on the angle of rotation or the distance to it. For further
details, see the various articles on the construction of SIFT descriptors and the image
matching problem [4] (Fig. 4).
FAST is a method of corner detection which can be used to extract characteristic
points that are then used to track and compare objects in many computer vision tasks.
The FAST corner detector was originally developed by Edward Rosten and Tom
Drummond [1]. The most promising advantage of FAST corner detectors is their
computational efficiency. True to its name, it is fast, and really faster than many
other well-known feature extraction methods, such as the Difference of Gaussians used
by SIFT, SUSAN, and Harris. Moreover, when a machine learning method is applied,
better performance can be achieved, taking less time and fewer computing resources.
The FAST corner detector is very suitable for real-time video processing applications
due to its high performance [2].

Fig. 4. The number of points found by each of these methods

In the course of implementing and researching these algorithms, a number of
shortcomings were revealed: it is not so easy to work with the OpenCV library. Quite a
lot of time is taken up by installation and configuration, and this even depends on the
version of the library used. In newer versions, a number of names and functions were
changed, replacing the old ones, so these algorithms worked on the old versions but
produced errors and did not work on the new ones.
Below you will see the data obtained as a result of testing the SURF algorithm
(Figs. 5 and 6).

Fig. 5. Image recognition and search for objects method SURF.



Fig. 6. Image recognition and search for objects method SURF. Invariance of this method

SURF is a reliable detector for recognizing salient locations or local features. It
consists in finding well-distinguishable points (“characteristic points”, “features”,
“salient or pronounced points”) and detecting transformations that relate the found
points, verified by pixel comparison; in other words, SURF detects persistent image
features. The method was first introduced by Herbert Bay in 2006 and has been used in
computer vision tasks such as object recognition and 3D reconstruction. SURF is also
used to search for objects in pictures and to compare images [5].
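The invariance illustrated in Figs. 5 and 6 rests on matching SURF descriptors between two images. The sketch below shows one common way to do this (brute-force matching with a ratio test), under the assumption that the contrib SURF implementation is available; it is an illustration, not the authors' exact code.

```python
# Sketch: find an object in a scene by matching SURF descriptors.
# Requires the opencv-contrib build (cv2.xfeatures2d); names are illustrative.
import cv2

def surf_matches(object_path, scene_path, ratio=0.75):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    img1 = cv2.imread(object_path, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(scene_path, cv2.IMREAD_GRAYSCALE)

    kp1, des1 = surf.detectAndCompute(img1, None)   # keypoints + descriptors
    kp2, des2 = surf.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)            # SURF descriptors are float
    # Lowe's ratio test keeps only well-distinguishable correspondences
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < ratio * n.distance]
    return kp1, kp2, good

if __name__ == "__main__":
    _, _, good = surf_matches("object.jpg", "scene.jpg")
    print(f"{len(good)} reliable SURF matches found")
```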

4 Contour Analysis

Contour analysis is one way to recognize images. Developers can easily extract the
contours of images and manipulate them using the OpenCV library. The cvFindContours
function helps to find the outlines of a graphic image.
With the help of the cvFindContours function, you can find and determine the
number of contours of a monochrome image. The function fills in the first_contour
pointer, which points to the first outer contour; this pointer may remain empty if no
contours are detected (for example, if the picture is completely black). Using the h_next
and v_next links, it is possible to reach the other contours. OpenCV has a CvSeq
structure that provides an interface for all dynamic structures and describes a sequence
of patterns [4].
Small contours often interfere with the recognition of images. In order to solve this
problem, it is necessary to run through the whole chain of contours, checking the size
of each contour and discarding the small ones within the loop [7].
One of the characteristics of a contour, computed as a sum over all the pixels of the
contour, is the moment. It has the following definition:

m_{pq} = \sum_{i=1}^{n} I(x, y)\, x^p y^q   (5)

where p is the order of x and q is the order of y; the order is the power to which the
corresponding component is raised in the sum. Below we present the results of testing
the algorithm for obtaining the contours of an object, which are displayed in Fig. 7.
The contours of different objects are highlighted there; the contours of the red circle
are clearly expressed in each object.

Fig. 7. Contour recognition results or contour analysis
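The text above refers to the old C interface (cvFindContours, CvSeq); a minimal sketch with the modern Python interface, which filters out small contours and computes the moments m_pq of formula (5), could look as follows. Thresholds and file names are illustrative assumptions.

```python
# Sketch: contour analysis with the modern OpenCV Python API.
# Small contours are discarded by area, and the spatial moments m_pq of
# each remaining contour (formula (5)) are obtained from cv2.moments.
import cv2

def analyse_contours(path, min_area=100.0):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)  # monochrome image
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    results = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area:   # skip small, noisy contours
            continue
        m = cv2.moments(contour)                  # m["m00"], m["m10"], m["m01"], ...
        cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]  # contour centroid
        results.append(((cx, cy), m))
    return results

if __name__ == "__main__":
    for (cx, cy), _ in analyse_contours("objects.jpg"):
        print(f"contour centroid at ({cx:.1f}, {cy:.1f})")
```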

5 Conclusion

This paper has researched different types of pattern recognition algorithms, and the
detection of object contours has been enhanced. With the help of the basic Harris
corner recognition algorithm and methods for determining local features such as
SURF, SIFT and FAST, the resulting algorithm recognizes contours in every image.
One issue was the incomplete quality of the images; various problems such as blurring
or merging, as well as interference and damage in the pictures, were encountered. To
deal with these problems, the Harris method was used: the source image is loaded and
converted to grey scale (load source image, set detector parameters, detect corners).
The given results show the performance of the recognition algorithms. Contour
analysis was implemented on the basis of the mathematical model given in formula (5).
This algorithm works very efficiently, which can be observed in Fig. 7. The computer
vision technology (library) OpenCV helps to understand the problem area deeply, as
well as enabling flexible development in a non-trivial way. On the basis of machine
learning and neural networks, problems in the field of pattern recognition are solved
very effectively. Furthermore, these methods and algorithms can be optimized for more
accurate results. In the future, this work will use new technologies and programming
languages. For example, the Python programming language has rich machine learning
libraries, which allows a number of problem areas to be addressed with flexible but
effective development tools. We will not completely replace the C-based OpenCV
computer vision technology; however, we must also offer the latest technologies and
programming languages designed in this area.

References
1. Edward, R., Tom, D.: Machine learning for high speed corner detection. In: 9th European
Conference on Computer Vision, vol. 1, pp. 430–443 (2006)
2. Edward, R., Reid, P., Tom, D.: Faster and better: a machine learning approach to corner
detection. IEEE Trans. Pattern Anal. Mach. Intell. 32, 105–119 (2010)
3. Herbert, B., Andreas, E., Tinne, T., Luc, V.G.: SURF: speeded up robust features. Comput.
Vis. Image Underst. (CVIU) 110(3), 346–359 (2008)
4. Novikov, A.I., Sablina, V.A., Nikiforov, M.B., Loginov, A.A.: Contour analysis and image
superimposition task in computer vision system. In: Proceedings of 11th International
Conference on Pattern Recognition and Image Analysis: New Information Technologies
(PRIA-11-2013), vol. 1, pp. 282–285 (2013)
5. Ke, Y., Sukthankar, R.: PCA-SIFT: a more distinctive representation for local image
descriptors. In: CVPR, vol. 2, pp. 506–513 (2004)
6. Gary, B., Adrian, K.: Learning OpenCV 3 Computer Vision in C++, 1 edn. ISBN-13: 978-
1491937990. O’Reilly Media (2016)
7. Kruchinin, A.: Pattern Recognition with Using OpenCV Library. http://recog.ru (2013)
The Practice of Moving to Big Data on the Case
of the NoSQL Database, Clickhouse

Baktagul Imasheva1,3(&), Azamat Nakispekov2,3,


Andrey Sidelkovskaya3, and Ainur Sidelkovskiy3
1 International Information Technology University, Almaty A15M0F0, Kazakhstan
b.imasheva@iitu.kz, b.i@a2data.ai
2 German-Kazakh University, Almaty A26C7F8, Kazakhstan
3 JSC “A2 data”, Almaty A05A1D7, Kazakhstan

Abstract. In the modern world, every technology and user generates a large
amount of data, and each piece of data carries value to some degree. Therefore, the
concept of big data is actively developing, because the idea of big data is to generate
new value. Addressing big data is a challenging and time-demanding job that
needs a large computational infrastructure to ensure successful data processing,
storage, and analysis. This report is intended to show how one of the big data
storage technologies, Clickhouse, can replace the relational database Oracle. The
motivation of this paper is to obtain an understanding of the benefits and drawbacks
of a NoSQL database, in the case of Clickhouse, for supporting a huge amount of data.

Keywords: Big data · Big data value chain · Data storage · NoSQL · Clickhouse · Column database

1 Introduction

Big Data is a phenomenon defined as the rapid acceleration in the expanding volume,
high velocity, and diverse types of data. Big Data is often defined along three
dimensions such as volume, velocity, and variety [5]. All three specifications influence
the choice of data storage. A NoSQL database can be a solution for data storage.
Therefore, this raises the research question:
Can NoSQL DBMS Clickhouse serve as a data store layer for Big Data?
In order to answer this question, the following objectives are defined:
• to analyse available literature and to define terms such as “Big Data”, “Data Storage
layer” and data storage technologies
• to define a list of parameters for comparison
• to conduct the test regarding defined parameters and formulate research results
The aim of this paper is to concentrate on one of the big data storage technologies and
examine how this technology, Clickhouse, can replace the relational database Oracle.
In order to achieve this purpose, a literature review was done first. This part tried to
clarify the meaning of “Big Data”, the stages of the Big Data value chain, and the
data storage technologies. In order to compare Oracle and Clickhouse, comparison
parameters were defined in the methodology part. The following part shows the
result of the comparison.

2 Main Part

2.1 Defining Big Data


“Big Data” has stopped being an emerging field and has increasingly become a
buzzword. However, everyone has their own understanding of “Big Data”. Fifteen
percent of the data today is structured data, which can be effectively stored in a
relational database with columns and rows. The rest is unstructured data, which can
take the form of email, video, blogs, call centre conversations, social media, and logs.
The amount of information delivered by devices such as sensors, tablets, and mobile
phones is growing day by day. Social networking is also growing at an accelerated
speed, and it defines new ways of interaction among people [5].
This paper analyses the use of NoSQL databases as the storage layer for Big Data.
Before considering the NoSQL database as data storage for big data, the term “Big
Data” and stages of the big data chain should be defined. The literature review defines
what kind of data can be considered as big data and steps of the development of big
data. Also, this section covered different options for data storage systems.
Big Data is defined by the fast acceleration in the growing volume of high speed,
complexity, and differing sorts and source of data [5]. Different resources define “Big
Data” with the use of 3V’s, 4V’s or 6V’s.
The first mention of the 3Vs was by Douglas Laney in his paper “3-D data
management: controlling data volume, velocity, and variety”, where he noticed that,
due to the flow of e-commerce activities, data had grown along three dimensions:
volume, velocity, and variety. Volume represents the increasing amount of data.
Velocity represents how data arrives at the system. Variety shows the diversity of the
incoherent and incompatible data types and structures [1, 5]. The figure below
illustrates the dependency among volume, velocity, and variety [10].
Initially, big data was defined by the 3V model. Later, dimensions such as veracity,
variability, and low-value density were used to define big data [6]. IBM added another
attribute, “Veracity” [7], which refers to the unreliability of the data. SAS added an
additional dimension, “Variability”, the inconsistency in the rate of data flow [6].
“Low-value density” was also defined as an attribute of Big Data: data sometimes needs
additional steps or analysis to generate value [8]. To maximize business value,
Microsoft extended this list by adding Visibility, which supports the decision-making
process by providing a full picture of the data [9].
The characteristics of the 3Vs raise challenges in handling the scale, size, and
complexity of data [4]. According to Loukides, Big Data is when the data itself becomes
part of the problem and conventional techniques for dealing with it are exhausted [2].
Jacobs characterizes big data as data whose size forces us to look beyond established
methods [3].

Big Data Value Chain


To sum up, Big data is the volume, velocity, and/or variety of information assets that
require an unconventional way of processing that enhances decision-making, discovers
insight and optimise the process [1]. A common way to illustrate the process of value
generation is known as the Big Data value chain.
There are also several approaches in the Big Data value chain. Curry (2014) defined
five stages of Big Data analysis illustrated in the figure below (see Fig. 1). Data
Acquisition is the step of data gathering, filtering and cleaning before passing data to
Data Storage level. However, there are two more steps before Data Storage, such as
Data Analysis and Curation [11].

Fig. 1. The big data value chain [11].

Another resource refers to a big data value chain that adds value at each step of
delivering data. It is represented by seven phases: data generation, data collection, data
transmission, data pre-processing, data storage, data analysis and decision making [8].

Fig. 2. Alternative big data value chain [12].



Lehmann et al. (2016) defined four sequential phases: data generation, data acquisition,
data storage, and data analytics. The initial four layers of the reference structure
relate to the process steps of this Big Data value chain, while the last one delivers
valuable outcomes (see Fig. 2) [12].
Data Storage Layer
All three presented chains have the storage layer before the final step, value generation.
The perfect big data storage system would permit storage of an unlimited amount of
data, cope with high rates of inserts and reads, be flexible and effective in managing
different data types, and handle both structured and unstructured data. In addition, the
data in the storage layer should be encrypted [4].
Big data storage technologies should also address the 6Vs challenges and “do not
fall in the category of relational database systems” [4]. The relational database system
can address these challenges, but unconventional storage technologies such as
columnar stores or Hadoop Distributed File System (HDFS) can be effective and be
cheaper [4]. According to [8], RDBMS with plain SQL data analysis techniques
struggles to process the increasing amount of data, especially processing the unstruc-
tured data types. Because of the Atomicity, Consistency, Isolation, and Durability
(ACID) constraints, large data scaling in RDBMS is impossible and dealing with semi-
structured and unstructured data is impossible [8]. These restrictions of RDBMS led to
the concept of NoSQL. NoSQL, “schema-free” databases, supports unstructured data
and enable quick updates without rewrites. NoSQL covers document stores, key-value
stores, column stores, and graph databases. Data management and data storage
functionalities are separated in NoSQL databases, which permits data scalability [8].
For example, in a column-oriented database, “data from a given column are stored
together”, which allows flexible scaling, and “each row can have a different set of
columns that allow tables to be sparse” with no extra cost [10].
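A toy illustration of the difference between row-oriented and column-oriented layouts (hypothetical data, not tied to any particular DBMS) is sketched below: reading a single column touches far less data in the columnar layout.

```python
# Toy sketch: the same table in a row-oriented and a column-oriented layout.
# Scanning one column ("amount") visits every row object in the row layout,
# but only one contiguous list in the columnar layout.
rows = [                                   # row store: one record per row
    {"id": 1, "city": "Almaty", "amount": 120.0},
    {"id": 2, "city": "Astana", "amount": 80.5},
    {"id": 3, "city": "Almaty", "amount": 42.0},
]

columns = {                                # column store: one list per column
    "id":     [1, 2, 3],
    "city":   ["Almaty", "Astana", "Almaty"],
    "amount": [120.0, 80.5, 42.0],
}

# Aggregate a single column in both layouts.
total_row_layout = sum(r["amount"] for r in rows)   # must visit whole rows
total_col_layout = sum(columns["amount"])           # reads one column only

print(total_row_layout, total_col_layout)           # 242.5 242.5
```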
The big data storage systems should ensure a durable storage area and effective
access to the data. “The distributed storage systems for big data should consider factors
like consistency (C), availability (A) and partition tolerance (P)”. However, the CAP
theory claims that not all requirements can be implemented simultaneously in one
technology [8]. “Hadoop is good at managing fewer very large files”, but when there
are a lot of small files, this raises performance issues and requires additional ETL steps
to combine these small files [8, 13]. Hadoop also does not support a “query optimiser”,
which can lead to “an inefficient cost-based plan” [8]. NoSQL databases are created for
scalability, sometimes by sacrificing consistency [4]. This article aims to compare a
NoSQL database with a relational database, in the case of Clickhouse, introduced by the
Yandex company. As the object of the study, a project that migrated from Oracle to
Clickhouse to handle big data was chosen.

3 Methodology

To identify features and finding deviations, the list of indicators for comparison should
be determined. For complex comparison of the DBMS Clickhouse with object-
relational database management system articles with a comparative analysis were

analysed. These articles tried to identify the strengths and weaknesses of the software.
They, also, helped to determine the direction of the software research.
Since Clickhouse has been on the market since 2009, there are already comparative
analyses with other products such as Apache Spark. Alexander Rubin, in the article
“Clickhouse in a General Analytical Workload (Based on a Star Schema Benchmark)”,
compares three software products: MariaDB ColumnStore version 1.0.7 (based on
InfiniDB), the DBMS Clickhouse and Apache Spark [14]. The purpose of that research
is to show that a column DBMS is better in compression and in performance.
Compression is defined as effective use of the disk space, while performance is the
speed at which the software product performs a query. However, it is not enough to
consider just these two parameters in order to gain insights. In addition, that research
compared different types of databases, whereas Yishan Li and Sathiamoorthy
Manoharan analysed the same type of database, key-value, using the CRUD (Create,
Read, Update, Delete) model [15]. That research underlines that analysing databases
with respect to their main feature against their relatives helps to distinguish the fastest
and most optimized database [15]. Based on this, we also compare the optimized
function of a column-oriented DBMS, column-reading speed, in contrast with row
reading.
The company Altinity conducted a test to check Clickhouse stability on time series.
The Time Series Benchmark Suite was developed by InfluxDB engineers and modified
by the Timescale team [16]. Clickhouse was not designed for such tests, but it showed
good results using several nodes, and in some cases was even faster than TimescaleDB
and InfluxDB. This specific test showed that architectural features of the database, such
as vertical extensibility, should be considered as one of the parameters of the
comparative analysis. Moniruzzaman A. and Hossain S. performed research that
identified the architectural differences between various NoSQL databases [17].
Parameters such as sharding, replication, programming language, horizontal/vertical
extensibility, and query language were used as parameters of the analysis.
Roman Leventov compared the analytical models of the databases Clickhouse,
Apache Druid and Apache Pinot [18].
After an analysis of these sources, the list of indicators was formed. Table 1
summarises the parameters of the further research.

Table 1. List of indicators


Indicators group Indicators Type of indicators
Architectural Sharding opportunity Descriptive
Replication opportunity Descriptive
Programming language Descriptive
Horizontal/vertical extensibility Descriptive
Query language Descriptive
Query engine Speed of read Measuring (s)
Speed of insert Measuring (s)
Speed of update Measuring (s)
Speed of delete Measuring (s)
Speed of join Measuring (s)
Speed of count Measuring (s)
The main differences The method of reading data Descriptive
Type of table engines Descriptive
Software limitations Descriptive
Analytical aspects Descriptive

The architectural indicators will help to determine how optimal the current system is
for projects and how compatible it is with other products. The query engine indicators
show the main parameters of data processing. The last group presents the general
distinctions of a column database.

4 Results and Findings

For testing, the following parameters were determined:

• software parameters: the software version, description and release;
• hardware parameters: the OS version, the number of CPU cores, the RAM, and the
hard drive size and type;
• a description of the dataset (Table 2).

The compared software:

• Oracle 11g Release 2, version 11.2.0.1

• Clickhouse version 18.14.10

Table 2. Hardware differences


Indicator Value of clickhouse Value of oracle
OS RedHat version 7 Microsoft windows server 2012 Standard
CPU 12 core 8 core
RAM 128 Gb 20 Gb
Hard drive 4 Tb 19 Tb
Hard drive type HDD HDD

As the dataset, one of the systems was used whose Oracle dump file size is 1.4 Tb
(1.6 Tb after unzipping), while in Clickhouse it occupies 177 Gb.
Table 3 shows the result of the analysis.

Table 3. The result table

Indicators group | Indicators | Clickhouse | Oracle
Architectural | Sharding opportunity | Yes | Yes
Architectural | Replication opportunity | Yes | Yes
Architectural | Programming language | C++ | Java, C, C++
Architectural | Horizontal/vertical extensibility | Yes | Yes
Architectural | Query language | Modified SQL | SQL / PL SQL
Query engine | Speed of read | 7.96 s | 18226.225 s
Query engine | Speed of insert | 528.634 s | 6331.938 s
Query engine | Speed of update | 0.017 s | 11098.508 s
Query engine | Speed of truncate | 0.871 s | 0.914 s
Query engine | Speed of join | 2879.06 s | 356.53 s
Query engine | Speed of count | 0.187 s | 67.944 s
The main differences | The method of reading data | Column reading | Row reading
The main differences | Type of table engines | Log, merge tree | Nonclustered table, table partition, table cluster
The main differences | Software limitations | Date can't be less than 1970; does not support the multi-join; does not support procedures and functions | Datatype limits; logical database limits; process and runtime limits [20]
The main differences | Analytical aspects | Supports data sampling; approximate computation of aggregate functions such as quantiles, medians and number of unique values; by retrieving less data from the disk based on sample data it makes calculations faster; runs aggregation for a limited number of random keys, which allows getting an accurate result with fewer resources; supports the "Array" data type and functions, which allows easy managing of data without compromising performance [19] | Standard math functions and analytical functions by over partition

Based on the formed indicators, the DBMS Clickhouse is leading because of its
columnar architecture. Column reading is faster than row reading; this feature gives
higher performance and allows more data to be processed in a short time. The results of
testing the query engines are presented in Table 3. The DBMS Clickhouse also supports
sharding, which has a positive effect on performance, and its horizontal and vertical
scaling allows more data sources and users to be connected. Data replication is
supported through Apache ZooKeeper.
However, the DBMS Clickhouse has some limitations that were identified during
the exploitation phase. For example, it is impossible to record a date below 1970/01/01
in a “Date” or “DateTime” field; this limitation can be worked around by saving the
value as a “String”. Clickhouse does not support the multi-join, but this can be solved
by using subqueries. It is also necessary to mention the analytical module of the DBMS
Clickhouse, since Yandex positions Clickhouse as an analytical DBMS. The analytical
module allows more complex calculations than Oracle. For example, a query based on a
sample of the data provides an approximated result; in this case, proportionally less data
is retrieved from the disk. Clickhouse also has the additional data type “array”, which
provides high-level processing. However, Clickhouse does not support procedures and
functions, while Oracle has PL/SQL. In conclusion, Clickhouse supports almost all the
functions of Oracle, while at the same time being free and frequently updated. This
DBMS can be the optimal option for the data storage layer for Big Data.
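A sketch of how the read/count measurements of Table 3 can be reproduced is given below. It assumes the third-party clickhouse-driver and cx_Oracle Python packages; the hosts, credentials and table name are placeholders, not the actual test environment used for the reported figures.

```python
# Sketch of a timing harness for the "query engine" indicators of Table 3.
# clickhouse-driver and cx_Oracle are assumed to be installed; connection
# parameters and the table name are illustrative placeholders only.
import time
from clickhouse_driver import Client
import cx_Oracle

def timed(run_query, sql):
    start = time.perf_counter()
    run_query(sql)
    return time.perf_counter() - start

ch = Client(host="clickhouse-host")
ora = cx_Oracle.connect("user/password@oracle-host/orcl")

def run_clickhouse(sql):
    ch.execute(sql)

def run_oracle(sql):
    cur = ora.cursor()
    cur.execute(sql)
    cur.fetchall()

sql = "SELECT count(*) FROM sales"          # "speed of count" indicator
print("Clickhouse:", timed(run_clickhouse, sql), "s")
print("Oracle:    ", timed(run_oracle, sql), "s")
```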

5 Conclusion

To conclude, the NoSQL database was introduced as one of the technologies for storing
Big Data, and Clickhouse was chosen as one representative of the NoSQL databases.
The aim of this paper was to examine whether Clickhouse can replace the relational
database Oracle. To achieve this purpose, the following objectives were met:
• the existing literature was analysed and the main terms were defined;
• the list of parameters for comparison was set;
• the comparative analysis was conducted and the results were formulated.
As a result, the Clickhouse database concentrates on analytical processing of large-scope
datasets, offering increased scalability over commodity hardware. Clickhouse shows
the ability to accumulate and index arbitrarily big data while allowing a large number
of simultaneous user requests, and it has the ability for analytical data processing.
It should also be understood that the Clickhouse DBMS has not only architectural
features but also distinctive features such as running a query on a sample of the data to
get an approximate result and using aggregation for a limited number of random keys.
In order to achieve better performance, Clickhouse sacrifices the accuracy of
calculations; this is acceptable for an analytical database.

References
1. Laney, D.: 3-D Data Management: Controlling Data Volume, Velocity, and Variety. META
Group Res Note 6, Stamford (2001)
2. Loukides, M.: What is Data Science. O’Reilly Media (2010)
3. Jacobs, A.: The pathologies of big data. Commun. ACM 8(52) (2009)
4. Cavanillas, M., Curry, E., Wahlster, W.: New Horizons for a Data-Driven Economy: A
Roadmap for Usage and Exploitation of Big Data in Europe. Springer Open, Cham (2016)
5. TechAmerica Foundation.: Demystifying Big Data: A Practical Guide To Transforming The
Business of Government. TechAmerica Foundation, Washington (2012)
6. Gandomi, A., Haider, M.: Beyond the hype: big data concepts, methods, and analytics. Int.
J. Inf. Manag. 2(35), 137–144 (2015)
7. IBM Analytics. https://www.ibmbigdatahub.com/infographic/four-vs-big-data. Last accessed 05 Feb 2019
8. Bhadani, A., Jothimani, D.: Big data: challenges, opportunities and realities. In: Singh, M.K.,
Kumar, D.G. (eds.) Effective Big Data Management and Opportunities for Implementation
2016, pp. 1–24. IGI Global, Pennsylvania (2016)
9. Rajkumar, B., Rodrigo, C.A., Vahid, D.: Big Data Principles and Paradigms. Morgan
Kaufmann, Cambridge (2016)
10. Sakr, S.: Big Data 2.0 Processing Systems: A Survey. Springer Publishing Company,
Incorporated (2016)
11. Curry, E., Freitas, A., Ngonga, A.: D2.2.2 Final Version of Technical White Paper. Big Data
Public Private Forum, pp. 2–8 (2014)
12. Lehmann, D., Fekete, D., Vossen, G.: Technology Selection for Big Data and Analytical
Applications. European Research Center for Information Systems No. 27. (2016)
13. Cloudera Engineering Blog. https://blog.cloudera.com/blog/2014/09/getting-started-with-
big-data-architecture/. Last accessed 04 Feb 2019
14. Rubin, A.: Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs.
Apache Spark. https://www.percona.com/blog/2017/03/17/column-store-database-benchmarks-
mariadb-columnstore-vs-clickhouse-vs-apache-spark/. Last accessed 17 Jan 2019
15. Yishan, L., Sathiamoorthy, M.: A performance comparison of SQL and NoSQL databases.
In: Communications, Computers and Signal Processing. New Zealand (2013)
16. Altinity. ClickHouse for Time Series. https://www.altinity.com/blog/clickhouse-for-time-
series. Last accessed 05 Jan 2019
17. Moniruzzaman, A., Hossain, S.: NoSQL database: new era of databases for big data analytics-
classification, characteristics and comparison. Int. J. Database Theory Appl. 6(4) (2013)
18. Leventov, R.: Comparison of the Open Source OLAP Systems for Big Data: ClickHouse,
Druid and Pinot. https://medium.com/@leventov/comparison-of-the-open-source-olap-
systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7. Last accessed 07 Jan 2019
19. Yandex. Distinctive Features of ClickHouse. https://clickhouse.yandex/docs/en/introduction/
distinctive_features/. Last accessed 07 Jan 2019
20. Oracle. Database Limits. https://docs.oracle.com/cd/B28359_01/server.111/b28320/limits.
htm#REFRN004. Last accessed 07 Jan 2019
Economics and Finance
Asymptotically Exact Minimizations
for Optimal Management of Public
Finances

Jean Koudi1 , Babacar Mbaye Ndiaye2(B) , and Guy Degla1


1 Institute of Mathematics and Physical Sciences, University of Abomey Calavi, Porto-Novo, Benin
{jean.koudi,gdegla}@imsp-uac.org
2 Laboratory of Mathematics of Decision and Numerical Analysis, University of Cheikh Anta Diop - Dakar, 45087 Dakar-Fann, Senegal
babacarm.ndiaye@ucad.edu.sn

Abstract. The recently published algorithms for asymptotically exact minimizations
in Karush-Kuhn-Tucker methods have been shown to be effective on linear or
non-linear differentiable optimization problems under inequality constraints. The
design of the algorithms as well as the test results on reference and academic problems
are published in [1, 2]. The purpose of this paper is to use these algorithms to solve a
specific large-scale problem: the optimal management of public finances. We give a
formal study of the design of the models interpreting this problem and solve it with our
algorithms to determine, at each moment, the optimal revenue and the optimal
expenditure that the Government of a State must realize in order to achieve its goals.
The numerical results obtained testify to the efficiency of our algorithms on large-scale
problems.

Keywords: Augmented Lagrangian methods · Numerical experiments · Approximate KKT point · Public finances · Adjustment costs

1 Introduction

The optimal management of public finances is a problem involving several param-


eters such as: the tax rate, the interest rate, the growth rate of the economy,
the GDP per capita, etc. and is subject to several constraints such as: revenue
constraint, spending constraint, Inter-temporal Budget Constraints (IBC), etc.
The main actor (the Government) seeks an optimal decision for managing the
resources of its State in order to achieve its objectives, which are: to dampen economic
oscillations, to increase the level of employment, etc.
On these concerns, several researchers like Barro (1979) in [3], Roubini and
Sachs (1989) in [5], etc. have proposed models that can interpret the phe-
nomenon. The model that we will use in this work is that of Jean-François and


Eric published in 1992 in [4]. Modeling the optimal management of public finances
gives an optimization problem of the same type as problem (15), whose constraints
are all differentiable inequality constraints. This type of problem is a good candidate
for testing our algorithms. A theoretical resolution assuming that all constraints are
inactive was made in [4]; this makes it possible to solve only the system (17). We
return to this resolution and then numerically solve the problem in the general case
where the constraints are active or inactive. It is clear that the theoretical (or manual)
resolution of the system (16) in the case where the constraints are active or inactive is
almost impossible because of the multiplicity of sub-systems generated by the
exclusion condition.
We briefly describe the formulation of the public finances management problem
as follows (see [4] for more details):
The objective function: The government has a goal (a long-term target) for both
expenditures and revenues (g*, t*). The existence of significant adjustment costs leads
the government to gradually correct its situation (in terms of revenue and expenditure)
so as to reach its target, in the absence of constraints. We admit here an exponential
adjustment of the revenues of the form:

t_s^{opt} = \theta_1 t_{s-1} + (1 - \theta_1)\, t^*   (1)

where θ1 ∈ [0, 1], and ts means public revenue divided by GDP-per capita at time
s. In the same way, we admit here an exponential adjustment of the expenses of
the form:

g_s^{opt} = \theta_0 g_{s-1} + (1 - \theta_0)\, g^*   (2)

where θ0 ∈ [0, 1], gs means public expenditure divided by GDP-per capita at


time s.
The government bears, on the revenue side, an instantaneous quadratic cost relative to
the optimal trajectory, written at each time s as:

c_t(s) = (1 - c)(t_s - t_s^{opt})^2 = (1 - c)\left[(t_s - t^*) + \theta_1 (t_{s-1} - t^*)\right]^2   (3)

In a similar way, the government bears, on the expenditure side, a cost symmetrical to
that on the public revenue, written at each time s as:

c_g(s) = c\,(g_s - g_s^{opt})^2 = c\left[(g_s - g^*) + \theta_0 (g_{s-1} - g^*)\right]^2   (4)

The cost function to be minimized is:

f(s, g_s, t_s) = \sum_{s=1}^{\infty} \left(\frac{1+n}{1+\tau}\right)^s \Big\{ (1-c)\left[(t_s - t^*) + \theta_1 (t_{s-1} - t^*)\right]^2 + c\left[(g_s - g^*) + \theta_0 (g_{s-1} - g^*)\right]^2 \Big\}   (5)

where \frac{1+n}{1+\tau} is the discount factor expressed in terms of GDP per capita.
The value n is the nominal growth rate of the economy in the “long run”, forecast to be
constant for s > 0, and \tau is the discount rate. Without adjustment costs, the
objective function of the government becomes:
f(s, g_s, t_s) = \sum_{s=1}^{\infty} \left(\frac{1+n}{1+\tau}\right)^s \left[ (1-c)(t_s - t^{opt})^2 + c\,(g_s - g^{opt})^2 \right]   (6)

The constraints: The constraints are multiple. The government is subject to


constraints on expenditures g and revenues t: public spending is always positive
and the tax rate is lower than 1. Additionally, a low level of public spending will
result in a decline in the growth rate, as will a high tax rate. The
aim of this work is not to study these interdependencies; it is assumed that they
are negligible in the vicinity of the target. It is therefore assumed, to simplify,
that there is a level of incompressible public expenditure ginf and a maximum
tax rate tsup , the latter being, for example, the rate that maximizes revenues
in the Laffer curve. So at the time s we have:

t_s \le t_{sup}, \qquad -g_s \le -g_{inf}   (7)
In addition to all this, the government is also subject to the Inter-temporal Budget
Constraint (IBC). At instant s, in the case where everything is perfect, this constraint is
written:

B_s - B_{s-1} = G_s - T_s + r B_{s-1} = D_s + r B_{s-1}   (8)
where B_s, T_s and G_s denote respectively the public debt, the public revenue and the
public expenditure at time s, and r is the interest rate on the public debt. Following this,
the inter-temporal budget constraint (IBC) of the government, expressed as a share of
GDP per capita, is written:

\frac{B_s}{Gdp_s} - \frac{B_{s-1}}{(n+1)\,Gdp_{s-1}} = \frac{D_s}{Gdp_s} + \frac{r\,B_{s-1}}{(n+1)\,Gdp_{s-1}}   (9)
where Gdp_s = Gdp_{s-1} + n\,Gdp_{s-1} = (n+1)\,Gdp_{s-1}. Denote b_s = \frac{B_s}{Gdp_s} and d_s = \frac{D_s}{Gdp_s}; we have b_s - \frac{b_{s-1}}{n+1} = d_s + \frac{r}{n+1}\, b_{s-1}. We obtain

b_s = d_s + \frac{r+1}{n+1}\, b_{s-1}   (10)
Let us solve Eq. (10), recalling that b_{-1} is the initial debt of the Government as a
share of GDP per capita:

b_0 = d_0 + \frac{r+1}{n+1}\, b_{-1}   (11)

b_1 = d_1 + \frac{r+1}{n+1}\, b_0 = d_1 + \frac{r+1}{n+1}\, d_0 + \left(\frac{r+1}{n+1}\right)^2 b_{-1}   (12)

\vdots

b_m = \sum_{s=0}^{m} \left(\frac{r+1}{n+1}\right)^{m-s} d_s + \left(\frac{r+1}{n+1}\right)^{m+1} b_{-1}   (13)

Multiplying each member of Eq. (13) by \left(\frac{r+1}{n+1}\right)^{-m}, we obtain

\left(\frac{n+1}{r+1}\right)^m b_m = \sum_{s=0}^{m} \left(\frac{n+1}{r+1}\right)^s d_s + \frac{r+1}{n+1}\, b_{-1}   (14)

Let us put \alpha = \frac{n+1}{r+1}. In the case where the interest rate is higher than the
rate of growth of the economy (n < r), we obtain at the infinite horizon the following
relation: \frac{b_{-1}}{\alpha} = \sum_{s=0}^{\infty} \alpha^s (-d_s) = \sum_{s=0}^{\infty} \alpha^s (t_s - g_s). In reality, we can distinguish four
cases of the constraint (IBC):

(i) The government is not able to achieve its objectives:
In this case, even if it is immediately in the most favorable situation (maximum tax,
minimum expenditure), the government is unable to repay its initial debt. We have:
\sum_{s=0}^{\infty} \alpha^s (t_{sup} - g_{inf}) < \frac{b_{-1}}{\alpha}. In this case we will say that the Government is not rich.
(ii) The government is able to realize its objectives immediately:
We have: \sum_{s=0}^{\infty} \alpha^s (t^* - g^*) > \frac{b_{-1}}{\alpha}. We will say in this case that the Government is rich enough.
(iii) The government is not rich and the budget constraint is active:
We have: \frac{b_{-1}}{\alpha} = \sum_{s=0}^{\infty} \alpha^s (t_s - g_s). We can also distinguish a fourth case that
generalizes the first and third cases as follows:
(iv) The government is not rich and the budget constraint (IBC) is not necessarily
active: We have: \sum_{s=0}^{\infty} \alpha^s (t_s - g_s) \le \frac{b_{-1}}{\alpha}.

To sum up, the problems to be solved fall into two categories: the case where the
Government does not have an adjustment cost and the case where it has one. According
to the intertemporal budget constraint, each category contains three types of problems,
as follows:
Category 1: the Government does not have an adjustment cost; this category groups
problems № 1, 2 and 3. Category 2: the Government has an adjustment cost; this
category groups problems № 4, 5 and 6.
In this paper, we solve any optimization problem of the type

(P): \quad \min_{x \in K} f(x)   (15)

with K = \{ x \in \mathbb{R}^n : g_i(x) \le 0 \ \forall i \in I = \{1, \dots, m\} \}, where f and all the g_i are
differentiable, using Karush-Kuhn-Tucker conditions. A first-order condition
for a point x^* to be optimal is that it satisfies the following system:




\begin{cases} \nabla f(x) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x) = 0 \\ \lambda_i\, g_i(x) = 0 \quad \forall i \in I \quad \text{(exclusion condition)} \\ \lambda_i \ge 0 \quad \forall i \in I \end{cases}   (16)

When the constraints of the problem are equality constraints, the exclusion condition
in (16) becomes obvious. So, finding a solution of problem (15) amounts to finding
a point satisfying the system (16) without the exclusion condition. Our terminology is
based on [1,2]. Additionally, when none of the constraints of problem (15) is active at
an optimal point, the exclusion condition implies that all KKT multipliers are equal to
zero (λ_i = 0 ∀ i). In this case, we have:

∇f (x) = 0 (17)

Equation (17) is then straightforward to solve. In this work, we solve the problems in
each case (all constraints inactive, that is to say g_i(x) < 0 ∀ i, or constraints not
necessarily active, g_i(x) ≤ 0 ∀ i). We compare the solutions obtained in each case and
identify those that reflect reality. The principle of our algorithms is to find x ∈ K such that

\| P_{\Omega}[x - \nabla_x L(x, \lambda, \rho)] - x \|_{\infty} = 0   (18)

where L(x, λ, ρ) is the augmented Lagrangian associated with problem (15) and

\nabla_x L(x, \lambda, \rho) = \nabla f(x) + \sum_{i \in I} (\lambda_i + \rho_i g_i(x)) \nabla g_i(x)   (19)

where P_Ω is the projection operator onto a closed set Ω such that K ⊂ Ω.


The algorithms, as well as the results of tests on reference and academic problems,
were published in [1]. As stated in the abstract, the goal of this paper is to test
the new algorithms on large-scale problems.
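To make the stopping test (18)–(19) concrete, the sketch below evaluates the residual ||P_Ω[x − ∇_x L(x, λ, ρ)] − x||_∞ with Ω taken as a box, so that the projection is a simple componentwise clip. It is a minimal illustration of the criterion, not the published algorithm of [1]; the example problem and all names are assumptions.

```python
# Sketch: residual of the projected-gradient condition (18) for the augmented
# Lagrangian (19), with Omega chosen as a box [lower, upper] so that the
# projection P_Omega is a componentwise clip. Purely illustrative.
import numpy as np

def kkt_residual(x, lam, rho, grad_f, constraints, grads_g, lower, upper):
    # gradient of the augmented Lagrangian, as in (19)
    grad_L = grad_f(x).copy()
    for lam_i, rho_i, g_i, grad_g_i in zip(lam, rho, constraints, grads_g):
        grad_L += (lam_i + rho_i * g_i(x)) * grad_g_i(x)
    projected = np.clip(x - grad_L, lower, upper)      # P_Omega[x - grad L]
    return np.max(np.abs(projected - x))               # infinity norm of (18)

# Tiny usage example: minimise (x0-1)^2 + (x1-2)^2 subject to x0 + x1 - 2 <= 0.
grad_f = lambda x: 2 * (x - np.array([1.0, 2.0]))
g = [lambda x: x[0] + x[1] - 2.0]
grad_g = [lambda x: np.array([1.0, 1.0])]
x = np.array([0.5, 1.5])
print(kkt_residual(x, [0.8], [10.0], grad_f, g, grad_g, lower=-5.0, upper=5.0))
```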
The following section describes the main results for solving the problems,
where the theoretical results and numerical experiments are presented. Finally,
summary and conclusions are provided in Sect. 3.

2 Main Results
For the resolution of each problem, we define a horizon of size N, which can be chosen
very large depending on whether we decide to compute the optimal values by year, by
month or by day. Recall that each value t_s^{opt} represents the optimal value at date s.
Each problem depends on two variables (quantities), the revenue t and the expenditure
g. Let us define a new vector x = (t, g); our problems then depend only on this vector x.
We consider a finite horizon where time is discretized as S = \{0, 1, \dots, N-1\}; we have
t = \{t_0, t_1, \dots, t_{N-1}\}, g = \{g_0, g_1, \dots, g_{N-1}\} and x = \{t_0, t_1, \dots, t_{N-1}, g_0, g_1, \dots, g_{N-1}\}.
Then x = \{x_0, x_1, \dots, x_{2N-1}\}. The constraints are linear and defined by:

h_i(x) = x_i - t_{sup,i} \quad \forall i \in \{0, 1, \dots, N-1\}   (20)

h_i(x) = -x_i + g_{inf,i-N} \quad \forall i \in \{N, N+1, \dots, 2N-1\}   (21)

Let us denote I = \{0, 1, \dots, 2N-1\} and h^1(x) = (h_i(x))_{i \in I}. We have \nabla h^1(x) = \begin{pmatrix} I_N & O \\ O & -I_N \end{pmatrix}, where I_N is the identity matrix of order N. Apart from the feasibility
conditions, problems № 1 and 4 have the same constraints. Problems № 2 and 5 have in
addition the equality constraint defined by:

h^2(x) = \sum_{s=0}^{N-1} \alpha^s (x_s - x_{N+s}) - \frac{b_{-1}}{\alpha} = 0   (22)
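For the discretized problem, the constraints (20)–(22) can be assembled directly from the vector x = (t, g). The sketch below builds them for a small horizon with the bounds and rates quoted later in Sect. 2.2 (t_sup = 60%, g_inf = 40%, n = 7%, r = 11%, b_{-1} = 12%); it only illustrates the data structures, not the solver itself.

```python
# Sketch: building the inequality constraints (20)-(21) and the budget equality
# (22) for the discretized problem, with x = (t_0..t_{N-1}, g_0..g_{N-1}).
# Bounds and rates follow the values quoted in Sect. 2.2; the code is illustrative.
import numpy as np

N = 12                                   # monthly option: 12 periods
t_sup, g_inf = 0.60, 0.40                # bounds on revenue and expenditure
n_rate, r_rate, b_init = 0.07, 0.11, 0.12
alpha = (1.0 + n_rate) / (1.0 + r_rate)  # discount factor of the IBC

def h1(x):
    """Inequality constraints (20)-(21): h_i(x) <= 0."""
    t, g = x[:N], x[N:]
    return np.concatenate([t - t_sup, g_inf - g])

def h2(x):
    """Equality constraint (22): discounted budget balance."""
    t, g = x[:N], x[N:]
    s = np.arange(N)
    return np.sum(alpha**s * (t - g)) - b_init / alpha

x0 = np.concatenate([np.full(N, 0.50), np.full(N, 0.47)])   # start at the targets
print(h1(x0).max(), h2(x0))
```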

2.1 Theoretical Results


Recall that without the feasibility conditions, the problems № 1 and 4 have the
same constraints. Similarly, the problems № 2 and 5 have the same constraints
as well as the problems № 3 and 6. Applying KKT optimality condition to each
category of problems, we obtain the following systems:
KKT optimality condition for the problems № 1 and 4

\begin{cases} \nabla f(x) + \sum_{s=0}^{2N-1} \lambda_s \nabla h^1_s(x) = 0 \\ \lambda_s h^1_s(x) = 0 \quad \forall s \in I \\ \lambda_s \ge 0 \quad \forall s \in I \end{cases}   (23)

KKT optimality condition for the problems № 2 and 5



\begin{cases} \nabla f(x) + \sum_{s=0}^{2N-1} \lambda_s \nabla h^1_s(x) + \mu \nabla h^2(x) = 0 \\ \lambda_s h^1_s(x) = 0 \quad \forall s \in I \\ h^2(x) = 0 \\ \lambda_s \ge 0 \quad \forall s \in I \\ \mu \in \mathbb{R} \end{cases}   (24)

KKT optimality condition for the problems № 3 and 6



\begin{cases} \nabla f(x) + \sum_{s=0}^{2N-1} \lambda_s \nabla h^1_s(x) + \lambda' \nabla h^2(x) = 0 \\ \lambda_s h^1_s(x) = 0 \quad \forall s \in I \\ \lambda' h^2(x) = 0 \\ \lambda_s \ge 0 \quad \forall s \in I \\ \lambda' \ge 0 \end{cases}   (25)

Consider the assumptions made by Loué Jean-François and Jondeau Eric in 1992 in [4]
(that is to say, the revenues will never reach their fixed upper level and the expenditures
will always stay above their fixed lower level): for problems № 2 and 5 we then have
h^1_s(x) < 0 for all s ∈ I. With this assumption the problems can easily be solved,
because λ_s = 0 ∀ s ∈ I (KKT conditions). Mathematically, these optimal solutions will
be interior points. So, the system (24) becomes:

\begin{cases} \nabla f(x) + \mu \nabla h^2(x) = 0 \\ h^2(x) = 0 \\ \mu \in \mathbb{R} \end{cases} \iff \begin{cases} 2\gamma^s (1-c)(t_s - t^*) + \mu \alpha^s = 0 \quad \forall s \in I \\ 2\gamma^s c\,(g_s - g^*) - \mu \alpha^s = 0 \quad \forall s \in I \\ \sum_{s \in I} \alpha^s (t_s - g_s) - \frac{b_{-1}}{\alpha} = 0 \\ \mu \in \mathbb{R} \end{cases}   (26)

which gives:

t_s - t^* = -\frac{\mu}{2(1-c)} \left(\frac{\alpha}{\gamma}\right)^s, \qquad g_s - g^* = \frac{\mu}{2c} \left(\frac{\alpha}{\gamma}\right)^s, \qquad t_s - g_s = \frac{\mu}{2c(c-1)} \left(\frac{\alpha}{\gamma}\right)^s + (t^* - g^*)

Replacing t_s - g_s in the equality constraint of the system (26), we obtain the following equation:

\sum_{s \in I} \alpha^s \left[ \frac{\mu}{2c(c-1)} \left(\frac{\alpha}{\gamma}\right)^s + (t^* - g^*) \right] - \frac{b_{-1}}{\alpha} = 0   (27)

and

\mu = \frac{2c(c-1)\left( b_{-1} - (t^* - g^*) \sum_{s \in I} \alpha^{s+1} \right)}{\sum_{s \in I} \dfrac{\alpha^{2s+1}}{\gamma^s}}   (28)

Thus t_s and g_s are now determined explicitly by:

t_s = t^* - \frac{c \left(\frac{\alpha}{\gamma}\right)^s \left( b_{-1} - (t^* - g^*) \sum_{s \in I} \alpha^{s+1} \right)}{\sum_{s \in I} \dfrac{\alpha^{2s+1}}{\gamma^s}}   (29)

g_s = g^* + \frac{(c-1) \left(\frac{\alpha}{\gamma}\right)^s \left( b_{-1} - (t^* - g^*) \sum_{s \in I} \alpha^{s+1} \right)}{\sum_{s \in I} \dfrac{\alpha^{2s+1}}{\gamma^s}}   (30)

Let us recall that, for 0 < α < 1,

\sum_{s=1}^{\infty} \alpha^s = \frac{\alpha}{1-\alpha}.   (31)

Under the assumption that the discount rate is lower than the interest rate (τ < r) and
that the growth rate is lower than the interest rate (n < r), applying formula (31) we
obtain: \sum_{s=1}^{\infty} \alpha^{s+1} = \frac{\alpha^2}{1-\alpha} and \sum_{s=1}^{\infty} \frac{\alpha^{2s+1}}{\gamma^s} = \frac{\alpha^3}{\gamma - \alpha^2}.
Furthermore, \left(\frac{\alpha}{\gamma}\right)^s converges to zero. This leads to the conclusion that
revenues and expenses tend towards their targets (t^*, g^*) on the infinite horizon.
Note that when we leave the assumptions made by Loué Jean-François and Jondeau
Eric on the non-saturation of the budget and revenue constraints, the calculations become
complicated and the problems can only be solved numerically.
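Under the same non-saturation assumption, the closed-form trajectories (29)–(30) can be evaluated directly once the parameters are fixed. The sketch below does this on a finite horizon with the data of Sect. 2.2; the weight c = 0.5 and the identification γ = (1+n)/(1+τ) are assumptions made for the illustration, since the paper does not list c explicitly.

```python
# Sketch: evaluating the closed-form optimal revenues and expenditures (29)-(30)
# on a finite horizon, with the data of Sect. 2.2.  The weight c = 0.5 and the
# identification gamma = (1+n)/(1+tau) are assumptions made for illustration.
import numpy as np

n, r, tau = 0.07, 0.11, 0.07
b_init = 0.12                       # initial debt b_{-1}
t_star, g_star = 0.50, 0.47         # revenue and expenditure targets
c = 0.5                             # assumed weight between the two costs
alpha = (1 + n) / (1 + r)
gamma = (1 + n) / (1 + tau)

s = np.arange(60)                   # horizon used for the sums over s in I
num = b_init - (t_star - g_star) * np.sum(alpha**(s + 1))
den = np.sum(alpha**(2 * s + 1) / gamma**s)

t_opt = t_star - c * (alpha / gamma)**s * num / den        # formula (29)
g_opt = g_star + (c - 1) * (alpha / gamma)**s * num / den  # formula (30)

print(t_opt[:3], g_opt[:3])         # first few periods of the optimal paths
print(t_opt[-1], g_opt[-1])         # both approach the targets t*, g* as s grows
```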

2.2 Numerical Experiments


For the simulations, we take as basic data: the growth rate n = 7%, the interest
rate r = 11%, the discount rate τ = 7%, the initial debt b_{-1} = 12%, the initial
revenue t_{-1} = 31%, and the initial expenses g_{-1} = 43%. The expenditure target is
g* = 47%, the revenue target is t* = 50%, g_{inf} = 40% and t_{sup} = 60%.
The Government may choose to determine the optimal forecasts by semester,
quarter, month, week or day of the year. The discretization of time will depend
on the chosen option. For a quarterly option, the number of variables in each problem
type is N_v = 8, and the number of constraints is N_c = 8 for problems № 1 and 4 and
N_c = 9 for the other problems. In the case of a monthly option, N_v = 24, with
N_c = 24 for problems № 1 and 4 and N_c = 25 for the others. For a daily option,
N_v = 730.
In this study, we present here only the monthly option and daily option.
For each problem, the vector t_opt is composed of the optimal revenues and g_opt of the
optimal expenses for each given period of the year. A government that chooses the
revenue vector t_opt and the expense vector g_opt incurs an annual cost ε such that
ε² = f(t_opt). We define a periodic average cost by ε_m = 2ε/N_v. The following tables
show the results for the monthly option, where time is the elapsed time of the numerical
resolution, together with the curves of the data evolutions.
Problem № 1
topt = [51.494, 50.984, 50.153, 49.384, 49.267, 51.475, 48.266, 50.724, 51.266,
50.613, 49.225, 49.790];
gopt = [44.616, 45.419, 45.175, 44.897, 44.740, 45.420, 46.748, 47.082, 43.011,
44.722, 45.059, 45.170];
ε2 = 30.517; εm = 0.460; time = 2.260 s.
Problem № 2
topt = [48.656, 48.844, 48.574, 48.368, 48.629, 48.614, 48.947, 48.373, 48.840,
48.450, 48.783, 48.520];

gopt = [48.702, 48.982, 48.393, 48.472, 48.798, 48.567, 48.762, 48.467, 48.695,
48.631, 48.507, 48.613];
ε2 = 26.207; εm = 0.426; time = 1.534 s.
Problem № 3
topt = [48.568, 48.364, 48.329, 48.880, 48.904, 48.636, 49.073, 48.585, 48.724,
48.838, 49.002, 48.624];
gopt = [48.814, 48.820, 49.039, 48.591, 49.014, 48.889, 48.665, 48.648, 48.233,
48.459, 48.534, 48.729];
ε2 = 26.735; εm = 0.430; time = 2.616 s.
Problem № 4
topt = [45.099, 50.797, 50.512, 51.837, 51.251, 51.267, 52.281, 50.864, 48.412,
50.105, 50.482, 49.495];
gopt = [44.806, 45.161, 45.689, 44.587, 44.725, 46.276, 45.995, 43.831, 44.266,
42.305, 44.079, 45.076];
ε2 = 33.984; εm = 0.485; time = 4.244 s.
Problem № 5
topt = [42.965, 46.971, 47.602, 48.367, 48.692, 47.647, 47.996, 49.729, 49.295,
49.079, 49.781, 48.699];
gopt = [46.848, 47.105, 47.533, 48.464, 48.419, 48.546, 48.979, 49.119, 46.914,
47.756, 48.267, 48.490];
ε2 =12.508;εm = 0.294; time = 1.582 s.
Problem № 6
topt = [42.934, 48.117, 48.816, 49.034, 48.272, 49.076, 48.987, 49.043, 49.434,
49.200, 49.572, 48.871];
gopt = [47.206, 47.948, 48.703, 48.959, 49.171, 49.157, 48.473, 48.993, 47.893,
47.374, 48.333, 48.768];
ε2 = 13.456; εm = 0.305; time = 1.779 s.
For the daily option: the Government can decide to evaluate for each day the
optimal level of its revenues and the optimal level of its expenses.
Recall that the size of the problem here is such that the number of variables
N_v = 730, and the number of constraints is N_c = 730 for problems № 1 and 4 and
N_c = 731 for the others. All the solutions obtained are feasible. Because of the large
size of the data t_opt and g_opt (vectors of 365 components each), we present only
the costs borne by the government, the elapsed time of the numerical resolution, and the
curves of the data evolutions for each type of problem. The results are presented in
Table 1.
The results obtained by manual calculation (theoretical results) under the assumption
of [4] that the revenues never reach their fixed upper level and that the expenditures
always stay above their fixed lower level (h^1_s(x) < 0 for all s ∈ I) show that the optimal
choices proposed to the government by the model tend towards the objective aimed at
by the latter (see Eqs. (29) and (30)). This assumption can quickly be invalidated by
natural phenomena that are sometimes unpredictable. Also, in the countries of the third
world, where the

Table 1. Simulations results of problems 1–6

Problem | f(x_opt) = ε² | Time (min) | ε_m
№ 1 | 699.783 | 32.894 | 0.072
№ 2 | 788.219 | 22.347 | 0.077
№ 3 | 384.321 | 43.970 | 0.053
№ 4 | 384.321 | 16.135 | 0.053
№ 5 | 331.647 | 35.047 | 0.049
№ 6 | 327.664 | 33.842 | 0.049

budget is mainly based on taxes, the receipts are often lower than the hoped-for level,
and to avoid further indebtedness the government must reduce its expenses to the fixed
lower level during certain periods of its exercise. It is therefore preferable to consider
the cases where the receipts may or may not reach their fixed upper level and where the
expenses may or may not reach their fixed lower level. The results in these different
cases are those of the numerical tests. Comparing the results obtained in the case where
the government has an adjustment cost with the case where it does not, we immediately
note that the choices are concentrated around the adjustment and converge towards the
fixed objective, which is not the case when the government does not have an adjustment
cost. We note a higher concentration of data around the adjustment than in the other
cases. This shows that, in practice, choosing this option minimizes the fluctuations in
the economy of the country.

3 Conclusion
The results obtained are satisfactory and reflect reality. The goal of the optimal
management model used in this work is to find solutions that tend towards the
objectives at a lower cost in the presence of an adjustment cost. This is what the manual
results (see Eqs. (29) and (30)) showed us. In the same way, the numerical resolutions
have given us the same types of results and demonstrate the difference between
solutions without adjustment costs and solutions with adjustment costs. It should also be
noted that the fourth case of the intertemporal budget constraint that we introduced in
this work has further improved the results: the results of these problems are more
concentrated around the adjustment than in the other cases, so this assumption decreases
fluctuations in the economy. All these results show that the algorithms have a good
ability to solve large problems.

References
1. Jean K., Guy D., Babacar, M.N., Mamadou, K.T.: Algorithms for asymptotically
exact minimizations in Karush-Kuhn-Tucker methods. J. Math. Res. 10(2) (2018).
https://doi.org/10.5539/jmr.v10n2pxx

2. Guy, D., Jean, K.: Les multiplicateurs de Lagrange en dimension finie, Edition EUE
(Novembre 2013)
3. Barro, R.J.: On the determination of the public debt. J. Polit. Econ. 87(5) (1979)
4. Jean-François, L., Eric, J.: La gestion optimale des finances publiques en présence
de coûts d’ajustement, Economie & prévision, No 104, 1992-3. Politique budgétaire,
taux d’intérêt, taux de change, pp. 19–38 (1992). https://doi.org/10.3406/ecop.
1992.5292
5. Roubini, N., Sachs, J.D.: Political economic determinants of budget deficit in the
industrial democracies. Eur. Econ. Rev. 33 (1989)
Features of Administrative and Management
Processes Modeling

Ryskhan Satybaldiyeva(&) , Raissa Uskenbayeva,


Aiman Moldagulova , Zuldyz Kalpeyeva, and Aygerim Aitim

International Information Technology University, 050040 Almaty, Kazakhstan


r.satybaldiyeva@iitu.kz, uskenbaevar@gmail.com,
aiman.moldagulova@gmail.com,
zhuldyz.kalpeeva@gmail.com, aigera_tg@mail.ru

Abstract. Business processes are created to generate profit, i.e. they produce
added value and a product that has value and consumer qualities. The objectives of
the administrative and managerial processes of government bodies are somewhat
different: administrative and management processes at the state level are mainly
focused on controlling access to an activity. The purpose of this article is to examine
the features of administrative and management processes in terms of requirements.
The article dwells upon the process approach in government bodies using the
example of a university. Descriptions of administrative and management processes
are given on the example of the licensing of scientific activities. The systems of
standardization of administrative and management processes are presented, including
a description of approaches to the creation of process models, taking into account the
specifics of processes at a university. In addition, the article presents models of
university processes in the IDEF0, IDEF3 and eEPC notations.

Keywords: Business process · Administrative processes · Management processes · University management · Accreditation · ARIS · IDEF0 · Modeling business process

1 Introduction

Nowadays it has become very difficult to conduct the activities of companies and
government bodies in the context of the globalization of processes and market volatility.
This has led to different new management concepts. Among such concepts, the most
promising is the process approach to management, or the process management of
companies and government bodies.
According to this concept, the activity of a company or government body is
represented as a set of processes (the activity is divided into separate processes and a
network of processes is created), each of which is functionally autonomous at a certain
level but interconnected with the others by the subject of work, functionally and
informationally [1].
Accordingly, business process models are still very simplistic: a business process
involves a sequence of chains of actions or operations leading to a goal. Such a
simplified business process model is not adequate to the actual processes and all the
actions required for the production of valuables: products and services. The consequence
of a simplified view of the business process is low productivity, high production costs,
and poor controllability, flexibility, agility and reliability of the operation of the business
process.
It should be noted that well-known business process management methodologies
and technologies are focused on automating existing business processes, improving the
efficiency of management methods and systems, and the operation of business pro-
cesses [2]. All these factors require a systematic study of the mechanisms for building a
business process and operation, including a more in-depth analysis and synthesis of
business process components with further definition of effective methods for con-
structing formal models describing various aspects of business process.
Business processes in government bodies are administrative and management
processes. Today, administrative and managerial processes in government bodies are
mostly built and automated intuitively and subjectively. There is no theoretical basis or
coherent methodology, and no methods and tools, for building unified and dynamic
business processes. It should be noted that there are currently no precisely verified and
developed universal methods that can help any “sick” enterprise solve the issues of
management system optimization [3].
This article is devoted to building a business process model at the International
Information Technology University (IITU). IITU is a relatively young university. It
was founded in 2009. Despite this, it has already become well-known in the labor
market.

2 Modeling of Administrative and Management Processes


on the Example of University Business Processes

When switching to a process-oriented management system, an organization obtains a single, flexible and universal management system [3]. However, it should be remembered that a process-oriented management system suits and brings tangible benefits to organizations operating in a dynamic and actively developing competitive market. It is advisable to introduce such a management model in universities [4].
Modern university management means activities aimed at implementing business
processes with the highest efficiency under certain restrictions (human, material,
intangible and financial resources).
In order to move to process management, it is necessary to formalize all business
processes, understand which of them are necessary, how they are organized, and how
to control their effectiveness.
The task of formalization is always defined within an organization's system of standards. By the standardization of processes we understand the complex of processes, methods and elements of the organizational structure that ensure the timely development, commissioning, performance monitoring and timely withdrawal of the regulatory and methodological documents of the organization.
The system of standardization of business processes includes a description of
approaches to the creation of process models. There is a specificity to the description of
processes at the university [5, 6].

2.1 Tools for Business Modeling and Analysis


The tasks of modeling the administrative, managerial and business processes are solved
by business modeling and analysis tools. There are quite a few such tools in the world,
and they all have different functionality and use different methodologies [7, 13].
To select the desired tool, let us consider the following requirements:
• visibility: the model will be used by users who are not familiar with the notation, so the model should be intuitive;
• completeness of modeling, i.e. clear rules for dividing the model into levels of detail and for the interaction and interrelation between them;
• support for different standards for creating individual diagrams;
• ability to support the requirements of management processes;
• ability to generate reporting documentation based on the created model (job descriptions, administrative regulations and the main content of a technical task);
• support for conducting certain kinds of model analysis.
Regarding the task of building models of administrative processes and the requirements above, one can examine the ARIS, IDEF0 and UML methodologies. The ARIS
methodology is integrated into the ARIS (Architecture of Integrated Information
Systems) tool developed by the German company IDS Scheer. ARIS supports four
types of models: organizational, functional, information, and management models. The
core business model of ARIS is eEPC (extended Event-driven Process Chain). The
eEPC notation is an extension of the IDEF3 methodology.

2.2 Description of the Process Environment and Creation of the “As Is”
Process Model
The main stages of building a business process model in this work are the definition of roles
and business functions [7]; binding roles to business functions; determination of the
order of business functions’ execution; adding events, documents and resources. At the
stage of modeling, the following results should be obtained: Process Map, Role Dia-
gram and an “As Is” Model of each considered business process [13].
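As an illustration of these stages, the following is a minimal Python sketch (all names, roles and document titles are hypothetical examples, loosely based on the research-activity chain discussed below) of how a fragment of an eEPC-style model can be captured as data before being drawn in a tool such as ARIS:

```python
from dataclasses import dataclass, field
from typing import List

# A hypothetical, simplified representation of an eEPC fragment:
# events trigger functions, functions are executed by roles and regulated by documents.

@dataclass
class Function:
    name: str
    role: str                                            # organizational role bound to the function
    documents: List[str] = field(default_factory=list)   # regulating documents

@dataclass
class EpcChain:
    start_event: str
    steps: List[Function]
    end_event: str

    def describe(self) -> None:
        print(f"Event: {self.start_event}")
        for f in self.steps:
            docs = ", ".join(f.documents) or "none"
            print(f"  Function: {f.name} | Role: {f.role} | Documents: {docs}")
        print(f"Event: {self.end_event}")

# Illustrative chain only; the actual roles and subprocesses are described in the text.
chain = EpcChain(
    start_event="Accreditation period approaching",
    steps=[
        Function("Planning of research activity", "Vice-rector for academic affairs"),
        Function("Organization of research activity", "Department heads"),
        Function("Doing of research activity", "Researchers"),
        Function("Control of research activity", "Rector responsible for accreditation",
                 ["QMS standard", "Accreditation rules"]),
    ],
    end_event="Accreditation documents prepared",
)
chain.describe()
```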
The process map, representing the connections between the various administrative and management processes of the university and their interaction, is shown in the IDEF0 model (see Fig. 1). The process map shows the main processes and the connections between them (for example, the dependence of one process on another, or the replacement of one process by another when a certain condition is met). It also presents the various documents that are passed from process to process or regulate their course (standards, instructions, etc.). The general activity of the university consists of the following processes: Management and organization of educational processes, Financial and Economic Management, Implementation of the administrative procedures of administrative and economic activities based on the Quality Management System (QMS), Research activities, and Educational activities. In turn, the process of research activities is divided into the following subprocesses: Planning of research activity of the university, Organization of research activity of the university, Doing of research activity of the university, and Control of research activity of the university [12]. Figure 2 shows the chain of the listed
subprocesses in eEPC notation. Despite its relatively small faculty and research staff, IITU implements this model. At the moment, the university has a license to conduct scientific activities. In Kazakhstan, organizations carrying out scientific and (or) scientific and technical activities must undergo an accreditation procedure for research activities once every 5 years.

Fig. 1. IDEF0 model “Process model of university management” (adapted from [8])

An “as is” model was built for each considered administrative and managerial process, describing the process and reflecting its course, actions, roles, document flow and points of possible optimization.
The “as is” model represents a process in the form of a single action (i.e. without disclosing the course of the process), for which the triggering event, the necessary input data, the result, interrupting events, compensating processes, regulatory documents and related business goals can be shown [11].
Administrative and management processes at the state level are mainly focused on controlling access to an activity. On the other hand, the university, as an organization carrying out scientific and technical activities, is subject to accreditation [10].
Licensing as one of the forms of access control to an activity and the process of
granting licenses are the most common type of administrative and management process
[14]. The study of automated services of the electronic government of the Republic of
Kazakhstan revealed the absence of an automated service “Accreditation of subjects of
scientific and technical activities” (State Service Code No. 00802002 [9, 10]).

Fig. 2. Process map “Research Activities of the University” in eEPC notation

By the time of the work on the detailed modeling of the state service “Accreditation of subjects of scientific and technical activity”, the roles had been created and a process map had been agreed. The “As Is” administrative process model was built on this basis; here the model was divided into three parts (see Figs. 3, 4, 5). The following roles were identified: the rector responsible for accreditation, the vice-rector for academic affairs, department heads and directors.

Fig. 3. The eEPC model of business processes “Accreditation of subjects of scientific and
(or) scientific and technical activities” part 1

Fig. 4. The eEPC model of business processes “Accreditation of subjects of scientific and
(or) scientific and technical activities” part 2

Fig. 5. The eEPC model of business processes “Accreditation of subjects of scientific and
(or) scientific and technical activities” part 3

3 Conclusion

The paper discusses an approach to improving the management business processes of an enterprise in which automation and information systems act as the main means. To this end, it is proposed to investigate the processes by which the university obtains its license for scientific activities. The results of the practical research made it possible to determine the sequence of work on building models of business processes when implementing process management at the university. There is a specificity to the description of processes at the university as an academic institution. At further stages of the research, the constructed model will be improved and the reference “To Be” model will be developed, together with recommendations and proposals for the optimization of the existing business processes.

Acknowledgments. This work is supported by the Ministry of Education and Science of the Republic of Kazakhstan (Grant No. AP05134071).

References
1. Saarsen, T., Dumas, M.: Factors Affecting the Sustained Use of Process Models. Business
Process Management Forum, pp. 193–209 (2016)
2. Pudovkina, S.G.: Analiz i optimizatsiya biznes-protsessov: uchebnoye posobiye., Chelya-
binsk: Izdatel’skiy tsentr YUUrGU (2013)
3. Varzunov, A.V., Torosyan E.K., Sazhneva L.P.: Analysis and Management of Business
Processes. ITMO University, St. Petersburg (2016)
4. Bedrina, S.L., Bogdanova, O.B., Kiykova, Y.V., Ovsyannikova, G.L.: Modeling of business
processes of higher education institution at introduction of process management. Open Edu.
1(102), 4–11 (2014) (in Russian)
5. Badica, A., Ionascu, C., Radu, C.: Elicitation of business process knowledge: a university
use case. In: Proceedings of the 7th Balkan Conference on Informatics Conference, Craiova,
Romania (2015). https://doi.org/10.1145/2801081.2801120
6. Klimenko, A.V.: Razrabotka metodicheskikh rekomendatsiy po opisaniyu i optimizatsii
protsessov v organakh ispolnitel’noy vlasti v ramkakh podgotovki vnedreniya EAR.
Vysshaya shkola ekonomiki, Moscow (2004)
7. Samuylov, K.Y., Chukarin, A.V., Yarkina N.V.: Biznes-protsessy i informatsionnyye
tekhnologii v upravlenii sovremennoy infokommunikatsionnoy kompaniyey. Alpina Pab-
lisher, Moscow (2016)
8. Asadullin, I., Samigullina, N., Zamaletdinov, R.: Optimizatsiya upravleniya vysshim
uchebnym zavedeniyem na osnove protsessnogo podkhoda. Rektor VUZa 8 (2015)
9. Reyestr gosudarstvennykh uslug Government of the Republic of Kazakhstan. http://ru.
government.kz/ru/postanovleniya. Last accessed 2 Mar 2015
10. Ob utverzhdenii Pravil akkreditatsii subyektov nauchnoy i (ili) nauchno tekhnicheskoy
deyatelnosti Government of the Republic of Kazakhstan. http://ru.government.kz/ru/
postanovleniya. Last accessed 3 Jun 2016
11. Samuilov, K., Serebrennikova, N., Chukarin, A., Yarkina, N.: Fundamentals of Formal
Methods for Describing Business Processes, vol. 1. RUDN, Moscow (2008)
12. Kovalova, M., Turcok, L.: The importance of business process modelling in terms of
university education. Int. J. Sci. Technol. Res. 3(12), 111–117 (2014)
13. Kamennova, M.S., Krokhin, V.V., Mashkov, I.V.: Business Process Modeling. Part 1:
Textbook and Practical Work for the Academic Bachelor Degree, vol. 1. Publisher Jurajt,
Moscow (2018)
14. Salikhzyanova, N., Gallyamova, D.: Metodologiya modelirovaniya biznes-protsessov
organizatsii. Vestnik Kazanskogo tekhnologicheskogo universiteta 15(5), 202–204 (2012)
Optimization Problems of Economic
Structural Adjustment and Problem of
Stability

Abdykappar Ashimov(B), Yuriy Borovskiy, and Mukhit Onalbekov

Kazakh National Research Technical University named after K. Satpayev,
22 Satpayev str., 050013 Almaty city, Kazakhstan
ashimov37@mail.ru, yuborovskiy@gmail.com, mukhon@list.ru

Abstract. The paper emphasizes the following problems: stability of mapping defined by the dynamic model in numerical optimization prob-
lems based on this model and numerical methods for evaluating such
stability. The use of such numerical methods is illustrated in solving prob-
lems of structural adjustment and economic growth of countries belong-
ing to an economic union. The solution of these problems is based on the
developed dynamic multi-country computable general equilibrium model
describing the functioning of nine regions, including five countries of the
Eurasian Economic Union (EAEU). The initial data of the model contain
the sets of consistent social accounts matrices (SAM) for the historical
and forecast periods built based on the following data: GTAP database,
national input-output tables, international trade, and IMF forecasts for
the main macroeconomic indicators. Using the proposed numerical meth-
ods, the stability of the mappings of the exogenous parameters values of
the calibrated model into the values of its endogenous variables was esti-
mated. The approach of selecting promising sectors of the economy of
the EAEU countries was proposed and implemented. Based on the model
a number of parametric control problems were formulated and solved for
evaluating the optimal values of fiscal policy instruments both at the
level of individual EAEU countries and the EAEU as a whole. These
problems are aimed at structural adjustment and economic growth by
stimulating the growth of output of selected promising sectors of the
EAEU countries.

Keywords: Stability of smooth mapping · Structural adjustment · Theory of parametric control of macroeconomic systems · Computable general equilibrium model

1 Introduction
As is known, solutions optimization in problems of real objects of different nature
represented by static or dynamic models and, in particular, in macroeconomic
modeling [1,2], is widely distributed. In mathematical formulation, such an opti-
mization problem is usually represented as a criterion—a function or functional
and constraints on endogenous variables based on the model, as well as constraints on optimizing arguments [3].
Various numerical methods are widely used [3] in order to solve these opti-
mization problems. In numerical methods of solving an optimization problem
based on a static or dynamic mathematical model, arbitrarily small changes
in the smooth mapping of its exogenous variables (including optimizing argu-
ments) to the solutions (including criteria) of this model can lead to changes in
the qualitative properties of this mapping [4]. This situation of instability of the
specified mapping does not correspond to the properties of the simulated real
object. Therefore, there is a problem of the validity of the transferring the results
of computational experiments to the corresponding described subject areas [5].
The approach of the theory of parametric control (TPC) was proposed in [6] to solve this problem. Methods for the numerical evaluation of the stability of differentiable mappings defined by the model, for the cases of immersion, submersion and submersion with a fold, were proposed within the framework of TPC. Numerical methods for estimating some stability indicators of the indicated mappings [7]
were also proposed in [6]. For the model, which is an autonomous dynamical
system, numerical methods for estimating its weak structural stability were pro-
posed in [6]. This property characterizes the preservation of the qualitative char-
acteristics of the phase portrait of the system with sufficiently small perturba-
tions of its vector field. The use of these methods for assessing the stability and
stability indicators of mappings is demonstrated in this paper.
A number of papers reviewed the assessment of the results of various struc-
tural adjustment policy scenarios using appropriate macroeconomic modeling [8–
14]. But in this and other well-known literature, the optimization problems of
economic restructuring of the regional union’s countries were not considered and
no estimates of the mappings given by the model were used.
In this paper, based on the developed dynamic computable general equilib-
rium model (hereinafter referred to as the Model), the TPC approaches were
illustrated in solving optimization problems of parametric control aimed at
restructuring and economic growth in the EAEU countries. The Model and the
results obtained on its basis were successfully tested for the possibility of their
practical application using the stability assessment of the mapping defined by
the Model and the stability indicators for such mapping.

2 The Model
The dynamic Model was built by developing the static computable general equilibrium (CGE) model Globe1 [1] and linking it to the data, on the basis of a developed conceptual description of the global economy.
First, we list some of the prerequisites for a meaningful description.
The world economy is presented in the form of the functioning of interact-
ing agents of selected Regions of the world economy: Producers (industries),
Households, The State. Agent-region Globe imports transportation services and
exports them to all regions when importing each type of goods from each region
to every other region.
Conceptual description of the world economy contains statements of a number of optimization problems for agents with corresponding first-order condi-
tions, other equations describing the functions of agents, balance ratios for prices
and quantities (real indicators measured at seller prices), internal balances on
government accounts and external balances on trading accounts.
In the conceptual description, a system of composite endogenous prices is
used to ensure the fulfillment of annual balance ratios in the factor (labor and
capital) markets and of each type of product; two-sided balance of payments
for each pair of Regions; balance of savings (Households, the States) and their
investments in the industry of the Regions.
The conceptual description compared to the basic version of Globe1 [1], was
developed by describing the following variables using dynamic equations: tech-
nological coefficients of production functions for gross value added (GVA) of
all Regions’ Industries, Factors’ supply (labor and capital) by Households of
Regions.
Mathematical Model built based on its conceptual description was obtained
as a result of combining into one system the equations describing the first-order
conditions of all optimization problems of agents and other rules of activity
(behavior) of agents. Balance and auxiliary equations are also included in the
mathematical model.
All equations of the mathematical Model were loaded into the GAMS envi-
ronment [15] as part of the main module of the Model. The solution of the
calibrated system of equations of the Model (Model calculation) is performed
using software implemented in the GAMS integrated development environment
using the embedded PATH solver [16].
The developed Model describes the economy of the following 9 Regions: Kaza-
khstan, other members of the Eurasian Economic Union (Russia, Belarus, Arme-
nia, Kyrgyzstan), as well as the main trading partners of Kazakhstan (European
Union (as one country); the USA; China and the Rest of the world (as one coun-
try)).
The economy of each Region of the Model is described by the following 16 sectors that are the most significant for the economies of the EAEU countries: 1. Mining operations (except oil and gas) – ming; 2. Mining of crude oil and gas – crog; 3. Metalworking production and mechanical engineering – mepe; 4. Metallurgical production – mind; 5. Education, Health service, public administration – ehas; 6. Production and transmission of electricity, gas and hot water – pegw; 7. Manufacture of food products, beverages and tobacco – fpin; 8. Professional, scientific and technical activities – psta; 9. Other industries – otis; 10. Other services – oths; 11. Agriculture, forestry and fisheries – agff; 12. Construction industry – buil; 13. Production of textiles, clothing, leather and related products – mtal; 14. Financial service – fins; 15. Chemical and petrochemical industry – chpp; 16. Transport – tser.
The core of the Model’s database consists of the sets of matched social
account matrices (SAMs) of the Regions for each year under consideration (2004–
2023). The mentioned SAM sets for 2004, 2007, and 2011 were extracted using
a special converter from the GTAP database [17]. The required SAM sets for
the years 2005, 2006, 2008–2010 and 2012–2015 were calculated using the devel-
oped Algorithm 1 [18] based on the available statistical sources containing the
input–output tables (see, e.g., [19]) and indicators of mutual trade [20], using
the base ratios calculated with the help of the known SAMs for the most recent
year (2004, 2007 or 2011). For the forecast period (2016–2023), the developed
Algorithm 2 [18] was used to calculate these SAM sets based on the following
forecast indicators of the Regions provided by the IMF [21]: GDP, Total invest-
ment, Import volume, The volume of import of services, The volume of exports
of goods, The volume of exports of services, General government revenues, and
General government expenditure. In doing so, we used the baseline ratios calcu-
lated with the help of the obtained SAM for 2015.
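The specific Algorithms 1 and 2 used for these calculations are described in [18] and are not reproduced in this paper. Purely as an illustration of the general idea of making a matrix consistent with given totals, the following is a generic RAS-style (biproportional) balancing sketch in Python with toy numbers; it is a standard technique and not the authors' method:

```python
import numpy as np

def ras_balance(m, row_targets, col_targets, iters=500, tol=1e-10):
    """Generic RAS (biproportional) scaling: adjust a non-negative matrix so that
    its row and column sums match given targets (totals must be mutually consistent)."""
    m = np.array(m, dtype=float)
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(iters):
        row_sums = m.sum(axis=1)
        m = m * (row_targets / np.where(row_sums == 0, 1.0, row_sums))[:, None]
        col_sums = m.sum(axis=0)
        m = m * (col_targets / np.where(col_sums == 0, 1.0, col_sums))[None, :]
        if (np.abs(m.sum(axis=1) - row_targets).max() < tol
                and np.abs(m.sum(axis=0) - col_targets).max() < tol):
            break
    return m

# Toy 2x2 "SAM block" rescaled to hypothetical forecast totals.
base = [[4.0, 1.0], [2.0, 3.0]]
balanced = ras_balance(base, row_targets=[6.0, 4.0], col_targets=[7.0, 3.0])
print(balanced, balanced.sum(axis=1), balanced.sum(axis=0))
```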
The results of calculating the base scenario of the calibrated Model accurately
reproduce the statistical and forecast data to be used in building the SAM sets
indicated above.

3 Evaluation of the Fulfillment of the Conditions for Transferring the Results of Computational Experiments to Practice

3.1 Estimates of the Stability of Smooth Mappings Defined by the Model

The availability of the stability property of a smooth mapping f : A → B, which transfers the values of the exogenous parameters p ∈ A into the solutions (the values of endogenous variables), implies the preservation of the qualitative properties of such a mapping under its small (see [4]) changes.
number of theorems that give sufficient conditions for such stability in cases of
immersion, submersion, and submersion with a fold. For a numerical evaluation
of the fulfillment of these theorems’ conditions, the authors proposed a corre-
sponding set of numerical algorithms [6]. The composition of this set includes
the following algorithms.

(1) Algorithm 1 for estimating the set S(f ) of singular points of the mapping
f in parallelepiped A. This algorithm is based on dividing parallelepiped A
into small elementary parallelepipeds and evaluating the signs of the max-
imal order minors of the Jacobi matrix of the mapping f at all vertices of
elementary parallelepipeds. In cases where dim A ≥ dim B and when the
set S(f ) is estimated as empty, this mapping f is evaluated as a stable
submersion.
(2) Algorithm 2 for estimating the injectiveness of the mapping f . This algo-
rithm is used in the case when dim A < dim B and when the set S(f ) is
estimated as empty. If the application of Algorithm 2 evaluates the immer-
sion f as an injective mapping, then the mapping f is evaluated in A as
stable.
(3) Algorithm 3 for estimating the mapping f as a submersion with a fold.
This algorithm is used in the case when dim A ≥ dim B and when the set
S(f ) is estimated as non-empty. Algorithm 3 is based on the verification of
compliance with the following:
– conditions rank(f ) = dim B − 1 for all singular points of the mapping f ;
– transversality conditions for a 1-stream of the mapping f and a subset
consisting of 1-streams of corank 1 in the space of all 1-streams of smooth
mappings from A to B;
– conditions that the dimension of the sum of the tangent space to the
manifold S(f ) and the kernel of the tangent mapping df at all points
S(f ) coincides with dim B.
(4) Algorithm 4 for estimating the mapping f as a stable submersion with a
fold. This algorithm is used in the case when dim A ≥ dim B and when the
set S(f ) is estimated as a fold. Algorithm 4 is based on an estimate of the
injectiveness of the mapping f restriction to its fold S(f ).

The above-mentioned algorithms were implemented as a software module in the GAMS environment, which makes it possible to evaluate the stability of the mappings f determined by the estimated Model for various scenarios (including the optimal ones, discussed further in Sect. 4.2).
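Purely for illustration, a minimal Python sketch of the idea behind Algorithm 1 follows. It checks the rank of a finite-difference Jacobian on a grid rather than the signs of maximal-order minors; the mapping f and the box A are toy stand-ins, not the calibrated Model or the actual GAMS module:

```python
import itertools
import numpy as np

def jacobian(f, p, h=1e-6):
    """Central finite-difference Jacobian of f at the point p."""
    p = np.asarray(p, dtype=float)
    f0 = np.asarray(f(p), dtype=float)
    J = np.zeros((f0.size, p.size))
    for j in range(p.size):
        e = np.zeros_like(p)
        e[j] = h
        J[:, j] = (np.asarray(f(p + e)) - np.asarray(f(p - e))) / (2.0 * h)
    return J

def candidate_singular_points(f, lower, upper, n_per_dim=5, tol=1e-8):
    """Scan the box A = [lower, upper] on a regular grid and flag grid points
    where the Jacobian of f loses full rank (candidate singular points)."""
    grids = [np.linspace(lo, hi, n_per_dim) for lo, hi in zip(lower, upper)]
    flagged = []
    for point in itertools.product(*grids):
        J = jacobian(f, np.array(point))
        if np.linalg.matrix_rank(J, tol=tol) < min(J.shape):
            flagged.append(point)
    return flagged

# Toy 2 -> 2 mapping with a fold along y = 0: f(x, y) = (x, y^2).
f = lambda q: np.array([q[0], q[1] ** 2])
print(candidate_singular_points(f, lower=[-0.5, -0.5], upper=[0.5, 0.5]))
```

For this toy mapping the flagged points all lie on the line y = 0, i.e. on the fold; an empty result would correspond to the case where f is evaluated as a stable submersion or immersion.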
As an example, mappings of the form f : A → B with dim A = 5 and dim B = 9 were considered in the experiments for the baseline scenario, where the arguments of the mapping f were taken to be the value added tax rates of the five EAEU countries (Kazakhstan, Russia, Belarus, Armenia, Kyrgyzstan) for 2015, and the output variables of the mapping f were the GDP values of all 9 Regions of the Model for 2023. The boundaries of the five-dimensional box A, centered at the point p = (p1, . . . , p5) corresponding to the baseline values of the specified tax rates, are at a distance of 0.5pi from the values pi.
that the time to calculate the implemented algorithms for the mapping stability
estimation increases roughly exponentially with the increase in the dim A. This
limits the use of such an approach. Thus, to obtain a reasonable calculation
time, the set of the most important factors used in the solution based on the
Model of specific problems for macroeconomic analysis and parametric control
was selected.
The results of the specified numerical experiments demonstrated the absence
of singular points of the mapping f in the A box and the stability of this immer-
sion.

3.2 Estimates of the Stability Indicators of Smooth Mappings Defined by the Model
The stability indicator βf(p, α) defined by the Model mapping f : A → Bt at a point p ∈ A and for a selected positive α is the diameter of the image (under the mapping f) of the ball of radius α centered at the point p (in relative terms). If the numerical assessment of the value βf(p) = limα→0 βf(p, α) is uniformly close to zero for all p ∈ A, then the mapping f defined by the tested model is assessed on the set A as being continuously dependent on the exogenous values [6,7]. In the Model experiments, the set A was taken to be a parallelepiped centered at the point p corresponding to the baseline values of all the tax rates in all the Regions for the year 2016, whereas the sets Bt of endogenous variables consisted of the GDP, exports and imports of all the Regions of the Model for the fixed computational year t (from 2016 to 2023).
Table 1 shows the calculated values of the Model stability indicator βf(p, 0.01) (in percent) for the base point p and α = 0.01.

Table 1. Values of the stability indicators for the basic calculation of the model.

Year            2016    2017    2018    2019    2020–2023
βf(p, 0.01), %  0.7652  0.2496  0.0259  0.0033  0.0000

The specified estimates of the stability indicators in the Table 1 indicate that
the stability of the Model (in the sense of the considered stability indicators) in
the calculations up to 2023 is sufficiently high.
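For illustration only, a small Python sketch of how an estimate of βf(p, α) could be obtained by sampling: points on the sphere of radius α around p are mapped through f and the diameter of the image is taken relative to the norm of f(p). Here f, p, α and the normalization are hypothetical stand-ins, not the calibrated Model or the exact definition used in [6,7]:

```python
import numpy as np

def beta_indicator(f, p, alpha, n_samples=200, seed=0):
    """Monte-Carlo estimate of a stability indicator beta_f(p, alpha):
    relative diameter of the image of a ball of radius alpha around p."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    # Sample directions uniformly on the unit sphere, then step out by alpha.
    directions = rng.normal(size=(n_samples, p.size))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    images = np.array([f(p + alpha * d) for d in directions])
    # Diameter of the image set, relative to the norm of f(p).
    diffs = images[:, None, :] - images[None, :, :]
    diameter = np.linalg.norm(diffs, axis=-1).max()
    return diameter / np.linalg.norm(f(p))

# Toy mapping: tax-rate-like inputs to GDP-like outputs (illustrative only).
f = lambda x: np.array([np.exp(0.1 * x).sum(), x @ x + 1.0])
print(beta_indicator(f, p=np.array([0.12, 0.18, 0.20]), alpha=0.01))
```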

3.3 Implementation of the Counterfactual and Forecast Scenarios

According to well-known macroeconomic theory, a reduction of the taxes levied on producers and consumers, as well as increased state demand for consumer products, increases the country's output and GDP. Here, counterfactual and forecast scenarios were calculated based on the Model to assess whether this provision of the theory is reproduced. Specifically, a scenario was performed featuring a 10% decrease in the effective rates of value added tax and of the tax on producers' income, and a 10% increase in government consumption in each EAEU country. The results demonstrated changes in the GVA of each sector in the relevant country (ranging from −3.85% to 6.16%) and increases in GDP in each Region (ranging from 0.0279% in 2009 for the global economy to 0.7715% in 2012 for the EAEU), compared with the observed data.
The above results of the three test methods demonstrate the successful veri-
fication of the tested Model.

4 The Solution of the Problem of Structural Adjustment

Within the framework of developing optimal fiscal policy measures for structural adjustment at the sectoral level and economic growth of the EAEU countries, and based on the Model, the following steps were proposed:
1. select a set of promising industries for each EAEU country for which it is
desirable to have an outstripping growth of output; and
2. solve a set of dynamic optimization problems aimed at economic growth
and accelerated growth of output of each of the selected promising sectors in the
EAEU countries and in the EAEU as a whole.

4.1 Selection of the Promising Industries

The marginal cost of public funds for taxes from a given industry (MCFr,i) over the forecast period (see also [10]) is proposed as an indicator characterizing the prospects of each i-th industry of country r. In this paper, the change in country r's GDP resulting from an increase in tax collections from the i-th industry by 1 monetary unit is adopted as MCFr,i. This indicator characterizes the significance of the industry: a positive value means that increasing its taxation leads to an increase in the country's GDP. The results of the calculation of the MCFr,i indicator are shown in Table 2.

Table 2. MCFr,i indicators for 2016–2023.

Industry i Country r
1 Kazakhstan 2 Russia 3 Belarus 4 Armenia 5 Kyrgyzstan
1 ming 0.40 0.26 0.37 1.71 0.54
2 crog 0.51 0.08 −1.50 2.27 −1.01
3 mepe −0.58 0.23 0.10 1.39 0.52
4 mind 0.92 0.41 1.01 1.82 0.89
5 ehas 0.32 0.63 0.27 0.75 0.83
6 pegw −0.33 −0.27 0.33 0.32 −0.02
7 fpin −3.14 −1.26 −0.60 −0.66 −0.89
8 psta −0.24 −0.00 1.17 1.33 0.31
9 otis −1.41 −0.06 0.17 0.62 −0.47
10 oths 0.27 −0.34 −0.19 0.29 0.22
11 agff −1.97 −0.77 −0.51 −0.66 0.20
12 buil 0.62 0.89 0.67 0.48 1.02
13 mtal −0.92 −1.03 0.53 0.75 0.37
14 fins −0.22 −0.23 0.16 0.88 0.07
15 chpp −0.85 −0.14 1.39 1.25 0.94
16 tser 0.40 −0.20 1.00 2.10 1.39

Based on the analysis of the values shown in Table 2, the sets of industries whose MCFr,i indicators are not less than 0.4 have been identified for each country r of the EAEU (shown in bold). The set of numbers i of the selected promising sectors of country r is denoted by Ir: I1 = {1, 2, 4, 12, 16}, I2 = {4, 5, 12}, I3 = {4, 8, 12, 13, 15, 16}, I4 = {1–5, 8, 9, 12–16}, I5 = {1, 3–5, 12, 15, 16}.
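A minimal Python sketch of this selection rule follows (the MCF values below are copied from a few rows of Table 2 purely for illustration; an industry is taken as promising for a country when its MCF is not less than 0.4):

```python
# Threshold-based selection of promising industries, as described above.
THRESHOLD = 0.4

# A small excerpt of Table 2 (country -> {industry code: MCF value}).
mcf = {
    "Kazakhstan": {"ming": 0.40, "crog": 0.51, "mind": 0.92, "buil": 0.62, "tser": 0.40, "fpin": -3.14},
    "Russia":     {"mind": 0.41, "ehas": 0.63, "buil": 0.89, "ming": 0.26, "fpin": -1.26},
}

def promising_industries(mcf_by_country, threshold=THRESHOLD):
    """Return, for each country, the industries whose MCF is not less than the threshold."""
    return {
        country: sorted(code for code, value in industries.items() if value >= threshold)
        for country, industries in mcf_by_country.items()
    }

print(promising_industries(mcf))
# Kazakhstan -> ['buil', 'crog', 'mind', 'ming', 'tser'], i.e. the set I1 by sector code;
# Russia -> ['buil', 'ehas', 'mind'], i.e. the set I2 by sector code.
```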

4.2 Setting and Solving the Parametric Control Problem

The set of problems of dynamic optimization (i.e., the problem of SPr parametric
control) was considered in solving the problem of economic growth of the EAEU
countries and accelerated growth in the output of selected industries in these
countries in 2016–2023 via fiscal policy measures. Here r = 1, . . . , 5 corresponds
to such a problem at the level of one country r; r = 0 corresponds to the problem
of developing a coordinated policy at the level of all EAEU countries. We give
an informal formulation of this problem.
Statement of the SPr problem. The problem deals with the identification of
the values of the control parameters ur (t) based on the Model (effective tax rates
on producer’s income, sales tax and customs duties differentiated by product and
industry; the share of government expenditures for consumption) that provide
the maximum value of the Kr criterion (1)–(2) with constraints on the control
instruments of the type ur(t) ∈ Ur(t). Here t = 2016, . . . , 2023, and Ur(t) is a parallelepiped centered at the point of base values ur(t) with boundaries spaced at ±10% of the baseline values.
For the problems SPr , (r ∈ {1, . . . , 5}) the control parameters ur (t) are the
specified instruments of the state policy of only the r-th country, and for the
problem SP0 the control parameters are the indicated instruments of the state
policy of all five EAEU countries.
In each of the problems SPr it is supposed to conduct an optimal fiscal policy aimed at economic growth and the growth of output of the selected promising industries in 2016–2023 (in country r for the case r ∈ {1, . . . , 5}, or in all EAEU countries for the case r = 0). Therefore, in these problems, the criterion Kr is written in the following form:


K_r = \sum_{t=2016}^{2023} \left( \mathrm{TQVA}_r(t) + \sum_{i \in I_r} \alpha_{r,i} \, \mathrm{TQX}_{r,i}(t) \right), \quad r \in \{1, \ldots, 5\};   (1)

K_0 = \sum_{r=1}^{5} K_r,   (2)

where TQVA_r(t) is the GDP per capita rate of country r, TQX_{r,i}(t) is the per capita rate of output of industry i in country r in year t, and α_{r,i} = 0.1 is a weighting coefficient.
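A small Python sketch of how criterion (1)–(2) could be evaluated from model output follows; the arrays of per capita GDP and output rates are hypothetical placeholders for the Model results, with α_{r,i} = 0.1 as in the text:

```python
import numpy as np

YEARS = range(2016, 2024)   # t = 2016, ..., 2023
ALPHA = 0.1                 # weighting coefficient alpha_{r,i} from the text

def criterion_Kr(tqva_r, tqx_r, promising):
    """Criterion (1) for one country r.

    tqva_r    : dict year -> GDP per capita rate TQVA_r(t)
    tqx_r     : dict (year, industry) -> per capita output rate TQX_{r,i}(t)
    promising : set of promising industry codes I_r
    """
    return sum(
        tqva_r[t] + sum(ALPHA * tqx_r[(t, i)] for i in promising)
        for t in YEARS
    )

def criterion_K0(per_country):
    """Criterion (2): the sum of K_r over the countries (the five EAEU countries in the paper)."""
    return sum(criterion_Kr(*args) for args in per_country)

# Hypothetical illustration for a single country with two promising industries.
rng = np.random.default_rng(1)
tqva = {t: 1.0 + 0.01 * rng.standard_normal() for t in YEARS}
tqx = {(t, i): 1.0 + 0.02 * rng.standard_normal() for t in YEARS for i in ("ming", "buil")}
print(criterion_Kr(tqva, tqx, {"ming", "buil"}))
print(criterion_K0([(tqva, tqx, {"ming", "buil"})]))
```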
The formulated problems SPr were solved numerically by using the solver
NLPEC [22]. The results of solving these problems in the form of increments
of the average value of GDP for 2016–2023 (in percentage compared with the
baseline scenario) are presented in Table 3.
Table 3. The increments of the average value of the GDP of the Regions as a result
of solving problems SPr .

Problem Country
1 Kazakhstan 2 Russia 3 Belarus 4 Armenia 5 Kyrgyzstan
SP1 1.76 0.00 0.00 0.00 −0.01
SP2 −0.01 2.76 0.00 −0.06 −0.03
SP3 0.00 0.00 3.18 0.00 0.00
SP4 0.00 0.00 0.00 1.52 0.00
SP5 0.00 0.00 0.00 0.00 1.94
SP0 3.00 2.14 5.86 4.50 9.27

Table 4. The increments of the average value of the output of the industries of the EAEU countries as a result of solving problems SPr.

Industry i Country r
1 Kazakhstan 2 Russia 3 Belarus 4 Armenia 5 Kyrgyzstan
1 ming 3.27 2.19 3.61 2.53 9.10
2 crog 2.52 0.98 0.00 0.73 0.00
3 mepe 1.56 0.03 0.80 0.63 4.25
4 mind 1.92 1.58 0.45 2.73 10.86
5 ehas 5.40 12.74 22.39 9.97 4.82
6 pegw 0.00 1.32 0.00 1.24 3.33
7 fpin 1.54 0.97 1.69 2.88 5.03
8 psta 1.93 5.09 11.99 7.90 0.00
9 otis 1.74 0.91 0.00 1.41 3.51
10 oths 0.00 0.00 0.00 0.25 5.61
11 agff 1.85 0.00 0.00 1.37 0.00
12 buil 1.94 2.16 2.71 2.82 12.96
13 mtal 0.00 4.67 0.86 3.01 2.55
14 fins 2.22 3.67 0.66 0.38 0.00
15 chpp 3.18 1.59 0.96 1.75 5.17
16 tser 11.35 0.00 0.00 1.50 6.02

Table 4 presents the increments of the average value of the output of the
industries of the EAEU countries for 2016–2023, obtained as a result of solving
the problem SPr in percentage compared to the baseline scenario. These values
for promising industries are shown in bold.
Scenario variants of the Model for the obtained optimal values of the instru-
ments were tested by three methods specified in Sect. 3. In all cases, the calcu-
lation results demonstrated:
– the absence of singular points of the considered mappings in the corresponding domains of their definition and the stability of these mappings;
– permissible values of estimates of the stability indicators of the mappings;
– compliance of the results of the forecast scenarios for 2016–2023 with the main provisions of macroeconomic theory.
The analysis of the values in the columns of Table 3 shows that, within the problems SPr, the parametric control approach at the level of all five EAEU countries (problem SP0) gives greater effects for each individual EAEU country than parametric control at the level of that country alone (problems SPr, r ∈ {1, . . . , 5}).
The results presented in this paper show the high potential of the parametric
control approach for assessing the stability of the mappings given by the Model
and developing recommendations for coordinated optimal state economic policy
at the level of the countries of the regional economic union.

References
1. GLOBE 1, http://www.cgemod.org.uk/globe1.html. Accessed 09 Nov 2018
2. GTAP Models Home, https://www.gtap.agecon.purdue.edu/models/default.asp.
Accessed 09 Nov 2018
3. Gill, P., Murray, W., Wright, M.: Practical Optimization. Academic Press, London
(1981)
4. Golubitsky, M., Guillemin, V.: Stable Mappings and Their Singularities. Springer-
Verlag, New York (1973)
5. Arnold, V.I.: Geometrical Methods in the Theory of Ordinary Differential Equa-
tions. Springer-Verlag, New York (1988)
6. Ashimov, A., Adilov, Zh, Alshanov, R., Borovskiy, Yu., Sultanov, B.: The theory of
parametric control of macroeconomic systems and its applications (I). Adv. Syst.
Sci. Appl. 1(14), 1–21 (2014)
7. Orlov, A.I.: Econometrics. Ekzamen, Moscow (2002). [in Russian]
8. Abouharb, R., Duchesne, E.: World bank structural adjustment programs and their
impact on economic growth: a selection corrected analysis. In: The 4th Annual
Conference on the Political Economy of International Organizations (2011)
9. Zografakis, S., Sarris, A.: A multisectoral computable general equilibrium model
for the analysis of the distributional consequences of the economic crisis in Greece.
In: 14th Conference on Research on Economic Theory and Econometrics (2015)
10. Devarajan, S., Robinson, S.: The impact of computable general equilibrium models
on policy. In: Conference on Frontiers in Applied General Equilibrium Modeling
(2002)
11. Khan, H.A.: Using macroeconomic computable general equilibrium models for
assessing poverty impact of structural adjustment policies. In: ADB Institute Dis-
cussion Paper, vol. 12 (2004)
12. Shishido, S., Nakamura, O.: Induced technical progress and structural adjustment:
a multi-sectoral model approach to Japan’s growth alternatives. J. Appl. Input-
Output Anal. 1(1), 1–23 (1992)
13. Naastepad, C.W.M.: Effective supply failures and structural adjustment: a real-
financial model with reference to India. Camb. J. Econ. 26(5), 637–657 (2002)
14. Huang, H., Ju, J., Yue, V.Z.: A Unified model of structural adjustments and inter-
national trade: theory and evidence from China. In: Meeting Papers from Society
for Economic Dynamics, vol. 859 (2013)
15. GAMS Homepage, http://www.gams.com. Accessed 09 Nov 2018
16. Ferris, M., Munson, T.: PATH 4.7, http://www.gams.com/latest/docs/S_PATH.
html. Accessed 09 Nov 2018
17. GTAP Data Base Homepage, http://www.gtap.agecon.purdue.edu/databases/
default.asp. Accessed 09 Nov 2018
18. Ashimov, A., Borovskiy, Yu., Novikov, D., Sultanov, B.: Macroeconomic analysis
and parametrical control of the regional economic union. URSS, Moscow (2018).
[in Russian]
19. World Input-Output Database Homepage, http://www.wiod.org/home. Accessed
09 Nov 2018
20. World Integrated Trade Solution Homepage, http://wits.worldbank.org. Accessed
09 Nov 2018
21. World Economic Outlook Databases Homepage, http://www.imf.org/external/ns/
cs.aspx?id=28. Accessed 09 Nov 2018
22. NLPEC, https://www.gams.com/latest/docs/S_NLPEC.html. Accessed 09 Nov
2018
Research of the Relationship Between Business
Processes in Production and Logistics Based
on Local Models

Raissa Uskenbayeva, Kuandykov Abu, Rakhmetulayeva Sabina(&), and Bolshibayeva Aigerim

International Information Technology University, Almaty 050054, Kazakhstan
ssrakhmetulayeva@gmail.com

Abstract. It is impossible to fully display all the properties of a business process with one monolithic model. This article proposes a basic model that is integrated from so-called local models. The purposes and essence of the functions of local models (LM) and the variants of organizing the basic (general) model (GM) from local ones are disclosed.

Keywords: Business process · Business process basic (general) model · Business process local model

1 Introduction

In all economic, industrial and technological spheres (or processes), business processes are the main objects, uniting everything that is relevant to achieving the goal. Manufacturing and logistic processes are specific business processes. They focus on the routing of materials and the allocation of work to resources [1]. The business process can be perceived as production and the factors of production [2].
There are many models of business processes that do not sufficiently reflect the
properties of a business process and the needs of a person in a business process. In
other words, all kinds of models are functionally incomplete.
Business process analysis becomes extremely important for production and logis-
tics systems, since it plays a vital role in successfully improving business processes.
The purpose of process analysis is to discover new knowledge to solve problems and
streamline processes to create key competencies. A large amount of research and
development has been carried out to optimize the performance of business processes in
this complex and dynamic environment [4–6]. For the analysis and optimization of
business processes in the field of production and logistics, several methodologies,
methods and tools have been developed.
In this regard, the paper [1] argued that the basic business process model (GM or
MBP) is created by the composition of local models (LM). Thus, the properties of the
whole business process can be reflected and transmitted using combinations of local
models.
When organizing local models into a combination, not only the union of functions is important, but also the organization of the structure, in other words, the nature of
the relation between the models and the type of protocol (universal or unique). Also important are questions of how the integration takes place, especially such features of the organization as the type of technologies for integrating data, information, knowledge and rules, services or agents, and the type of tool and interface technologies.
Standard integration methods and tools can be used for the integration, for example ESB, EAI, EII and ETL, where ESB stands for "enterprise service bus", EAI for enterprise application integration, EII for enterprise information integration, and ETL for extract, transform and load (software for extracting, transforming and loading data).
Thus, the basic model is a composition of local models. At the same time, the
organization of local models in basic models depends on the characteristics of local
models and the environment, also on the characteristics of business processes and the
problem being solved in a business process.
The basic model has such properties as mono-targeting, hierarchy or peer-to-peer structure, multidimensionality, semantic operations, non-linearity, fractality and facets. These properties arise in the composition, by integration (based on the integration bus) and aggregation of local models into a common integrated one.
Before outlining the main idea of the work, we first introduce and give a number of
concepts.
Definition 1. In the model representation of a business process it is a priori assumed that the business process is an object of the external (real or virtual) world which performs the function and role of a labor tool, executing tasks that are assigned to it or that come from a production plan (operational calendar or schedule).
Definition 2. A complete business process is a process with all of its components, such as:
• the schema logic (metamodel) of performing business process operations, built from abstract classes;
• the business process infrastructure, including various types of work tools, among them business process automation systems.

Definition 3. Technological and model basics of business processes.
The construction, operation and maintenance of complex processes without a model (i.e., without MDA) is a difficult task. Therefore, it is important to conduct the building of business processes on the basis of a model, as a factor regulating the process of building and maintaining the operational processes of business processes. Moreover, because of the multidimensional nature of business processes, many models are required, but they are interconnected according to certain attributes. The technological and modeling basics of logistic processes have the following features:
– the business process acts as an object of the outside world, therefore we accept the business process as an object of the outside world; like any object involved in human activity, it must be observable;
– the business process plays a unifying role: it is a tool that unites the means of labor and the method of performing the task with the object of labor;
– during the execution of production tasks, the business process is used to perform
production tasks (or plan) as an instrument of labor that links the plan and opera-
tional management in the production environment, i.e. process of execution of the
plan. This shows that the business process itself as a method or technology of labor
must be observable and manageable;
– automation system refers to the infrastructure of a business process as one of its
components.

Definition 4. As a production (production and technological) link and an instrument of labor for fulfilling a production task from the plan, the business process must accept and process the object of labor, carry out processing operations on the object of labor and have various means of labor.
All these features of the business process should be reflected in the model. Thus, the
business process is a complex object. Therefore, due to its complexity, no model can
fully reflect its properties. But each local BP model reflects only certain aspects of BP
and their properties.

2 The Purpose of Creating a Basic Business Process Model

The business process includes many objects or subjects, many special processes,
subjects and means of labor, and also includes the methodology and technology and the
responsible executives for the implementation of the business process. Thus, the
business process has a complex structure and composition, i.e. architecture and com-
plex components of this architecture.
Therefore, the presence of the model makes it possible to organize and accelerate the building both of the components of a business process and of the whole business process whose establishment is planned.
The resulting model of a complex business process will allow:
– to establish and disclose the composition, structure and architecture of a complex
business process of the selected class,
– build optimal business process models;
– automate complex business process;
– to operate and manage a complex business process. The peculiarity of the basic model of a complex business process is that it represents the complex business process as a variety of special-purpose processes, i.e. special processes, each of which is described by the conceptual, logical and procedural models of the basic model of a complex business process.
In addition, it serves as the basis for all phases of the life cycle of a business process
and automation system, i.e. the model should support project processes: from the pre-
project stage to the decommissioning (or inheritance) of both the business process and
the automation system.
Thus, the model is a supporting tool for creating a business process and systems to
ensure that the general requirements that are imposed on the business process are met [1].
Hence the formulation of the problem on the process of building business processes (see the sketch after this list):
• accelerate the building of the business process and the creation of systems in time: Tв → min or Σi Tв(Opi) → min;
• improve the quality indicators of the business process and of the created systems: Kп → max or Σi Kп(Opi) → max;
• reduce labor efforts: Tp → min or Σi Tp(Opi) → min;
where Tв and Tв(Opi) are time indicators (the total time spent on creating the business process and the time it takes to complete each operation Opi), and Kп and Kп(Opi) are quality indicators (common for the business process and/or system and for the performance of each operation Opi).
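As a small illustration (with hypothetical operations, numbers, and Latin names T_v, K_p, T_r standing in for Tв, Kп, Tp) of how these aggregate indicators could be computed for a concrete set of operations Opi:

```python
# Hypothetical per-operation indicators for building a business process:
# build time T_v(Op_i), quality K_p(Op_i) and labor effort T_r(Op_i).
operations = {
    "define roles and functions": {"build_time": 5.0, "quality": 0.80, "labor": 3.0},
    "bind roles to functions":    {"build_time": 2.0, "quality": 0.90, "labor": 1.5},
    "order of execution":         {"build_time": 4.0, "quality": 0.70, "labor": 2.0},
    "add events and documents":   {"build_time": 3.0, "quality": 0.85, "labor": 2.5},
}

total_build_time = sum(op["build_time"] for op in operations.values())  # sum_i T_v(Op_i) -> min
total_quality    = sum(op["quality"] for op in operations.values())     # sum_i K_p(Op_i) -> max
total_labor      = sum(op["labor"] for op in operations.values())       # sum_i T_r(Op_i) -> min

print(total_build_time, total_quality, total_labor)
```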
The business process model must satisfy these requirements. On the other hand, it is very difficult to build a universal model for all industries. Therefore, this paper discusses the construction of a model for a class of business processes in a specific area, namely for the logistics area LP, which should ensure the creation of logistics business processes and an automation system for a given business process, for example for LPi ∈ LP, so that the following general requirements are met [2]:
• Creation of a model that provides descriptions and constructions of a wide class of business processes in the sector and of the automation system, i.e. KS → max, where KS is the number of business processes and automation systems.
• The list of implemented functions for each generated business process and system should be wide enough for the mission (KF → max), i.e. full functionality for each case of system creation, where KF is the number of implemented functions.
• The completeness level of each function should be sufficient for the mission (ZF → max) of completing the business process and system, where ZF is the level of completeness of each implemented function.

3 Local Business Process Models: Choice, Purpose and Operation

Let us define the purpose and function of each individual local model as follows. A business process is an object of the outside world, and any object of the external world is characterized by a conceptual representation, i.e. the place this object occupies in the "world of things" among other objects, a set of distinctive properties, and the nature of its communication with other objects of the external world. Therefore, the business process, as an object of the outside world, should be characterized by a concept, i.e. a conceptual representation.
And, as is well known, the conceptual features of an object (that is, a business
process) must be presented in the form of a special model - a conceptual model.
Note that the object is an element of a united information space (UIS), hence the
conceptual model of the business process is an element of the UIS.
A business process, as an object of the outside world, must be represented by a
conceptual model (CM or CMBP is a conceptual model of a business process).
It should be noted that an object is conceptually represented separately for each purpose. The business process is intended for production and is a managed object. Therefore, the CM of a business process is characterized by its mission, targets or purpose and criterion, and the input and output (result) data.
• In addition, the composition of the input and output depends on what for (for what purpose) we build the CM of the BP. It should be noted that the CM is built to solve the problem of integration. Therefore, for us, the CM needs to ensure the integration of the logistics business process with the business processes of other organizations, for example, at the top level with the partners of the logistic processes (suppliers and consumers of goods, machines and equipment);
• at the lower level, with the business processes of other, for example neighboring, local problem areas.
Thus, our business process must be able to integrate with the business processes of
other organizations. Accordingly, the inputs and outputs of the CM at the level of the
logistics business process should be harmonized with the business processes of other
organizations [6]. And for integration, the following data is needed:
• the internal structure of the logistics sector, local problems of the region, its com-
position, capacity;
• objects of labor, source and flow of goods, types of goods;
• means of labor, which means of transporting goods between warehouses and cus-
tomers, transportation of goods within the warehouse;
• what outsourcing operations are available, etc.
The field of logistics consists of two levels: the general problem area of logistics and the local problem areas which constitute the general problem area, while each of them has its own surroundings and environment.

3.1 External Conceptual Model (CM1)


CM1 serves as a data/information transfer tool for processing procedures; it sets or displays the characteristics of a business process needed for integration with external business processes and the links of the logistics business process, showing the goals of the given supersystem at the macro level and information about its business process of the common problem area as an element of the united information space (UIS).
In addition, CM1 should contain information for external organizations on the internal structure of the logistics business process: a list of the local problem areas and a metamodel of the various integrations for different classes of business processes.
3.2 Internal Conceptual Model (CM2)


The CM2 of a business process of a local problem area of logistics will be written in the form:

CM2 = {CM2j : j = 1, …, m}  (3.1)

where for the j-th local problem area the following information is set (see the sketch after this list):
(1) a list of specialized processes included in the created business process of the local
problem area.
(2) the number and types of specialized processes dependent on the problem being
solved and on the characteristics of the business process itself and its specialized
processes.
(3) on the specialized processes of a given business process in a local problem area,
(4) as well as the metamodel (descriptions) of the integration of specialized processes
within a business process for a specific purpose within the problem area.
(5) input and output data characterizing this business process as an element of a single
process information space (UIS) of the second level.
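A minimal Python sketch follows of how CM2 = {CM2j} and the information items (1)–(5) listed above could be represented as data; all area names, process lists and data items are hypothetical examples:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical container for the internal conceptual model CM2 = {CM2_j}:
# one record per local problem area of logistics.

@dataclass
class LocalConceptualModel:                 # CM2_j
    area: str                               # local problem area (e.g. "receiving goods")
    specialized_processes: List[str]        # (1) list of specialized processes; (2)-(3) their
                                            #     number, types and descriptions follow from it
    integration_metamodel: Dict[str, List[str]] = field(default_factory=dict)  # (4)
    inputs: List[str] = field(default_factory=list)    # (5) input data of the area
    outputs: List[str] = field(default_factory=list)   # (5) output data of the area

cm2 = {
    "receiving goods": LocalConceptualModel(
        area="receiving goods",
        specialized_processes=["strategic", "administration", "organization",
                               "management", "technological"],
        integration_metamodel={"technological": ["administration", "management"]},
        inputs=["delivery note", "goods flow"],
        outputs=["stored goods", "receipt report"],
    ),
    "shipping": LocalConceptualModel(
        area="shipping",
        specialized_processes=["strategic", "organization", "technological"],
        inputs=["picking list"],
        outputs=["shipped goods"],
    ),
}
print(sorted(cm2))   # the set of local problem areas j = 1, ..., m
```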
The model is designed to automate business processes. Therefore, we construct a model for classes of business processes that are observable and controllable.
This is achieved by introducing the first phase of the strategic process, which is the beginning of the management process. Therefore, for the managerial level we introduce the strategic process as the specialized process from which we must begin.
It is generally accepted that a business process model is represented in one of two ways: either "as is" or "as it should be". In order to present the business process "as it should be" (and it should be observable and manageable), it is necessary to take the business process from the "as is" model representation and transfer it into the "as it should be" model representation.
This is achieved by reengineering. Business process reengineering (BPR) is defined as a "fundamental rethinking of processes with respect to all business metrics such as cost, speed, quality, and service". Although BPR is only the concept behind BPR programs, it has been one of the most widely applied (e.g. in about 70% of cases) [7, 8].
Business process reengineering must be carried out for a specific purpose: for automation, for monitoring and for managing the business process. As a result, the organization needs methods and the integration of knowledge management models to understand the environment, which includes processes, people, employees, customers and tools.
A managed version of a business process is obtained by introducing a control loop that performs a number of coordinating and controlling functions. These are realized from processes consisting of operators, for example processes (consisting of operations) of the strategic (decision), logical-operator and service kinds (strategic model, logical model, operator model, service model; C = {cijk}, k = 1, …, Kij), and this is done for each specialized process of the BP: analytics, administration, organization, management, technological processes, provision of resources, and services in the local problem area, for example B = {bij}, j = 1, …, Ji. Manageability is also enabled by the introduction of additional functions, and this is achieved by services.
Processes or operators are controls. And the means (control actions) of management
are specialized processes and additionally introduced operations.
Such a model is built for each local area that we have isolated from the logistic process: receiving goods, storing and picking goods, shipping, delivery, receiving at a temporary storage warehouse, etc.; A = {ai}, i = 1, …, I.
Thus, the process of reengineering can be explained as follows: it (the "as it should be" business process diagram) is carried out to increase productivity, to increase efficiency and to provide manageability. This means that it is necessary to set aside all existing structures related to the procedures, to invent new ways of completing the work, to carry it out and to complete it in record time. Reengineering is an updated business process that starts from assumptions and does not take everything for granted.
Therefore, in the business process model (i.e., in the base model) we introduce additional operations that are created from outside by the model developers themselves, i.e. they are not present in the BP itself. These are operations of strategies, logic, operators and services.
Thus, the manageability of a business process is gained by introducing a control loop, and first of all the strategic process of the control loop, which is the beginning of the control process. Therefore, to give the BP a manageable character, we introduce the concept of a strategic process.

3.3 The Strategic Model (SM)


The strategic model (SM) for all local problem areas is written as SM = {SMj : j = 1, …, m}, where j indexes the local problem areas. Here, just as with {CM2j}, we assume that the local problem areas include: receiving goods, storing and picking goods, shipping, delivery, receiving at a temporary storage warehouse, etc.
The strategic model constitutes a strategic level of business process management
which is designed to construct, before performing a business process task, the content
and structure of the business process based on the current situations that have arisen in
the production environment, i.e. in a business process environment.
In all local problem areas of logistics, the strategic model SM of the business process determines the variant of jointly performing the specialized processes, based on the current situations in production before the business process is executed. The variant of joint performance of the operations of the specialized processes may differ, for example:
• first, in the current situation St1, the administrative operation (or decisions) should recruit staff for the execution of all operations of the business process; then the organizational operations (for example, distribution of functions among the personnel for the entire BP) determine the organization of all technological operations, and the management operations establish and maintain management based on the resulting organizational structure and composition;
• in another current situation, St2, staff should be recruited for each operation just before it is performed, i.e. these actions are carried out separately for each operation; then the organizational operations (for example, distribution of functions among the personnel for the entire BP) determine the organization of the current technological operations, and management is conducted based on the resulting organizational structure and composition.

At time t, the production situation for a business process is determined as follows:

SP(t) = ⟨Z(t), Jb(t), Sl(t), EP(t), BP(t)⟩,    (3.2)

• SP(t) - production situation that arises before the execution of the business process,
• Z(t) - the purpose or goal of the business process,
• Jb(t) - the task assignment at the current time,
• Sl(t) - the subject of labor at the current time,
• EP(t) - factors and objects of the external environment that have a direct impact on
the implementation of the business process,
• BP(t) - the state of the business process, characterized by the values of the business
process indicators.
To make strategic decisions, production situations are divided into classes of situations:
If the current production situation satisfies SP(t) ∈ SP1, then the necessary list of specialized processes is selected (the necessary list of types of active specialized processes) that are required to perform the specified task by this business process, together with a priority for each of them. In the current production situation SP(t) ∈ SP2, the k-th variant of the specialized process is selected.
If SP(t) ∈ SP3, then for the selected (k-th) variant of a specialized process the set of operations is determined, together with the meta scheme for performing the sequence of operations, in which the allowed combinations of consecutive operations (Oph → Opk) reflect the current situation. An admissible combination of operations is established on the basis of the semantics of a relation, which is determined from the ontological model. The expression (Oph → Opk) has the following meaning: it is a valid combination of consecutive operations, where Oph is the h-th class of operations, Opk is the k-th class of operations, and → denotes the sequencing relation. This model plays the role of a scheduler that plans the execution of the business process for an upcoming order.
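As a minimal illustration of how such a dispatch on production situations might look, the following Python sketch is hypothetical: the class, the function names, the situation classifier and the catalogs are illustrative assumptions and not part of the model above; it only mirrors the three selection cases just described.

```python
from dataclasses import dataclass

@dataclass
class ProductionSituation:
    """SP(t) = <Z(t), Jb(t), Sl(t), EP(t), BP(t)>, as in (3.2)."""
    goal: str          # Z(t)  - purpose of the business process
    job: str           # Jb(t) - the task at the current time
    labor: str         # Sl(t) - the subject of labor
    environment: dict  # EP(t) - external factors and objects
    state: dict        # BP(t) - business process indicator values

def strategic_decision(sp, classify, catalog):
    """Dispatch on the situation class (SP1 / SP2 / SP3) described above."""
    cls = classify(sp)                        # assumed to return "SP1", "SP2" or "SP3"
    if cls == "SP1":                          # choose active specialized processes and priorities
        return {"processes": catalog["processes"], "priorities": catalog["priorities"]}
    if cls == "SP2":                          # choose the k-th variant of a specialized process
        return {"variant": catalog["variants"][0]}
    return {"operations": catalog["operations"],   # SP3: operations plus admissible meta scheme
            "scheme": catalog["meta_scheme"]}
```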

4 Logical Model of the J-th Specialized Process (LMJ)

Consider the purpose and principles of operation of the logical model (decision-making model) of the j-th separate specialized process from the stack of specialized processes of the business process of the local problem area, i.e. j = 1, ..., J.
The strategic model determines when and in what sequence processes are applied and executed. All of these methods constitute a process of organization and management. The purpose of the logical model is to define the sequence of business operations of each specialized process. Each business operation consists of two parts: an operator and a procedure. Therefore, the logical model contains two levels.
The top level selects the operators, i.e. a symbolic representation of a specialized process from the stack of specialized processes of the business process of the local problem area (for example, a technological or organizational process, or a process providing resources), and assembles these operators from operations. The choice is made on the basis of the current situation (i.e., given the initial situation of the problem area).
In other words, this level of the logical model is the operator model of a specialized process of the business process of the given local problem area.
For two production situations, the composition and sequence of operations within a single logical model may differ. For example, in the initial situations Stek(i) ∈ Sst and Stek(j) ∈ Sst, the logical model takes the following form.
In a production situation Stek(i) ∈ Sst:

Pr_i = ⟨Op_i1, Op_i2, Op_i3, Op_i4, ..., Op_it, Op_i,t+1, ..., Op_i,mi⟩,    (4.1)

In a production situation Stek(j) ∈ Sst:

Pr_j = ⟨Op_j1, Op_j2, Op_j3, Op_j4, ..., Op_jk, Op_j,k+1, ..., Op_j,mj⟩,    (4.2)

where
Pr_i, Pr_j are specialized processes of the business process BP;
m_i, m_j are the numbers of business operations in Pr_i, Pr_j, respectively; in the general case m_i ≠ m_j;
Op_it ∈ Pr_i, Op_jk ∈ Pr_j are operations in the specialized processes SPp_i, SPp_j of the business process BP;
Op_it(Stek(i)) (or [Op_it(Stek(i))] ∈ SPp_i) and Op_jk(Stek(j)) (or [Op_jk(Stek(j))] ∈ SPp_j) are the operations Op_it and Op_jk performed in the situations Stek(i) and Stek(j), where Stek(i) ∈ Sst and Stek(j) ∈ Sst.
The lower level performs the selection of procedures. Each operator corresponds to several procedures; therefore, based on the current situation, one of the procedures is selected for each operator.
Thus, the logical model of the chosen i-th specialized process sets the complete list of operations and their execution sequence based on the admissibility requirement, for the process chosen (by strategy) by the strategic model.

5 Conclusion

The authors of this work represent a business process as a formalized process in which all types of resources, performers and owners of all types of processes are necessary to achieve the ultimate goal of the process.
Each type of provision is achieved by separate processes, which are called specialized processes of the business process. Each specialized process is modeled by a separate model. To make the business process manageable, a control loop model is introduced, consisting of:

• a strategic model that will ensure the adoption and implementation of strategic
decisions on the order of implementation of specialized processes,
• a logical model that determines the sequence of execution of operators of spe-
cialized processes after strategic decision-making and its implementation,
• a service model, which is defined by the control functions in the form of services.

Acknowledgments. This work was supported by the Ministry of Education and Science of the Republic of Kazakhstan (Grant No. 0118PК01084, Digital transformation platform of National economy business processes BR05236517).

References
1. Van der Aalst, W.M.P.: On the automatic generation of workflow processes based on product
structures. Comput. Ind. 39(2), 97–111 (1999)
2. Völkner, P., Werners, B.: A decision support system for business process planning. Eur. J. Oper. Res. 125(3), 633–647 (2000)
3. Preparation of Papers in a Two-Column Format for the 2018.: In: 18th International
Conference on Control, Automation and Systems (ICCAS 2018)
4. Zhang, Y., Feng, S.C., Wang, X., Tian, W., Wu, R.: Object-oriented manufacturing resource
modelling for adaptive process planning. Int. J. Prod. Res. 37(18), 4179–4195 (1999)
5. Zhang, F., Zhang, Y.F., Nee, A.Y.C.: Using genetic algorithms in process planning for job
shop machining. IEEE Trans. Evol. Comput. 1(4), 278–289 (1997)
6. Duisebekova, K., Serbin, V., Ukubasova, G., Kebekpayeva, Z., Aigul, S., Rakhmetulayeva,
S., Shaikhanova, A., Duisebekov, T., Kozhamzharova, D.: Design and development of
automation system of business processes in educational activity. J. Eng. Appl. Sci. 8, 4702–
4714 (2017) (ISSN:86-949X, Medwell Journals)
7. Dabbas, R.M., Chen, H.-N.: Mining semiconductor manufacturing data for productivity improvement: an integrated relational database approach. Comput. Ind. 45(1), 29–44 (2001)
8. Musa, M.A., Othman, M.S., Al-Rahimi W.M.: Ontology driven knowledge map for
enhancing business process reengineering. J. Comput. Sci. Eng. 3(6), 11 (2013) (Academy &
Industry Research Collaboration Center (AIRCC))
9. Lila, R., Gunjan, M., Kweku-Muata, O.-B.: Building ontology based knowledge maps to
assist business process re-engineering. J. Decis. Support Syst. 52(3), 577–589 (2012)
Sparsity and Performance Enhanced
Markowitz Portfolios Using Second-Order
Cone Programming

Noam Goldberg1(B) and Ishy Zagdoun2


1
Department of Management, Bar-Ilan University, Ramat Gan, Israel
noam.goldberg@biu.ac.il
2
Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
ishy.zagdoun@earnix.com

Abstract. A mixed-integer second order cone program (MISOCP) formulation is proposed for solving Markowitz's asset portfolio construction problem under a cardinality constraint. Compared with a standard alternative big-M linearly constrained formulation, our reformulation is solved significantly faster using state-of-the-art integer programming solvers. We consider learning methods that are based on the MISOCP formulation: cardinality-constrained Markowitz (CCM) solves the MISOCP for a given cardinality k and training set data of asset returns. We also find reinforcing evidence for factor model theory in the selection of factors to form optimal CCM portfolios. For large datasets in the absence of a hard cardinality constraint, we propose a method (CCM-R) that is based on the continuous relaxation of our MISOCP, where k is selected by rolling time window validation. In predictive performance experiments, based on historical stock exchange data, our learning methods usually outperform a competing extension of the Markowitz model that penalizes the L1 norm of asset weights.

Keywords: SOCP · Markowitz · Perspective reformulation · Sparsity

1 Background and Motivation


Asset portfolio construction is a fundamental problem in finance. Markowitz's [10] model attempts to strike a balance between a portfolio's risk and its expected return. For n assets, let the asset index set N = {1, ..., n}. Also let μ ∈ Rn denote the vector of asset expected returns. Let C ∈ Rn×n denote a symmetric positive semidefinite matrix whose element cij is the covariance of returns of assets i, j ∈ N. Typically, both the vector μ and the matrix C are estimated using historical data. In particular, given asset returns data over t time periods, R ∈ Rt×n, μ = μ(R) ≡ RT1/t and C = C(R) ≡ (R − 1μ(R)T)T(R − 1μ(R)T)/t. Let x ∈ Rn be


the vector of decision variables; for each i ∈ N, component xi is the proportion


of capital invested in asset i. The classical Markowitz portfolio optimization
model [10] is to determine an x that minimizes the portfolio’s variance V (x) =
xT Cx while constraining the expected return r(x) = μT x to equal a given rate of
return ρ > 0. Brodie et al. [3], and DeMiguel et al. [4] report on the disappointing
performance of Markowitz’s model when applied to financial data. Regularization
in the form of different portfolio weight penalties has been applied to improve the performance of the basic Markowitz model. An L1-norm based penalty is applied to the vector of portfolio weights x by Brodie et al. [3], resulting in the quadratic programming (QP) formulation

    min_{x ∈ Rn}  V(x) + ζ‖x‖1   subject to  r(x) = ρ,  Σ_{i=1}^{n} xi = 1.        (1)

Note that this formulation can be rewritten as a smooth one by substituting for
each variable the difference of two nonnegative variables and then replacing ||·||1
in the objective by the sum of these variables. Another commonly used penalty
is the L2-norm. The use of an L2-norm based penalty appears to stabilize the inverse covariance matrix, which is often ill-conditioned, and is also found to reduce the estimation error of the covariance matrix [8]. We will consider a
discrete, also known as an “L0 -norm” constraint, where the quantity ||x||0 =
|{i ∈ N | xi = 0 }| is bounded by some given positive integer, together with an
L2 -norm penalty in the objective of our portfolio optimization formulations.
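The following is a minimal sketch of formulation (1), assuming the CVXPY modeling package purely for illustration; it is not the implementation of [3] (which follows a specialized LARS-type parametric path), and the variable names are illustrative.

```python
import cvxpy as cp

def l1_markowitz(C, mu, rho, zeta):
    """Formulation (1): min x'Cx + zeta*||x||_1  s.t.  mu'x = rho, sum(x) = 1.

    C: n x n covariance matrix (numpy array), mu: n-vector of expected returns.
    """
    n = len(mu)
    x = cp.Variable(n)
    objective = cp.Minimize(cp.quad_form(x, C) + zeta * cp.norm1(x))
    constraints = [mu @ x == rho, cp.sum(x) == 1]
    cp.Problem(objective, constraints).solve()
    return x.value
```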
Lobo et al. [9] consider portfolio selection with transaction costs. For the case of fixed-fee transaction costs they suggest using mixed-integer programming.
thresholds, diversification and round-lot constraints [2].
Closely related to Markowitz's model is the Sharpe ratio, which incorporates both the portfolio's expected return and its risk (as indicated by the standard deviation of its returns rather than the variance) into a single measure of performance [12]. If μf is the risk-free return, then the Sharpe ratio of a portfolio x is defined as (r(x) − μf)/√V(x), which attempts to measure how well the return of an asset compensates for its associated risk. Portfolio optimization models are also related to asset pricing theory and factor models. In particular, the capital asset pricing model (CAPM) lays the foundations for factor modeling with the market being the sole factor. Fama and French [5] extend this notion and show that more than 90% of a diversified portfolio's return variance can be explained by the three-factor model, consisting of the size factor ("small minus big"), the value factor ("high minus low") and the market factor.
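A small NumPy sketch of the estimators and of the Sharpe ratio defined above; the default risk-free return of zero is an assumption for illustration.

```python
import numpy as np

def estimate_mu_C(R):
    """Sample estimators from returns R (t x n): mu(R) = R'1/t, C(R) = D'D/t."""
    t = R.shape[0]
    mu = R.T @ np.ones(t) / t                  # mean return of each asset
    D = R - np.outer(np.ones(t), mu)           # R - 1 mu(R)'
    C = D.T @ D / t                            # covariance estimate
    return mu, C

def sharpe_ratio(x, mu, C, mu_f=0.0):
    """Sharpe ratio (r(x) - mu_f) / sqrt(V(x)) of a portfolio x."""
    return (mu @ x - mu_f) / np.sqrt(x @ C @ x)
```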

2 Formulations for Sparse Portfolios


Herein, we consider a simple cardinality constraint in order to promote sparsity.
For a given penalty parameter q, a bound on the number of active assets k > 0
and required return ρ, our starting point is a mixed integer quadratic formulation

    min_{x ∈ Rn, z ∈ {0,1}n}   V(x) + q xTx                          (2a)
    subject to                 r(x) = ρ                               (2b)
                               Σ_{i=1}^{n} xi = 1                     (2c)
                               Σ_{i=1}^{n} zi ≤ k                     (2d)
                               |xi| ≤ M zi,   i ∈ N.                  (2e)

Note that if short-sales are disallowed then x ≥ 0, and it would suffice to set M = 1 without excluding any of the optimal solutions. In general, an upper bound M ≥ max_{i∈N} |x*_i| over all optimal solutions x* may need to be even larger. The magnitude of M (relative to the smallest component of x*) directly affects the quality of the continuous relaxation of (2). We consider a nonlinear reformulation that replaces the big-M constraints (2e) using additional variables ui for each i ∈ N. The following formulation, equivalent to (2) with respect to the set of optimal solutions, is based on the perspective reformulation technique described in [1,6,7]:
    Z* = min_{x, u ∈ Rn, z ∈ {0,1}n}   V(x) + q Σ_{i=1}^{n} ui        (3a)
         subject to                    (2b) − (2d)                     (3b)
                                       xi² ≤ ui zi,   i ∈ N.           (3c)

Note that in every optimal solution (x̂, û, ẑ) of (3), if ẑi = 1 then x̂i² = ûi, and otherwise x̂i² = ûi = ẑi = 0. Also, this formulation is tighter than (2), as for each i ∈ N it replaces the upper bounding constant M by a presumably smaller nonnegative variable ui that is being minimized in the objective function. The constraints (3c) are nonconvex as written, but the set of solutions satisfying each of them corresponds to a (convex) rotated second-order cone and can be reformulated as a second-order cone (SOC) constraint (see [7] for more details).
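The rotated-cone constraints (3c) can be handed directly to a conic MIP solver. The sketch below is a hypothetical Python/gurobipy rendering of formulation (3) (the paper's own experiments use the R interface to Gurobi, and the eigenvalue tightening of the next section is omitted here); it is illustrative rather than the authors' implementation.

```python
import gurobipy as gp
from gurobipy import GRB

def ccm_misocp(C, mu, rho, k, q):
    """Perspective formulation (3); C is an n x n numpy array, mu an n-vector."""
    n = len(mu)
    m = gp.Model("ccm")
    x = m.addVars(n, lb=-GRB.INFINITY, name="x")
    u = m.addVars(n, lb=0.0, name="u")
    z = m.addVars(n, vtype=GRB.BINARY, name="z")
    m.addConstr(gp.quicksum(mu[i] * x[i] for i in range(n)) == rho)     # (2b)
    m.addConstr(x.sum() == 1)                                           # (2c)
    m.addConstr(z.sum() <= k)                                           # (2d)
    for i in range(n):
        m.addQConstr(x[i] * x[i] <= u[i] * z[i])                        # (3c), rotated SOC
    risk = gp.quicksum(C[i, j] * x[i] * x[j] for i in range(n) for j in range(n))
    m.setObjective(risk + q * u.sum(), GRB.MINIMIZE)                    # V(x) + q * sum(u)
    m.optimize()
    return [x[i].X for i in range(n)]
```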

3 Tightening the Perspective Relaxation

We now suggest a scheme to further tighten the formulation (3), when the min-
imum eigenvalue of C, λmin satisfies λmin > 0 (i.e. C is positive definite). Given
λmin > 0 consider the tightened formulation
    Ẑ* ≡ min_{x, u, z}   xT(C − λmin I)x + (q + λmin) Σ_{i=1}^{n} ui   (4a)
         subject to      (3b) − (3c).                                   (4b)

Proposition 1. The optimal value of (3) equals the optimal value of (4), that
is Z ∗ = Ẑ ∗ .

Proof. Let (x*, u*, z*) be an optimal solution of (3), and (x̂, û, ẑ) an optimal solution of (4). For each i ∈ N note that x̂i² = ûi: otherwise, if (x̂, û, ẑ) is optimal for (4) with x̂i² < ûi for some i ∈ N (by feasibility of (3c), xi² ≤ ui), then for 0 < ε < ûi − x̂i² define ūj = ûj − ε for j = i and ūj = ûj otherwise, and observe that (x̂, ū, ẑ) is feasible (because ui appears only in constraint (3c), which is inactive for this i, and since zi ∈ {0, 1}, x̂i² < ūi and accordingly x̂i² ≤ ẑi ūi). Further, x̂T(C − λmin I)x̂ + (q + λmin) Σ_{i=1}^{n} ūi < x̂T(C − λmin I)x̂ + (q + λmin) Σ_{i=1}^{n} ûi, thereby establishing a contradiction. Hence, (x̂, û, ẑ) must satisfy x̂i² = ûi for all i ∈ N. Since (x̂, û, ẑ) is feasible for (3), it follows from the optimality of (x*, u*, z*) for (3) that Z* = x*TCx* + q Σ_{i=1}^{n} ui* ≤ x̂TCx̂ + q Σ_{i=1}^{n} ûi = Ẑ*. On the other hand, by applying a similar argument to formulation (3), the optimal solution (x*, u*, z*) of (3) satisfies xi*² = ui* for all i ∈ N. Thus, since (x*, u*, z*) is feasible for (4), by the optimality of (x̂, û, ẑ) for (4), x*T(C − λmin I)x* + (q + λmin) Σ_{i=1}^{n} ui* ≥ x̂T(C − λmin I)x̂ + (q + λmin) Σ_{i=1}^{n} ûi = Ẑ*; the left-hand side equals Z*, so Z* ≥ Ẑ*.

In our experiments we also consider a formulation similar to (4), except that it replaces (2c) by an inequality constraint. In this formulation 1 − Σ_{i=1}^{n} xi amounts to the proportion of the risk-free investment in cash.
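A small helper for the eigenvalue shift used in (4), given here as a minimal sketch assuming NumPy; in practice λmin would be computed once per training window.

```python
import numpy as np

def tightened_data(C, q):
    """Return (C - lambda_min*I, q + lambda_min) for the tightened formulation (4)."""
    lam_min = float(np.linalg.eigvalsh(C).min())   # smallest eigenvalue of C
    return C - lam_min * np.eye(C.shape[0]), q + lam_min
```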

4 Our Data Driven Sparse Portfolio Methods


We now describe our data-driven methods based on formulation (4) and explain
how its parameters are determined. The first method that deploys formula-
tion (4) in a straightforward manner using a fixed value of q and a given choice of
k is referred to as cardinality-constrained Markowitz (CCM). We also consider
unnormalized cardinality-constrained Markowitz (CCM-UNRM) as mentioned
in the previous section as a variant that solves the formulation (4) with the
equality constraint (2c) replaced by a less-than inequality. Finally, in order to
handle larger data sets we also run the continuous relaxation of (4). This method, which solves the continuous relaxation with the same fixed choice of q and a choice of k, is referred to as cardinality-constrained Markowitz relaxation (CCM-R). The
algorithm partitions a given training dataset further into (inner) validation time
windows. The number of windows lV is determined based on the given valida-
tion training set size tparam-train and the validation test data size tvalid . For each
training window, the optimal portfolio is determined by solving a continuous
relaxation of (4) and is evaluated on the validation data. The average Sharpe
ratio over all validation sets is used to evaluate the quality of each candidate
parameter value k. Using the best performing k, along with the fixed q and given

ρ, the optimal portfolio x∗ is then determined from the optimal solution (x∗, u∗, z∗)


of the continuous relaxation of (4).
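The inner rolling-window validation just described can be sketched as follows; solve_relaxation and sharpe are placeholders for the continuous relaxation of (4) and the out-of-sample Sharpe ratio, and the window layout is a simplified assumption rather than the exact procedure used in the experiments.

```python
import numpy as np

def select_k_by_rolling_validation(R, candidates, t_param_train, t_valid,
                                   solve_relaxation, sharpe):
    """Pick k maximizing the average validation Sharpe ratio over inner windows."""
    t = R.shape[0]
    n_windows = (t - t_param_train) // t_valid
    best_k, best_score = None, -np.inf
    for k in candidates:
        scores = []
        for w in range(n_windows):
            start = w * t_valid
            train = R[start:start + t_param_train]
            valid = R[start + t_param_train:start + t_param_train + t_valid]
            x = solve_relaxation(train, k)          # portfolio from the relaxed (4)
            scores.append(sharpe(x, valid))         # evaluate on the validation slice
        if np.mean(scores) > best_score:
            best_k, best_score = k, np.mean(scores)
    return best_k
```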
The manner in which we choose the penalty parameter ζ in (1) is simi-
lar to [3] although it is implemented differently. In contrast to the specialized
LARS algorithm’s parametric solution path in [3], we developed an effective yet
straightforward binary-search based algorithm to set the ζ parameter while using
a general purpose QP solver. The algorithm chooses a ζ such that the constraint on the number of active assets k is satisfied; the bound controls either the number of nonzero optimal solution variables or the number of negative variables, depending on the counting function used. Binary search is applied to determine the minimal penalty parameter ζ whose corresponding optimal solution satisfies the constraint. The algorithm requires an upper bound on ζ, which we set to the maximal eigenvalue of C.
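A minimal sketch of this binary search; solve_qp and count_active are placeholders for the QP solver call on (1) and for the chosen counting rule (nonzero or negative components), so the exact tolerances and interfaces are assumptions.

```python
def min_zeta_by_binary_search(solve_qp, count_active, k, zeta_max, tol=1e-3):
    """Smallest penalty zeta whose optimal portfolio satisfies the cardinality bound k."""
    lo, hi = 0.0, zeta_max            # zeta_max is set to the largest eigenvalue of C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_active(solve_qp(mid)) <= k:
            hi = mid                  # feasible: try a smaller penalty
        else:
            lo = mid                  # too many active assets: increase the penalty
    return hi
```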

5 Computational Study and Experiments

Table 1 details the properties of the datasets used in our experiments. The results
for computational running times are based on a monthly dataset of S&P 500
stocks. In this dataset, to handle missing data, we removed rows and then columns
with more than 50% and 4% of missing values, respectively. Remaining missing
values were replaced by the most recent available values. After handling miss-
ing values in this manner the resulting dataset has 221 rows and 359 columns
(dataset SP500 221). We created two additional smaller datasets based on the
S&P 500 using the first 50 and 60 assets (datasets: SP500 50 and SP500 60,
respectively) of SP500 221. The predictive performance results were performed
on the other weekly stock index data listed in the table. In addition, we used a
recent update of the monthly data of FF48 and FF100 of Fama and French [5].
An earlier version of this data was also used by Brodie et al. [3].

Table 1. Dataset properties

Dataset n t Freq Period


SP100 84 520 Weekly Jan 1, 2008–Dec 26, 2017
SP500 428 520 Weekly Jan 1, 2008–Dec 26, 2017
SP500 50 50 359 Monthly Jan 1, 1970–Jan 1, 2016
SP500 60 60 359 Monthly Jan 1, 1970–Jan 1, 2016
SP500 221 221 359 Monthly Jan 1, 1970–Jan 1, 2016
TSE 326 520 Weekly Jan 1, 2008–Dec 19, 2017
LSE 620 520 Weekly Jan 1, 2008–Dec 19, 2017
TLV100 61 200 Weekly Jan 1, 2013–Feb 22, 2017
FF48 48 560 Monthly Jan 1, 1970–Aug 1, 2016
FF100 100 640 Monthly Jan 1, 1965–Apr 1, 2018

5.1 Integer Programming Computational Experiments


We first experimented to evaluate the running times. We implemented the models
on the R platform using the state-of-the-art Gurobi [11] solver. The experiments
were run on a server with 2.4 GHz CPUs, each with twelve cores and a 12 MB
cache size.

Table 2. CPU seconds and B&B nodes for the mixed-integer formulations on the
SP500 50 data. A dark gray background indicates that the run is irrelevant (as k > n).

Formu. (2) Formu. (3) Formu. (4)


k Time Nodes Time Nodes Time Nodes
2 0.48 1596 2.09 830 3.31 638
4 33.5 305322 12.30 13328 13.68 8630
6 5040 20526748 30.30 58237 12.74 14303
8 LIMIT 51.30 100504 13.54 10268
10 LIMIT 55.40 96463 12.99 11043
14 LIMIT 102.00 362137 4.07 3022
20 LIMIT 55.20 153142 0.92 243
30 LIMIT 0.08 0 0.07 0

Table 2 displays computational running times of formulations (2), (3) and (4).
The table compares the running times and branch-and-bound (B&B) node counts with an optimality gap of 1%. It is evident that, other
than for small k, formulation (4) is solved with the least B&B nodes and in
most cases the least CPU time. Further, for intermediate values of k formula-
tion (4) solves the given instances within a reasonable running time, while for-
mulations (2) and (3) cannot be solved within the time limit of 2 hours. Table 3
displays computational running times and number of B&B nodes for our choice
of formulation (4) on several datasets. Here it is demonstrated that our chosen
formulation (4) effectively solves the cardinality-constrained integer problem for
real moderately sized financial data.
In order to handle larger datasets we also consider the continuous relaxation
of our integer formulations. Table 4 displays the optimal objective values of the
continuous relaxations of (2), (3) and (4) compared with the optimal integer solu-
tion. Evidently, the continuous relaxation of (4) has consistently larger objective
values demonstrating that indeed it is a tighter continuous relaxation for the
discrete problem. This is while the continuous relaxation of (2) has an opti-
mal objective value that does not significantly exceed that of the non-cardinally
constrained problem (with k = n) and thus does not appear to provide a use-
ful, sufficiently tight, continuous relaxation. Note that the results of this table
motivate our choice of a continuous relaxation and also provide an explanation of the difference in computational performance of solving the corresponding
mixed-integer problems as shown in Table 2.

Table 3. CPU seconds and B&B nodes for solving formulation (4) on the indicated
datasets. LIMIT is indicated for runs reaching the time limit of two hours.

FF48 FF100 S&P 500 LSE


k Time Nodes Time Nodes Time Nodes Time Nodes
2 0.11 0 1.4 31 64.0 7 204.1 9
4 0.13 0 1.6 27 404.5 401 298.0 63
6 0.14 0 0.3 0 107.5 87 436.9 87
8 0.13 0 2.3 75 107.0 99 511.1 111
10 0.10 0 2.5 99 10.8 0 522.0 135
20 0.10 0 4.7 240 216.3 244 841.8 232
30 0.09 0 5.9 340 11.9 0 669.4 348
40 0.11 0 0.3 0 11.1 0 829.5 466
50 0.18 0 0.3 0 10.4 0 1030.0 575
100 0.27 0 10.34 0 562.14 294

5.2 Predictive Performance Experiments


The experimental evidence given in the previous section reinforces our choice
in our learning methods to solve the apparently stronger formulation (4). The
magnitude of the parameter q > 0 in this formulation did not appear to have
a significant effect on the predictive performance in preliminary experiments
with portfolios constructed using formulation (4) (on the dataset SP500 221). A
penalty parameter value of q = 0.131 seemed to perform best out of a range of
values tested in (0, 0.5]. Thus, this value of q is fixed in all of the experiments
that follow on all of the remaining datasets.

Table 4. Optimal objective value of the continuous relaxations vs. the optimal integer
solution on the SP500 221 data.

k (2) (3) (4) Z∗


2 2.74E-03 6.54E-02 6.64E-02 6.78E-02
4 2.74E-03 3.28E-02 3.32E-02 3.38E-02
6 2.74E-03 2.19E-02 2.22E-02 2.25E-02
8 2.74E-03 1.64E-02 1.66E-02 1.68E-02
10 2.74E-03 1.32E-02 1.33E-02 1.34E-02
12 2.74E-03 1.10E-02 1.11E-02 1.12E-02

An outer partition of the data in a rolling window fashion is used for model
comparison. (An inner partition is used for some of the methods for fine tuning
parameter values.) In our experiments, each (outer) window consists of a training
set sized ttrain and a test set corresponding to the next ttest trading days. Given a
total of tall observed days, the data is split into (tall − ttrain)/ttest disjoint test sets.

In our experiments each training set size is set equal to 20% of the entire dataset, specifically ttrain = ⌊t/5⌋. The test set amounts to approximately 5% of the data, resulting from setting ttest = ⌊0.25 ttrain⌋. Consequently, for each of the datasets that we experiment with there are 16 test time windows. In the implementation of CCM-R we had 5 inner time windows with tparam-train = ⌊ttrain/2⌋ and tvalid = ⌊ttrain/5⌋. Also, numerically, the TSE data necessitated adding to C a small positive diagonal with entries approximately equal to 7 × 10−7 to make it positive definite.
Predictive Performance Comparison for Unspecified Cardinality. The methods designed for this setting include our learning method CCM-R, which determines the parameter k in the continuous relaxation of (4) by additional validation experiments; we compared it with Brodie et al.'s method, Markowitz and a naive equal-weight portfolio. We also compared it with an additional benchmark corresponding to (4) with k = n, essentially amounting to a Markowitz formulation with an additional L2 penalty term in the objective. This method is referred to as L2-Markowitz. The q penalty parameter was set to the same fixed value as in CCM-R. The predictive performance comparison was evaluated on the datasets SP500, TSE, LSE and TLV100. The choice of the parameter value k is performed in the inner time windows of CCM-R out of the set of candidate values K = {n/100, n/4, n/2, 3n/4, n}.
Brodie et al.’s method was implemented by binary search for the minimum
value of ζ such that all of the xi ’s are nonnegative (no-short positions). This
setting disallows short positions and is similar to the experimental setup in [3]
when the portfolio cardinality is unspecified. In order to facilitate a meaningful
comparison we impose the corresponding nonnegativity constraint x ≥ 0 in the
Markowitz and CCM-R (the continuous relaxation of (4)) formulations.
Table 5 shows the average test Sharpe ratios for the five methods on each of
the four datasets. The results of the table show over all of the datasets CCM-R
performs better than Markowitz and the naive method in terms of the average
test Sharpe ratio. It performs better than Brodie et al.’s method [3] in nearly all
cases. When relaxing the cardinality constraint of the CCM-R formulation (4)
the resulting L2-Markowitz method performs best in nearly all cases, but CCM-R
is a very close second when it is not best.
In addition, we checked the cardinality of the optimal portfolio vectors for
each formulation. Table 6 shows the average and standard deviation of the num-
ber of assets held determined as the number of components whose absolute value
is greater than 10−4. As is apparent, the portfolios constructed by our method involve more assets than those of Brodie et al. and Markowitz. However, compared
with L2 -Markowitz, on average CCM-R portfolios consistently hold fewer assets
over all of the datasets. Overall it appears that CCM-R strikes a sensible bal-
ance between the performance of L2 -regularized Markowitz and sparsity such as
that attained by Brodie’s formulation. The CCM-R results with relatively dense
portfolios are consistent with the fact that the choice of k in CCM-R is based
on the best Sharpe ratio performance in validation experiments.

Table 5. The average test Sharpe ratio in the unspecified cardinality (k) setting.

Formulation LSE SP500 TLV100 TSE


Markowitz 1.37E-01 1.38E-0 1.01E-01 2.81E-04
Brodie [3] 1.63E-01 1.38E-01 1.09E-01 5.87E-03
Naive Model 2.54E-03 1.27E-01 −7.46E-03 1.60E-05
CCM-R 1.88E-01 1.50E-01 1.20E-01 2.44E-02
L2 -Markowitz 1.45E-01 1.84E-01 1.23E-01 2.71E-02

Table 6. The average number of assets held in the unspecified cardinality (k) setting.

Formulation LSE SP500 TLV100 TSE


Markowitz 99.50 ± 49.59 29.50 ± 9.42 19.88 ± 6.64 61.63 ± 21.01
Brodie [3] 68.00 ± 12.24 23.00 ± 8.56 15.88 ± 4.64 43.13 ± 10.56
Naive Model 620.00 ± 0.00 428.00 ± 0.00 61.00 ± 0.00 326.00 ± 0.00
CCM-R 279.06 ± 113.84 165.31 ± 145.24 32.94 ± 15.19 119.31 ± 74.65
L2 -Markowitz 527.13 ± 59.26 373.75 ± 60.30 52.06 ± 5.79 238.31 ± 42.25

Predictive Performance for a Fixed Number of Assets. We also compared


the alternative methods for a given required cardinality parameter k. The com-
parison is performed by a sequence of rolling time window experiment for each
given value of k comparing the CCM method with Brodie et al.’s [3] method.
Brodie et al.’s method was implemented using binary search to determine the
penalty parameter ζ. Further, in order to facilitate full control of ||x∗ ||0 to sat-
isfy any given bound k the formulation (1) is solved with the normalization
constraint replaced by an inequality, so 1T x ≤ 1. As noted, for a given feasible
portfolio solution x, the quantity 1 − 1T x also has the meaning of the proportion
of capital invested in (risk-free) cash.
To facilitate a more elaborate comparison with Brodie et al’s method in this
case where controlling ||x||0 required relaxation of the cardinality constraint, we
also ran CCM-UNRM, which solves a formulation similar to (4) but with (2c)
replaced by a less-than inequality.
For each value of k we calculated the Sharpe ratio. For CCM and CCM-
UNRM, the number of the active assets is directly enforced by the imposed
constraint (2d). In our implementation of Brodie et al.’s method [3] it is con-
trolled by determining the corresponding penalty parameter ζ by binary search.
Figure 1 displays the Sharpe ratio on the datasets SP100 and FF48.
Figure 1(a) illustrates the experiments on SP100 dataset, where for each number
of active assets, CCM appears to outperform Brodie’s method. On this dataset,
for k ≥ 45 CCM-UNRM outperforms Brodie et al.’s method. In CCM and CCM-
UNRM the performance as indicated by the Sharpe ratio appears to increase in
k. This is in contrast to Brodie’s method in which the performance seems to
deteriorate as k increases. In Fig. 1(b), which depicts the performance on FF48

dataset, CCM outperforms Brodie et al.'s method, but CCM-UNRM's


performance appears inferior to the other two methods for small values of k.

Fig. 1. The Sharpe ratio vs. k for Brodie et al.'s method, CCM and CCM-UNRM on (a) SP100 and (b) FF48.

5.3 Experiments with Fama and French Factors

To explore the relationship of factor models to our sparse portfolio selection


methods we augmented the SP100 dataset, with data for the three-factor model
of Fama and French [5] over the corresponding time periods. We then ran our
CCM, CCM-UNRM and Brodie et al’s method on this new dataset.
The results of Brodie’s method seem sim-
ilar with and without factors, and for larger
values the performance appears to slightly
improve when factors are used. Figure 2 dis-
plays the average number of active factors of
the three methods in our experiments. In this
figure it becomes apparent that CCM-UNRM
deploys factors only for k ≥ 30. Brodie’s
method gradually makes increased use of fac-
tors but seldom appears to use all three fac-
tors. In contrast, CCM makes use of all three factors for small values of k (even k slightly larger than 3).

Fig. 2. Average number of active factors vs. k.

6 Conclusion
We show that our MISOCP formulation enables solving realistically sized problems, and solving them faster, using standard solvers. CCM appeared to be an effective learning method for a small to moderate number of potential assets,
which handles a specified hard cardinality constraint. It appeared to result in
better Sharpe ratios for each given cardinality compared with Brodie et al.’s

L1 -regularized method. Our basic CCM model also provides further evidence
in support of Fama and French’s factor models by showing that the factors
are deployed by optimal portfolios under a sparsity requirement. Our CCM-R
method is designed for large financial datasets by using instead a tight continu-
ous relaxation of the integer formulation. In our experiments, it appears to strike
a sensible balance between the best performing dense L2 -regularized Markowitz
models and the sparsity of Brodie et al.’s method.

Acknowledgement. A. Ben-Tal is acknowledged for suggesting factor models.

References
1. Akturk, M.S., Alper, A., Sinan, G.: A strong conic quadratic reformulation for
machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3),
187–191 (2009)
2. Bonami, P., Lejeune, M.: An exact solution approach for portfolio optimization
problems under stochastic and integer constraints. Oper. Res. 57(3), 650–670
(2009)
3. Brodie, J., Daubechies, I., Mol, C.D., Giannone, D., Loris, I.: Sparse and stable
Markowitz portfolios. Proc. Natl. Acad. Sci. 106(30), 12267–12272 (2009). https://
doi.org/10.1073/pnas.0904287106
4. DeMiguel, V., Garlappi, L., Uppal, R.: Optimal versus naive diversification: how
inefficient is the 1/n portfolio strategy? Rev. Financ. Stud. 5, 1915–1953 (2009)
5. Fama, E.F., French, K.R.: The cross-section of expected stock returns. J. Financ.
2, 427–465 (1992)
6. Goldberg, N., Leyffer, S., Munson, T.: A new perspective on convex relaxations of
sparse SVM. In: Proceedings of the 2013 SIAM International Conference on Data
Mining, pp. 450–457 (2013). https://doi.org/10.1137/1.9781611972832.50
7. Günlük, O., Linderoth, J.: Perspective reformulations of mixed integer nonlinear
programs with indicator variables. Math. Program. 124, 183–205 (2010)
8. Li, J.: Sparse and stable portfolio selection with parameter uncertainty. J. Bus. Econ. Stat. 33(3), 381–392 (2015)
9. Lobo, M.S., Fazel, M., Boyd, S.: Portfolio optimization with linear and fixed trans-
action costs. Ann. Oper. Res. 152(1), 341 (2007). https://doi.org/10.1007/s10479-
006-0145-1
10. Markowitz, H.: Portfolio selection. J. Financ. 7(1), 77–91 (1952). https://doi.org/
10.1111/j.1540-6261.1952.tb01525.x
11. Gurobi Optimization, Inc.: Gurobi optimizer reference manual. http://www.
gurobi.com (2014)
12. Sharpe, W.F.: Mutual fund performance. J. Bus. 39(1), 119–138 (1966). Supple-
ment on Security Prices
Managing Business Process Based
on the Tonality of the Output Information

Raissa Uskenbayeva, Rakhmetulayeva Sabina(&),


and Bolshibayeva Aigerim

International Information Technology University, 050054 Almaty, Kazakhstan


ssrakhmetulayeva@gmail.com

Abstract. Algorithms that perform separate processing of absolute and relative


evaluation (of feedbacks) of the quality of activity and output of a business
process for the purpose of obtaining general conclusions about quality are
described. Then, based on the results of processing by separate algorithms of
absolute and relative evaluation, a total score i.e. a generalized assessment of the
quality of activities and products (products and resources) of the business
process is formed. Based on the results of the assessment, the weighting factors
of the products are adjusted in the production plan of the business process.

Keywords: Business process · Formation · Tonality · Absolute tonality · Relative tonality

1 Introduction

Digitization of business processes is not only about automating the execution of a


business process, but also introducing new progressive methodologies or technologies
to improve the efficiency of the business process, in particular kaizen/kanban.
Kanban (Jap. カンバン, kamban) [1] is a system of organizing production and supply that allows implementing the "just in time" principle. The word "kanban" in Japanese means "billboard sign" (Jap. 看板); in the financial environment an erroneous transcription of the Latin romanization of the Japanese word ("kanban") has become established [1].
Kaizen is a key management concept. It is a Japanese philosophy, a system focused on the continuous improvement of all production processes, our lifestyle and all aspects of life [2].
In this case, the following approach is possible. First, before building and launching a business process into operation, it is necessary to determine and plan the series of goods that are the most attractive and, at the same time, the least costly. This series is established as a result of market research and analysis.
Marketing analysis does not end there, however. Because market preferences change, constant market analysis is required. Therefore, it is necessary to constantly look for a variant of the business process that gives more benefits and a higher quality than the previous and current one, up to its digitalization. The search is performed either manually by an analyst or marketer of the business process, or the business process has a special analytical process as


part of a business process [3], which is constantly looking for a promising mode for
executing a business process after the next business process cycle.
This paper proposes one of the possible options for improving the quality of the business process, in particular, making it better or more adequate to the current needs of the population.
Let enterprises produce m types of goods: (a1, a2, a3, a4, a5, ..., ai). And let the initial preferences at time t = t0 look like this:

⟨(l1,t0 · a1), (l2,t0 · a2), (l3,t0 · a3), (l4,t0 · a4), (l5,t0 · a5), ..., (li,t0 · ai)⟩,

where li,t0 is the preference coefficient (weight) of the ai-th product at time t = t0.
Then, as a result of studying the opinion of the population, the preference coefficient at time t = ti may become different:

⟨(l1,ti · a1), (l2,ti · a2), (l3,ti · a3), (l4,ti · a4), (l5,ti · a5), ..., (li,ti · ai)⟩,

where li,ti is the preference coefficient (weight) of the ai-th product at the moment of time t = ti.
Such work is necessary in marketing research for product planning, for example, for the types of loans issued by banks. Each type of loan is a product. It is then necessary to establish the most preferred and the less preferred types of loans for the population.
It should be noted that the most adequate, reliable and accurate business process management is achieved if the products are managed using feedback. As one type of feedback, one can take tonal data on the evaluation of attractiveness to consumers, where product quality is the tonal data evaluated by the consumers of this business process.
Tonal expressions can be varied. In this paper, we consider only two types of
expression of the tonality of the output products of a business process:
• absolute tonality;
• comparative or relative tonality.

2 Pre-project Analysis. Practical Examples

As a subject area of research we take the banking sector of the economy. Let there be
reviews on the work of each bank. We will analyze the reviews. Tonal data are reviews
that are presented in the form of texts. An example of absolute tonality:
• The “small business” loan Kazkom Bank is very good.
• Small business is well credited by Kazkombank.
• Kazkom always credits small business well.
• Kazkombank’s lending to a small business is good.

The objects of the outside world are not always evaluated in the absolute scale of
measurement, and some reviews are given in the form of a comparative assessment.
An example of relative tonality:
• “Kazkommertsbank services much better than Halyk Bank”;
• “The service in the national bank is not worse than in Kazkommertsbank, and it
serves much better than the Caspian Bank”;
• “The conditions of micro-crediting in Centercreditbank are more beautiful than
those of ATF Bank”;
• “In Kazkom, lending to small businesses is better than microcredit”;
• “Lending to the agricultural business is better than lending to commercial transactions”,
etc.
In such cases, a relative measurement scale is used.

3 Algorithmization of the Analysis and Calculation


of Tonality Indicators

There are two options for assessing the tonality of the quality of products and quality of
service of the business process. The feedback that customers leave on the Internet
reflects the tone of one of these two aspects or both of the aspects of the same business
process.
In addition, customer reviews can contain two types of tonality: absolute and
relative.
Thus, reviews may contain tonalities, absolute and relative, which reflect the following aspects of the business process: the quality of the output products and the quality of customer service by the bank.
All these types of tonality in the reviews are identified by keywords (so far without semantics).
We distinguish between absolute and relative tonality assessments, since
their processing algorithms are different. That is, reviews characterizing absolute and
relative tonality are processed by different algorithms. Here, the tonality of the relative
reviews is first translated into absolute; further processing is also carried out as absolute
reviews [4].
It is assumed that each review reflects a single tone value or characteristic of the
object being evaluated.
The source of feedback is the Internet. How the reviews are recorded does not concern us; for us they are given. We calculate the tonality based on the results of each cycle or for a given period of time Δs, for example, per day. It is also possible to perform a partial analysis after each time period Δs and a full tonality analysis after a time k·Δs, for example, per week.
Consider the algorithm for processing reviews (tonality).
First, consider the algorithm for processing all types of reviews.
1. Collection of tonal data for a certain period by selecting from various types of data
on the Internet based on keywords (without semantics) or from specialized sites.

2. To divide into absolute and relative expressions of tonality by keywords.


Absolute tonality
1. Collection of absolute expressions of feedback in the primary database, i.e. reviews {Otji} about each product {ai}, where i = 1, 2, 3, ..., I; j = 1, 2, 3, ..., J.
2. Selection of portions of reviews of a certain size from the primary database for analysis, from all reviews {Otji} of product ai, i = 1, 2, 3, ..., I; j = 1, 2, 3, ..., J, until the base is exhausted.
3. Grouping (classifying) the reviews of portion i according to the scale of tonality grades (from 1 to 11), based on the LSA technology or method [4].
4. Calculation of the number K(i, j) of reviews of the i-th processing batch falling into the j-th class (group), i.e. of the number of reviews in the evaluation groups in each cycle.
5. Accumulation of the number of reviews in class (group) j over all processing batches i = 1, 2, 3, ...: A(j) = A(j) + K(i, j).
6. If the reviews in the primary database are exhausted, then calculate the tonality indicator Tnac from the absolute expressions of the feedback using a convolution formula.
Calculation of the tonality indicator Tnac from the absolute feedback expressions uses one of the convolution formulas Op = F(·) = X:

X = (1/n) Σ_{i=1}^{n} xi – the average value;
X = min{xi}, i = 1, ..., n, or X = max{xi}, i = 1, ..., n;
X = Med{xi}, i = 1, ..., n – the median, or the mathematical expectation;
X = sqrt((1/n) Σ_{i=1}^{n} xi²);
X = ((1/n) Σ_{i=1}^{n} xi^p)^{1/p};
in integral form: X = (1/n) Σ_{i=1}^{n} K(ti), or X = (1/n) ∫_{t1}^{t2} K(t) dt.

It is assumed that in the task of analyzing the tonality each review {Otji} carries a certain tonality about the product ai, which is denoted here by xi.
Among them, we choose the simplest expression for calculating the tonality, using the formula F(·) = X = (1/n) Σ_{i=1}^{n} xi.
Thus, if the tonality is assessed on a scale of 11 intervals, then we write it in the form:

F(·) = X = (1/n) Σ_{i=1}^{n} xi = Tn(tn) = (1/11)(K(d−5) + K(d−4) + K(d−3) + K(d−2) + K(d−1) + K(d0) + K(d1) + K(d2) + K(d3) + K(d4) + K(d5)),

or

v(ai)absolute = F(·) = X,

where Tn(tn) is the tonality value at the current time tn, K(di) is the number of reviews with the score di (the number of reviews that received the rating di), di ∈ {−5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5}, and v(ai)absolute is the absolute tonality of the product ai.
The weight (or significance) of a score di (for example, of the score di = 4) is calculated in this way:

γ(di) = K(di) / [K(d1) + K(d2) + ... + K(di) + ... + K(dm−1) + K(dm)],

where K(di) is the number of reviews with the score di, m is the total number of different tonality points, so the denominator is the sum of reviews over all tonality points, and γ(di) is the weight of the point di (for example, di = 4) within the collection of reviews over the m points.
Thus, if one i-th point receives many reviews (a large K(di)), then the weight of this score γ(di) will be higher.
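A minimal Python sketch of the simplest convolution F(·) = X and of the score weight γ(d); the handling of empty batches is an illustrative assumption.

```python
from collections import Counter

def absolute_tonality(scores):
    """Simplest convolution F(.) = X = (1/n) * sum_i x_i over the review tonalities x_i."""
    return sum(scores) / len(scores) if scores else 0.0

def score_weight(d, scores):
    """Weight gamma(d) = K(d) / [K(d_1) + ... + K(d_m)] of the rating d among all reviews."""
    counts = Counter(scores)              # K(d): number of reviews rated d
    return counts[d] / len(scores) if scores else 0.0
```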
Relative tonality
Calculation (assessment) of the tonality of one portion of reviews, selected from the general list or over a specified period of time Δs, proceeds as follows.
The relative preference can be stated as follows:
(1) “object i is better than object j”, i.e. i > j;
(2) “object i is worse than object j”, i.e. i < j;
(3) “object i is no better than object j”, i.e. i ≤ j;
(4) “object i is not worse than object j”, i.e. i ≥ j;
(5) “object i is equal to object j”, i.e. i = j.
The above relations are transformed into the following convenient series:

a1 ≤ a2 ≤ a3 ≤ a4 ≤ ... ≤ aj ≤ ai ≤ ... ≤ an−2 ≤ an−1 ≤ an.    (1)

Then the absolute value of the tonality is determined as follows: the relative variant of the evaluation of objects is translated into the absolute evaluation scale.
In series (1), the tonality values of all objects are estimated relative to each other; the conversion can be performed as follows.
The tonality value of the first member of the series (product a1) is 0, i.e. v1 = 0 or v(a1) = 0, and the last member of the series is assigned the value 1, i.e. vn = 1 or v(an) = 1.
If the number of members in the series is n, then the tonality value of the product ai is constructed with a uniform step (v(an) − v(a1))/(n − 1) between neighbouring members.
Thus, if series (1) with n elements is given, the assessment or weight of the element ai of the series is defined as follows:

v(ai)relative = v(a1)relative + (i − 1)(v(an) − v(a1))/(n − 1).    (2)
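A small sketch of the conversion in (2); the chain is assumed to be ordered from the least to the most preferred object, and ties are not handled in this illustration.

```python
def relative_to_absolute(chain):
    """Map an ordered chain a_1 <= ... <= a_n to absolute tonality values in [0, 1].

    The extreme objects receive v = 0 and v = 1; intermediate objects are spaced
    evenly with step (v(a_n) - v(a_1)) / (n - 1), as in (2).
    """
    n = len(chain)
    if n == 1:
        return {chain[0]: 1.0}
    return {obj: i / (n - 1) for i, obj in enumerate(chain)}
```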

Now we compose an algorithm for processing relative expressions.



Relative tonality

1. Collection of relative expressions.
2. Creation of single comparison chains of the objects of the reviews.
3. Construction of the common object-comparison chain (1) from the single comparison chains of objects.
4. Determination of the weights of the feedback objects from the common chain by assigning 0 and 1 to the extreme objects of the chain, as in (2).
5. The resulting relative estimate of tonality is summed with the absolute tonality as shown below.
The total tonality combines the absolute tonality with the relative assessment of tonality. Thus, the full tonality for a certain period (for example, for a few minutes) is defined as follows.
1. A certain portion of reviews for analysis is selected from all reviews, and the absolute and relative tonalities are distinguished as in [5].
2. Then the calculated absolute estimate of the tonality of the object ai is summed with the tonality of the object ai obtained from the relative estimate, as follows:

v(ai)total = v(ai)absolute + v(ai)relative,

where v(ai)total is the total overall assessment of tonality, v(ai)absolute is the tonality by the absolute estimate, and v(ai)relative is the tonality by the relative assessment.

4 Methods of Process Formalization and Management


of the Company’s Analytics Processes

The computed tonality of the output of the business process and of the level of customer service of the organization is used to control the operation of the business process, but first we make the following assumptions.
Each key parameter is autonomous. Autonomy means that the decision for each key parameter is independent and has its own values and purpose.
To this end, each group of data is intended for a specific purpose, for example, for planning, for constructing trajectories, etc. This specialization of data makes it convenient to apply uniform actions to them, which is especially beneficial for setting or learning the data parameters [5].

The method of improving the quality of business process management in order to fully meet customer needs can be represented as in Fig. 1. In this scheme, improved quality of management is implemented through adjustments of the knowledge and data in the knowledge and data bases of the phases of the business process management cycle. The source of the “feedback” data is the global Internet, i.e. its Web sites.

Fig. 1 Effective way of doing analytics management process

Where:

DCM = ⟨DIF, DPZ, DPDZ, DPPDZ, DCS, DTC⟩

DCM – Database management cycle;


DIF – Database of state identification phase;
DPZ – Database planning phase goals;
DPDZ – Database phase mapping plan;
DPPDZ – Database Phase Programming Plan Achievement;
DCS – Object Management Plan Phase Database;
DTC – Management Process Improvement Phase Database;

In [6], algorithmic support of analytical processing of inverse information was


developed for the purpose of improving the quality of business process management
based on the proposed analytics method, i.e. based on procedures to improve the
quality of business process management. The feedback information taken is tonal data on the quality of the implementation of the business process and of its output.

In this case, we control the behavior of the business process using the example of planning the products produced, i.e. we demonstrate the basis for determining or planning the range of manufactured products.
The formulation of the problem is as follows. Suppose that at the initial time t0 the state of the object was CS(t) = CS(tn), and at the same time the target situation of the object is given. At the same time, enterprises produce m types of goods: (a1, a2, a3, a4, a5, ..., ai). And let the initial preferences at time t = t0 look like this:

⟨(l1,t0 · a1), (l2,t0 · a2), (l3,t0 · a3), (l4,t0 · a4), (l5,t0 · a5), ..., (li,t0 · ai)⟩,

where li,t0 is the preference coefficient (weight) of the ai-th product at time t = t0.
Now let there appear, at a moment of time t > t0, some feedback on the business process under consideration on the Internet, i.e. buyers have written to the Internet. Then, as a result of studying the opinion of the population, the preference coefficients at time t = ti may become different:

⟨(l1,ti · a1), (l2,ti · a2), (l3,ti · a3), (l4,ti · a4), (l5,ti · a5), ..., (li,ti · ai)⟩,

where li,ti is the preference coefficient (weight) of the ai-th product at the moment of time t = ti.
The value of the weight of products and resources varies depending on market
conditions, depending on the change in taste and preference of users of products of the
business process.
Thus, the current weight value is defined as follows:

l_i = l_i^initial + κ_i · l_i^corr,

where l_i^corr = β_i · l_i^ton or l_i^corr = l_i^ton + β_i · l_i; β_i is the correction factor;
l_i is the weight (or preference) of the product (resource, solution implementation variant) for which the choice is made;
l_i^initial is the initial value of the weight of the product ai, which is assigned during the initial setup (when building the automation system). In a more general case, the initial value of the parameter l_i can be designated as l_i^corr or l_i^corr(t0), emphasizing that this variable (or this class of variables) is adjustable;
l_i^ton is a quantity (coefficient) determined by the analysis of the inverse (feedback) information and by the synthesis or formation of a value from the results of its analytical processing. One of the types of inverse information is tonality, i.e. certain tonal data on the product ai.
The values of the variables l_i are determined by re-assigning the calculated value of the tonality l_i^corr = Tn(tn) + v(ai)total = Tn = Tnac + Tnot.

The correction process stops when the condition l_i^corr(tn) = l_i^corr(tn+1) or l_i^corr(tn) < l_i^corr(tn+1) is satisfied, where l_i^corr(tn) is the value of l_i^corr at the moment of time tn, and l_i^corr(tn+1) is the value of l_i^corr at time tn+1.

The obtained values of weights are taken into account when developing plans for
the output of the business process.
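The weight-correction cycle of this section can be sketched as follows; tonality_per_cycle is a placeholder for the tonality computed from the reviews collected in one cycle, and the stopping test is a simplified reading of the condition above rather than the authors' exact procedure.

```python
def correct_weights(l_initial, kappa, tonality_per_cycle, max_cycles=100):
    """Adjust product weights l_i = l_i_initial + kappa_i * l_i_corr cycle by cycle."""
    l_corr_prev = {a: 0.0 for a in l_initial}
    for cycle in range(max_cycles):
        l_ton = tonality_per_cycle(cycle)                  # {product: l_ton_i} from feedback
        l_corr = {a: l_ton.get(a, 0.0) for a in l_initial}
        if all(l_corr[a] <= l_corr_prev[a] for a in l_initial):
            break                                          # corrections no longer increase
        l_corr_prev = l_corr
    return {a: l_initial[a] + kappa[a] * l_corr_prev[a] for a in l_initial}
```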

5 Conclusion

In this work, reviews given by individual representatives of the consumers of the business process are taken as the tonal expressions or statements used to assess the quality of the activities and output of the business process. These expressions (reviews) are estimates given by individuals, i.e. the taste for products determined by individual consumers of the products of the business process. To transform the individual quality estimates into an objective or overall rating (estimated by a certain group of consumers), the following three algorithms were developed: processing and summarizing absolute feedback, processing and summarizing relative feedback, and a total assessment combining the results of the two algorithms. The paper also describes the procedures for recording the results of evaluating the quality of the activities and products of the business process when drawing up a plan for the output of products by the business process.

Acknowledgments. This work was supported by the Ministry of Education and Science of the Republic of Kazakhstan (Grant No. 0118PК01084, Digital transformation platform of National economy business processes BR05236517).

Energy and Water Management
Customer Clustering of French
Transmission System Operator (RTE)
Based on Their Electricity Consumption

Gabriel Da Silva¹, Hoai Minh Le²(B), Hoai An Le Thi², Vincent Lefieux¹, and Bach Tran²
¹ French transmission system operator (RTE), Paris, France
{gabriel.da-silva,vincent.lefieux}@rte-france.com
² Computer Science and Applications Department, LGIPM, University of Lorraine, Metz, France
{minh.le,hoai-an.le-thi,bach.tran}@univ-lorraine.fr

Abstract. We develop an efficient approach for customer clustering of


French transmission system operator (RTE) based on their electricity
consumption. The ultimate goal of customer clustering is to automat-
ically detect patterns for understanding the behaviors of customers in
their evolution. It will allow RTE to better know its customers and con-
sequently to propose them more adequate services, to optimize the main-
tenance schedule, to reduce costs, etc. We tackle three crucial issues in
high-dimensional time-series data clustering for pattern discovery: appro-
priate similarity measures, efficient procedures for high-dimensional set-
ting, and fast/scalable clustering algorithms. For that purpose, we use the
DTW (Dynamic Time Warping) distance in the original time-series data
space, the t-distributed stochastic neighbor embedding (t-SNE) method
to transform the high-dimensional time-series data into a lower dimen-
sional space, and DCA (Difference of Convex functions Algorithm) based
clustering algorithms. The numerical results on real-data of RTE’s cus-
tomers have shown that our clustering result is coherent: customers in
the same group have similar consumption curves and the dissimilarity
between customers of different groups is quite clear. Furthermore, our
method is able to detect whether or not a customer changes his way of
consuming.

Keywords: Electricity management · Clustering · High-dimensional


time-series data · DTW · t-SNE · DCA based clustering

1 Introduction
RTE (French transmission system operator) is in charge of the high voltage grid
for electricity in France. As a smart grid, RTE is responsible for the balance of
production and consumption, the safety of transportation and the quality of the
delivered services. Two thirds of the French industry is connected to the high

level voltage grid and the main business objective of RTE is to ensure customer
satisfactions by improving its services. In an evolving context with digitalization
and the rise of new technologies, RTE keeps the pace of innovation through its
R&D department and collaborations with research institutions. Projects, like
customer clustering, will help ensure that RTE's high voltage grid remains relevant in
the future as a public asset. One of the main activities of RTE customer relationships
managers is to analyse customers' data in order to better know the customers and
to enhance relationships. Traditionally, RTE customer relationships managers
visualize the customer’s consumption curves and manually detect their patterns
to understand troubles, changes of behavior, evolution, etc. The task is realized
based on the experience and knowledge of customer relationships managers.
However, such a technique is somewhat biased and a “manual” pattern detection
is clearly not a good method. Hence, there is a need to develop efficient tools
helping customer relationships managers to realize their tasks.
In this work, we develop an efficient approach for clustering (automatic clas-
sification) RTE’s customers based on their electricity consumption. Each RTE’s
customer is characterized by his electricity consumption curves which contain
the electricity consumption of each 10 m over two years. Hence, each customer
is represented by a time-series sequence of 105, 120 points. We are undoubtedly
facing a very large-scale time-series clustering problem.
Clustering is a fundamental machine learning task and has numerous appli-
cations in various domains. Clustering consists in dividing a set of data objects
into “homogeneous” groups (clusters) such that objects in the same group are
more “similar” to each other than to those in other groups. The main objec-
tive of our customer clustering task is to automatically detect patterns and find
causalities of customers in their evolution. The results will help RTE to better
know its customers and to propose them more adequate services. To speak in
marketing terms, they will allow a better adaptation of the maintenance schedule, a
smooth preparation for real-time operations and a cost reduction. Understanding
more precisely the behaviors of customers on the grid is one more step towards
a smart grid.
In recent years, due to an exponential growth of the time-series data applica-
tions in emerging areas such as sale data, finance, weather, . . . , there have been
considerable research and developments in time-series clustering. Time-series
clustering is a hard task due to the following two main difficulties. The first lies
in the nature of temporal information in time-series. More precisely, while eval-
uating the similarity between time-series objects, the chosen similarity measure
should be able to take into account the temporal information of the consid-
ered time-series data. The second main difficulty concerns the high-dimensional
nature of time-series data. A high number of dimensions leads to great increases
in the computation time of clustering algorithms. Furthermore, clustering tech-
niques often suffer from the “curse of dimensionality” phenomenon, i.e. the
quality of clustering algorithms is degraded as the dimension of data increases.
Hence, for developing an efficient time-series clustering algorithm, one has to deal
with three important issues: appropriate similarity measures for time-series data,

efficient procedures for high-dimensional setting, and fast/scalable clustering


algorithms. That is the purpose of our work.
The remainder of the paper is organized as follows. The proposed approach
is developed in Sect. 2 while the experiment and the result analysis are reported
in Sect. 3. Finally, Sect. 4 concludes the paper.

2 The Proposed High-Dimensional Time-Series Data


Clustering Approach
In the development of our solution method, the following questions are crucial
(to which the answers are not independent): which similarity measure to be
considered? which clustering algorithm should be investigated? and how to deal
with high-dimensional data?

2.1 DTW Distance: A Suitable Similarity Measure for Time-Series


Data
Similarity measure is a major challenge in time-series clustering. A suitable
choice of distance measure depends mainly on the objective of clustering task,
and on the characteristic as well as the length of time-series. In this section,
we will study an appropriate similarity measure in high-dimensional time-series
data clustering for pattern discovery. A large number of similarity measures for
time-series data have been proposed in the literature. The readers are referred
to the surveys [2,21] for a more complete list of similarity measures for time-
series data. Generally speaking, they can be divided into two main categories:
lock-step measures (one-to-one) like the ℓp-distance and elastic measures (one-to-
many/one-to-none) including DTW distance, Longest Common Sub-sequence
distance, Probability-based distance [21], etc. The one-to-one distances, as the
name suggests, compare the two time-series point by point. However, in prob-
lems where one wants to catch similar patterns of time-series that do not occur
at the same moment, the one-to-one distances are not adapted. The Fig. 1(a)
perfectly illustrates the drawback of one-to-one distance in this case. As we can
see, the Euclidean distance of two time-series is high despite the fact that the
two time-series have the same shape (the second time-series is nothing else but
a horizontally shifted transformation of the first one).
In contrast to one-to-one measures, the main difference and superiority of
elastic measures lie in their ability to handle temporal drift/shifting in time-
series. Among all the existing elastic measures, Dynamic Time Warping (DTW)
distance has been proved to be appropriate in several applications involving
time-series data [1,4,23]. The DTW distance of two time-series x ∈ R^{T_x} and y ∈ R^{T_y} can be defined as follows [13]. Let an (N, M)-warping path be a sequence p = (p_1, . . . , p_L) with p_l = (n_l, m_l) ∈ [1, N] × [1, M] for l = 1, . . . , L. A valid (N, M)-warping path also needs to satisfy (1) the boundary condition (p_1 = (1, 1) and p_L = (N, M) = (T_x, T_y)); (2) the monotonicity condition (n_1 ≤ n_2 ≤ · · · ≤ n_L and m_1 ≤ m_2 ≤ · · · ≤ m_L); and (3) the step-size condition (p_{l+1} − p_l ∈ {(1, 0), (0, 1), (1, 1)} for l ∈ {1, . . . , L − 1}). Hence, the DTW distance between two time-series x and y is given by

$$d_{\mathrm{DTW}}(x, y) = \min\left\{ \sum_{l=1}^{L} |x_{n_l} - y_{m_l}| \;\middle|\; p \text{ is a valid } (N, M)\text{-warping path} \right\}. \qquad (1)$$

Fig. 1. Euclidean and DTW distance of two very similar time-series (red and blue curves): (a) Euclidean distance; (b) DTW distance. The black line shows the matching {p_l = (n_l, m_l)}_{l=1,...,L} between the points of the two time-series.

DTW minimizes the differences between two time-series by aligning each point
from one time-series to the best corresponding points in the other time-series along
a warping path. Hence DTW is flexible enough to handle the shifting between
time-series, and is able to catch the similar patterns of two time-series sequences.
Therefore, the DTW distance is chosen for our time-series clustering approach.
However, the main drawback of DTW distance is its computation time. The
algorithm for computing the DTW distance is a recursive algorithm with com-
plexity of O(T 2 ) where T is the length of time-series. Hence, for high-dimensional
time-series data, it is very slow, or even impossible, to use the DTW distance directly
in clustering algorithms involving the computation of DTW at each iteration
(e.g. k-means and its variants). To overcome this drawback of DTW, we adopt
a feature-based clustering approach. Clustering time-series is usually tackled by
two approaches: the raw-data-based, where clustering is directly applied over
time-series vectors without any space-transformation previous to the clustering
phase, and the feature based approach which does not directly perform clustering
on the time-series raw data.
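For concreteness, a straightforward dynamic-programming sketch of the DTW distance of Eq. (1) is given below (in Python, written for clarity rather than speed; the implementation actually used in this work is in C#, and practical runs would rely on windowed or lower-bounded variants):

import numpy as np

def dtw_distance(x, y):
    # O(Tx*Ty) dynamic program computing d_DTW(x, y) of Eq. (1)
    Tx, Ty = len(x), len(y)
    D = np.full((Tx + 1, Ty + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Tx + 1):
        for j in range(1, Ty + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # step-size condition: the path reaches (i, j) from (i-1, j), (i, j-1) or (i-1, j-1)
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Tx, Ty]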

2.2 A Feature-Based Clustering Approach for High-Dimensional


Time-Series Data
Feature-based clustering approach consists of two main steps: (a) transform the
time-series into feature vectors in a smaller dimensional space, then (b) perform
clustering algorithm on the transformed data. We will describe below an efficient
data transformation algorithm to the first step.
Data transformation by t-SNE In the literature, several data transformation
algorithms for time-series data have been developed. The most well-known ones are certainly the Fourier and wavelet transformations. The Fourier transformation decomposes the time-series into a sum of sinusoids, which allows us to work in the frequency domain instead of on the raw data, whereas the wavelet transformation uses a different basis function (not necessarily sinusoids). Recently, Schäfer [18] proposed Bag-of-SFA-Symbols (BOSS), an advanced feature extraction algorithm
for time-series, which combines Fourier transformation on sliding windows with
Bag-of-words to extract the characteristic from the time-series. The authors have
shown that BOSS performed better than existing transformations [3,18].
In this work, we will consider t-distributed stochastic neighbor embedding (t-
SNE), a relatively new algorithm based on a completely different idea than other
classical algorithms like Principal Component Analysis (PCA), Non-negative
Matrix Factorization (NMF) and above-mentioned algorithms. t-SNE was first
introduced by van der Maaten and Hinton [12] as a dimensionality reduction algorithm for data
visualization. The t-SNE transforms data-points from the original space into a
new space (normally lower-dimensional space) such that the probability of two
data-points be in the same cluster in the original space is equal to the probability
that their transformations be in the same cluster in the new space. The t-SNE
problem can be formulated as follows. Let {a1 , . . . , aN }, ai ∈ RT be the data-
points in original space and doriginal be the similarity measure in original space.
t-SNE aims to find N points {x_1, . . . , x_N} in a new space R^D (T ≫ D) that preserve the pair-wise similarities of the a_i. The similarity of two points in the new space is measured by d_new. Let P ∈ R^{N×N} be the pairwise similarity matrix of the a_i, whose components p_{i,j} (i = 1, . . . , N; j = 1, . . . , N) are defined as p_{i,j} = (p_{i|j} + p_{j|i})/(2N), where p_{i|j} (resp. p_{j|i}) is the conditional probability that a_i (resp. a_j) would pick a_j (resp. a_i) as its neighbor. p_{j|i} is computed as

$$p_{j|i} = \frac{\exp(-d_{\mathrm{original}}(a_i, a_j)/2\sigma_i^2)}{\sum_{k \neq i} \exp(-d_{\mathrm{original}}(a_i, a_k)/2\sigma_i^2)}, \qquad (2)$$

where σ_i is the variance of the Gaussian with mean a_i. Similarly, we define Q := (q_{i,j})_{i,j=1,...,N}, the pairwise similarity matrix of the x_i in the new space, where

$$q_{i,j} = \frac{(1 + d_{\mathrm{new}}(x_i, x_j))^{-1}}{\sum_{k \neq l} (1 + d_{\mathrm{new}}(x_l, x_k))^{-1}}.$$

Then, t-SNE is formulated as the minimization of the Kullback–Leibler divergence (KL) between P and Q:

$$\min_{x_i \in \mathbb{R}^D,\, i \in \{1,\ldots,N\}} F(x) = \mathrm{KL}(P\|Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}. \qquad (3)$$

The problem (3) is a non-convex optimization problem for which several algo-
rithms have been developed [8,12,22].
t-SNE offers the liberty to choose the similarity measures doriginal (resp. dnew )
in original (resp. new space). In the original work [12], the authors applied t-SNE
to an application in data visualization where the number of dimensions in the
new space is low (2 or 3), with both doriginal and dnew being the Euclidean distance. In
our case, we consider doriginal as the DTW distance while the Euclidean distance

is chosen for dnew . On the one hand, it is obvious that doriginal should be DTW
distance since it is well adapted for time-series data as we have shown in Sub-
Section 2.1. On the other hand, the choice of Euclidean distance for dnew is
motivated by the existence of several efficient, scalable and robust clustering
algorithms based on DC (Difference of Convex programming) and DCA (DC
Algorithm) via the Euclidean distance (e.g. DCA-MSSC [6], DCA-KMSSC [7]).
For a complete study of DC programming and DCA the reader is referred to
[10,11,15,16] and references therein. To the best of our knowledge, this is the
first time t-SNE is used with DTW distance.
The time-series data are now transformed into a new space by t-SNE. In the
next Section, we will study some efficient clustering algorithms for the trans-
formed data.
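As an illustration of this transformation step only, a precomputed DTW distance matrix can be fed to an off-the-shelf t-SNE implementation. The sketch below uses scikit-learn as a stand-in for the DCA-based t-SNE solver of [8] that is actually employed here, so it approximates the pipeline rather than reproducing the authors' code:

import numpy as np
from sklearn.manifold import TSNE

def dtw_tsne_embedding(series_list, n_components=2, perplexity=30.0):
    # pairwise DTW distances in the original space (dtw_distance: see the sketch in Subsect. 2.1)
    n = len(series_list)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dtw_distance(series_list[i], series_list[j])
    # metric="precomputed" makes t-SNE use D as d_original; the embedding space uses the Euclidean distance
    return TSNE(n_components=n_components, metric="precomputed",
                init="random", perplexity=perplexity).fit_transform(D)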
Fast and scalable DCA based clustering algorithms As we have men-
tioned previously, in our problem, apart from the consumption curves, we do not
have any information on the clusters, not even their number. Finding the
number of clusters is challenging for clustering tasks. Generally, there exist two
approaches. The first approach consists in finding firstly the number of clusters
with a “simple” procedure and then applying a clustering algorithm with the number of
clusters found previously. In the second approach, one simultaneously determines
the number of clusters and clustering assignment.
In the literature, several algorithms have been developed for finding the num-
ber of clusters. For instance, Elbow algorithm which uses the WSS criterion
(“total within-cluster sum of square”) to determine the number of clusters. The
number k ∗ of clusters is optimal if the corresponding WSS does not change
significantly when increases the number of clusters by 1. Silhouette Average
algorithm is similar to Elbow algorithm. This algorithm varies the number of
clusters and chooses the one that maximizes the Silhouette criterion. Gap Statis-
tic algorithm [20] is another variant of Elbow algorithm. Gap Statistic algorithm
maximize “gap statistic” criterion, which is defined by the difference between the
measured WSS and its expected value under some conditions.
Assuming that the number of clusters is known, there exists a variety of
Euclidean-based clustering algorithms such as k-means, k-medoids, fuzzy c-
means, etc. Among many models for clustering, the Minimum Sum-of-Squares
(MSSC) is one of the most popular since it expresses both homogeneity and
separation. The MSSC problem is described as follows. Given a dataset X :=
(xi )i=1,...,N of N points in RD (i.e. xi ∈ RD are transformations of the time-
series ai in the new space) and a pre-defined number of clusters K. We aim to
to find K points U := (ui )i=1,...,K in RD , known as “centroid”, and assign each
data point in X to its closest centroid ui . MSSC minimizes the sum of squared
distances from the data-points to the centroid of their cluster, as presented in (4).
 N

1
min FMSSC (U ) = min ul − xi 22
(4)
U ∈RK×D 2 i=1 l=1,...,K

Le Thi et al. [6] have developed DCA-MSSC, an efficient algorithm based on


DC (Difference of Convex) programming and DCA (DC Algorithm) for solving

the MSSC model. DCA-MSSC has shown its superiority over state-of-the-art algorithms in terms of performance, robustness, and adaptation to different types of data. We refer to the original paper [6] for more details of the DCA-MSSC algo-
rithm.
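For reference, the objective of Eq. (4), which DCA-MSSC minimizes, can be evaluated as follows (a sketch of the criterion only; the DCA iterations themselves are described in [6] and are not reproduced here):

import numpy as np

def mssc_objective(X, U):
    # F_MSSC of Eq. (4): half the sum of squared distances from each point to its closest centroid;
    # X has shape (N, D), U has shape (K, D)
    d2 = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared Euclidean distances
    return 0.5 * d2.min(axis=1).sum()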
On the other hand, among the algorithms that simultaneously determine
the number of clusters and the clustering assignment, mclust [19] is a well-known
one. mclust uses the Gaussian Mixture Model. The optimal number of segments
K ∗ is determined by the Bayesian Information Criterion (BIC) and Integrated
Complete-data Likelihood (ICL). In a different direction, Le Thi et al. [9] have
proposed the DCA-Modularity algorithm. DCA-Modularity transforms the data-
points into a graph then segments vertices of the graph using the modularity cri-
terion as a measure of clustering quality. The problem of maximization of graph
modularity is summarized as follows. Consider an undirected unweighted net-
work G = (V, E) with N nodes (V = {1, . . . , N }) and M edges (M = Card(E)).
Denote by A the adjacency matrix: a_{i,j} = 1 if (i, j) ∈ E, and 0 otherwise. The degree of node i is denoted ω_i (ω_i = Σ_{j=1}^{N} a_{i,j}), and ω stands for the vector whose components are the ω_i. Let P be a partition of V, and K the number of communities in P. Define the binary assignment U = (u_{i,k})_{i=1,...,N; k=1,...,K}, where u_{i,k} = 1 if vertex i belongs to community k and 0 otherwise. Then, the modularity maximization problem can be written as

$$\max_{U} \; Q(U) := \frac{1}{2M}\sum_{i,j=1}^{N} b_{i,j} \sum_{k=1}^{K} u_{i,k} u_{j,k}, \qquad (5)$$
$$\text{s.t.} \;\; \sum_{k=1}^{K} u_{i,k} = 1 \;\text{ for } i = 1, \ldots, N; \qquad u_{i,k} \in \{0, 1\} \;\text{ for } i = 1, \ldots, N,\; k = 1, \ldots, K;$$

where $B := (b_{i,j})_{i,j=1,\ldots,N} = A - \frac{1}{2M}\,\omega\omega^{T}$ is a constant matrix, called the modularity matrix. The problem (5) is a mixed-binary optimization problem for which
Le Thi et al. [9] have proposed DCA-Modularity. Le Thi et al. [9] have proved
that DCA-Modularity is able to give the right number of clusters as well as
a good clustering assignment on several benchmark datasets. The readers are
referred to the original paper [9] for more details of DCA-Modularity algorithm.
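For reference, the modularity matrix and the criterion Q(U) of Eq. (5) can be evaluated directly from an adjacency matrix and a hard community assignment (again a sketch of the criterion only, not of the DCA-Modularity optimization of [9]):

import numpy as np

def modularity(A, labels):
    # Q(U) of Eq. (5) for an adjacency matrix A (N x N) and integer community labels (length N)
    omega = A.sum(axis=1)                          # node degrees omega_i
    M = A.sum() / 2.0                              # number of edges
    B = A - np.outer(omega, omega) / (2.0 * M)     # modularity matrix B = A - (1/2M) * omega omega^T
    same = labels[:, None] == labels[None, :]      # equals sum_k u_{i,k} u_{j,k}
    return (B * same).sum() / (2.0 * M)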
Motivated by the success of DCA-MSSC and DCA-Modularity, we will adopt
both of them for our clustering method. Precisely, DCA-Modularity will be used
to determine the number of clusters and a good starting clustering assignment.
Based on the clustering assignment given by DCA-Modularity, DCA-MSSC focuses
on improving it. This combination allows us to take
advantage of both DCA-MSSC and DCA-Modularity. According to all above in-
depth studies, we are going to describe below our solution method for clustering
of RTE’s customers based on their electricity consumption.

2.3 Description of the Main Algorithm


Our proposed method (c.f. Fig. 2) consists of several steps: data processing, data
transformation and clustering. The first step, data processing, deals with the
noisy and outlier data caused by erroneous measurement at electric meter. In the

second step, data transformation, we transform the high-dimensional time-series


data into a lower dimensional space using the t-SNE algorithm with DTW dis-
tance. As for the clustering algorithm, we combine the DCA-Modularity [9] and
DCA-MSSC [6]. DCA-Modularity [9] is used to compute the number of clusters
as well as a good starting point for the clustering algorithm DCA-MSSC. Since
we are interested in detecting similar patterns of time-series data, the transfor-
mation must be scaling and translation invariant. Hence, the z-normalization [5]
is employed for normalizing time-series data. The effectiveness of this well-known
transformation has been proved in several works [5,14,17]. Given a time-series
a of length T_a: (a_t)_{t=1,...,T_a}, the z-normalization of a is computed as

$$a_t^{\mathrm{norm}} = \frac{a_t - \bar{a}}{\sigma(a)} \quad \text{for } t \in \{1, \ldots, T_a\}, \qquad (6)$$

where $\bar{a}$ and $\sigma(a)$ are the mean and standard deviation of a: $\bar{a} = \frac{1}{T_a}\sum_{t=1}^{T_a} a_t$ and $\sigma(a) = \sqrt{\frac{1}{T_a}\sum_{t=1}^{T_a} (a_t - \bar{a})^2}$.

Fig. 2. The proposed time-series clustering method’s pipeline.
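The z-normalization of Eq. (6) is a one-liner, e.g.:

import numpy as np

def z_normalize(a):
    # Eq. (6): subtract the series mean and divide by its standard deviation
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()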

The proposed method for customer clustering is summarized in Algorithm 1.

3 Numerical Experiments
Our experiments are realized on a dataset which contains the electricity consumption curves of 462 RTE customers. For confidentiality reasons, each client is identified by a randomly generated number. Each consumption curve contains the electricity consumption at 10-minute intervals over two years (from 01 January 2016
to 31 December 2017).
The code was written in C# 4.7.1. All experiments are conducted on an
Intel(R) Xeon(R) E5-2630v4 (40 CPUs) with 32 GB of RAM.
Experiment 1: We first analyze the relevance of our clustering result. For this
purpose, we perform Algorithm 1 on the whole dataset. It is worth noting
that the total computation time of our method is only 42 minutes. It comes out that
the number of clusters determined by our algorithm is 19. In Fig. 3, we report
the number of customers in each cluster.
Due to the limitation of the paper's length, we only analyze the results from three
clusters. The choice is solely based on the number of customers in each cluster:
the biggest cluster (cluster C7), followed by a medium-size one (cluster C3, which

Algorithm 1 Proposed algorithm for clustering time-series


Input: N time-series a1 , . . . , aN .
Output: Clustering assignment p∗ = (p∗i )i=1,...,N where p∗i is the cluster of ai .
Step 1: Z-Normalization transformation.
Input: N time-series a1 , . . . , aN in RT .
Output: N normalized time-series a¯1 , . . . , a¯N in RT .
Step 2: t-SNE DTW-Euclidean transformation.
Input: N normalized time-series a¯1 , . . . , a¯N in RT .
Output: N vectors x1 , . . . , xN in new space RD where xi is the corresponding
transformation of ai in the new space.
Step 3: Clustering algorithm
Step 3.1: DCA-Modularity clustering [9].
Input: N vectors x1 , . . . , xN in RD .
Output: The number of clusters K ∗ , a clustering assignment p0 = (p0i )i=1,...,N .
Step 3.2: DCA-MSSC clustering [6].
Input: N vectors x1 , . . . , xN in RD , the number of clusters K ∗ , the clustering
assignment p0 = (p0i )i=1,...,N .
Output: Clustering assignment p∗ := (p∗i )i=1,...,N .

Fig. 3. Number of customers by clusters.

is 25% smaller than cluster C7) and one small-size cluster (cluster C17, which
is the 4th smallest cluster). The consumption curves of four arbitrarily chosen
customers from each cluster are presented in Fig. 4 (cluster C3), Fig. 5 (cluster
C7) and Fig. 6 (cluster C17).
We observe that customers in each cluster clearly have similar shapes. For
cluster C7, customers tend to have a “regular” consumption pattern: the con-
sumption is high and followed by a short “drop”, i.e. the consumption suddenly
tumbles to a small value (in comparison with the consumption level of the previous
period), which repeats during the year. Further analysis reveals that this is a
typical “weekly” consumption pattern, where the drops often happen during the
weekend. In addition, they also have a long drop in consumption for 2–3 weeks
around the middle of August, and a shorter drop at the end of the year. For
cluster C17, as we can see, all four customers have a low electricity consump-
tion (all are around 0), despite the differences in their maximum consumption.
They frequently generate very high peaks in a short duration during the whole
year. Customers in cluster C3 have a stable consumption during the whole year
(mostly varies around a base) and rarely have long “drop” during the year. They

Fig. 4. Some consumption curves of customers from cluster C3.

Fig. 5. Some consumption curves of customers from cluster C7.

often have short drops in consumption (as opposed to the short “peaks” in cluster
C17).
From the Figs. 4, 5 and 6, we can conclude that (1) the consumption curve of
customers in the same clusters are coherent and (2) the differences of customers
between clusters are quite clear.
Experiment 2: In this experiment, we are interested in the capacity of our
method to detect if a customer changes his way of consuming. For this purpose,
we apply a Sliding Window technique with slide duration of four weeks, thus we
obtain 13 different one-year-window datasets. Each one-year-window dataset is
processed by Algorithm 1.
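A sketch of this windowing step is given below, assuming the consumption curve is available as a pandas Series indexed by timestamps (the variable names are illustrative):

import pandas as pd

def one_year_windows(curve, start, n_windows=13, slide_weeks=4):
    # cut a long consumption curve into one-year windows, each slid by four weeks;
    # every window is then clustered independently with Algorithm 1
    windows = []
    for w in range(n_windows):
        t0 = pd.Timestamp(start) + pd.Timedelta(weeks=slide_weeks * w)
        t1 = t0 + pd.Timedelta(days=365)
        windows.append(curve.loc[t0:t1])
    return windows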
We now analyze a customer whose consumption behavior changes over time.
Consider the case of customer 10010. As we can see in Fig. 7, this customer
has a sharp drop in consumption in August 2015 while there is a much smaller
decline in August 2016. This customer has therefore clearly changed his mode of
consumption. In Fig. 8, we show the clusters to which customer 10010 belongs
during the 12 monthly runs of our segmentation algorithm. We see that up to
the 12/08/2016, customer 10010 belongs to cluster C2. As it was detected by
our algorithm, this customer changes his mode of consumption in August 2016.
By 09/09/2016, customer 10010 is assigned to a new cluster (cluster C7).
In Figs. 9 and 10, we show some customers in cluster C2 and cluster C7. We
observe the similarities between the load curve of customer 10010 from August
12th, 2015 to August 11th, 2016 (Fig. 7a) and those of customers in cluster C2
(Fig. 9). This same remark is valid between customer 10010’s consumption curve

Fig. 6. Some consumption curves of customers from cluster C17.



(a) August 12th, 2016 (b) September 09th, 2016

Fig. 7. The consumption of customer 10010 from 12/08/2015 to 11/08/2016 (left figure) and from 09/09/2015 to 08/09/2016 (right figure).

Fig. 8. The cluster of customer 10010 during nine runs. The first four runs’ results
are also C2, thus they are cropped out for visibility reasons. The date (i.e. 22/04/2016)
represents the starting date of the one-year window.

Fig. 9. Some customers from cluster C2 on 12/08/2016.

Fig. 10. Some customers from cluster C7 on 09/09/2016.

from September 9th, 2015 to September 8th, 2016 (Fig. 7b) and other customers
of cluster C7 (Fig. 10).

4 Conclusion
This approach is the result of in-depth studies using advanced theoretical and
algorithmic tools for large-scale time-series data clustering. We have efficiently
tackled the three challenges for our time-series clustering task: the similarity
distance measure, the clustering algorithm, and the Big data. The innovative
character intervenes in all stages of the proposed approach: the data transfor-
mation via t-SNE with the DTW measure in the original data space and the

Euclidean distance in the transformed space is original. Indeed, on one hand,


the DTW measure is appropriate to time-series clustering for pattern discov-
ery. In another hand, the use of Euclidean distance in the transformed space
allows us to investigate efficient clustering algorithms based on this distance.
This transformation is very efficient in confronting the high-dimensional data. For
the first time the DTW distance is considered in the t-SNE transformation model
and the resulting DCA scheme is particularly effective. Last but not least, the
combination of two powerful clustering algorithms - DCA-Modularity and DCA-
MSSC - is interesting, which gives rise to a performant method of clustering. It
goes without saying that such results are possible thanks to DC programming
and DCA [10,11,15,16], powerful tools of nonconvex optimization and machine
learning.
From a business point of view, this clustering has led to results that are
already interpretable and useful for RTE. The possibility of monitoring poten-
tial customer developments is of particular interest to customer relationships
managers, who can thus offer customized services.

Acknowledgment. This research is part of the project “Smart Marketing” funded


by RTE in collaboration with Computer Science and Applications Department,
LGIPM, University of Lorraine, France.
The authors would like to thank Mr Romain Gemignani for his contributions to the
starting step of the project. We thank also Dr Duy Nhat Phan for his discussion on
the use of t-SNE transformation.

References
1. Aach, J., Church, G.M.: Aligning gene expression time series with time warping
algorithms. Bioinformatics 17(6), 495–508 (2001)
2. Aghabozorgi, S., Seyed Shirkhorshidi, A., Ying Wah, T.: Time-series clustering -
a decade review. Inf. Syst. 53, 16–38 (2015)
3. Bagnall, A., Lines, J., Bostrom, A., Large, J., Keogh, E.: The great time series
classification bake off: a review and experimental evaluation of recent algorithmic
advances. Data Min. Knowl. Discov. 31(3), 606–660 (2017)
4. Chu, S., Keogh, E., Hart, D., Pazzani, M.: Iterative deepening dynamic time warp-
ing for time series. In: Proceedings of the 2002 SIAM International Conference on
Data Mining, pp. 195–212. SIAM (2002)
5. Goldin, D.Q., Kanellakis, P.C.: On similarity queries for time-series data: Con-
straint specification and implementation. In: Montanari, U., Rossi, F. (eds.) Prin-
ciples and Practice of Constraint Programming – CP 1995. Lecture Notes in Com-
puter Science, pp. 137–153. Springer, Heidelberg (1995)
6. Le Thi, H.A., Belghiti, M.T., Pham Dinh, T.: A new efficient algorithm based on
DC programming and DCA for clustering. J. Glob. Optim. 37(4), 593–608 (2007)
7. Le Thi, H.A., Le, H.M., Pham, D.T.: New and efficient DCA based algorithms for
minimum sum-of-squares clustering. Pattern Recognit. 47(1), 388–401 (2014)
8. Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B.: A DCA-Like Algorithm and its
Accelerated Version with Application in Data Visualization. arXiv:1806.09620 [Cs,
Math]. p. 8 (2018)

9. Le Thi, H.A., Nguyen, M.C., Pham Dinh, T.: A DC programming approach for
finding communities in networks. Neural Comput. 26(12), 2827–2854 (2014)
10. Le Thi, H.A., Pham, D.T.: The DC (Difference of Convex Functions) programming
and DCA revisited with DC models of real world nonconvex optimization problems.
Ann. Oper. Res. 133(1), 23–46 (2005)
11. Le Thi, H.A., Pham Dinh, T.: DC programming and DCA: thirty years of devel-
opments. Math. Program. 169(1), 5–68 (2018)
12. Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn.
Res. 9, 2579–2605 (2008)
13. Meinard, M.: Dynamic time warping. In: Information Retrieval for Music and
Motion, pp. 69–84. Springer, Heidelberg (2007)
14. Paparrizos, J., Gravano, L.: K-Shape: efficient and accurate clustering of time
series. In: Proceedings of the 2015 ACM SIGMOD International Conference on
Management of Data, pp. 1855–1870. ACM Press (2015)
15. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming:
theory, algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)
16. Pham Dinh, T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-
region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
17. Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q.,
Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences
under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD Interna-
tional Conference on Knowledge Discovery and Data Mining - KDD 2012, p. 262.
ACM Press, Beijing, China (2012)
18. Schäfer, P.: The BOSS is concerned with time series classification in the presence
of noise. Data Min. Knowl. Discov. 29(6), 1505–1530 (2015)
19. Scrucca, L., Fop, M., Murphy, T.B., Raftery, A.E.: Mclust 5: clustering, classi-
fication and density estimation using gaussian finite mixture models. R J. 8, 29
(2016)
20. Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a data
set via the gap statistic. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 63(2), 411–423
(2001)
21. Warren Liao, T.: Clustering of time series data–a survey. Pattern Recognit. 38(11),
1857–1874 (2005)
22. Yang, Z., Peltonen, J., Kaski, S.: Majorization-minimization for manifold embed-
ding. In: Artificial Intelligence and Statistics, pp. 1088–1097 (2015)
23. Yi, B.K., Faloutsos, C.: Fast time sequence indexing for arbitrary Lp norms. In:
VLDB, vol. 385, p. 99 (2000)
Data-Driven Beetle Antennae Search
Algorithm for Electrical Power Modeling
of a Combined Cycle Power Plant

Tamal Ghosh¹(&), Kristian Martinsen¹, and Pranab K Dan²
¹ IVB, Norwegian University of Science and Technology, 2815 Gjøvik, Norway
{tamal.ghosh,kristian.martinsen}@ntnu.no
² RMSoEE, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
pkdan@see.iitkgp.ernet.in

Abstract. Beetle Antennae Search (BAS) is a newly developed nature-inspired


algorithm, which falls in the class of single-solution driven metaheuristic
techniques. This algorithm mimics the searching behavior of the longhorn
beetles for food or potential mate using their long antennae. This algorithm is
potentially effective in achieving global best solutions promptly. An attempt is
made in this paper to implement the data-driven BAS, which exploits the
Cascade Feed-Forward Neural Network (CFNN) training for functional
approximation. The proposed technique is utilized to model the electrical power
output of a Combined Cycle Power Plant (CCPP). The power output of a power
plant could be dependent on four input parameters, such as Ambient Temper-
ature (AT), Exhaust Vacuum (V), Atmospheric Pressure (AP), and Relative
Humidity (RH). These parameters affect the electrical power output, which is
considered as the target variable. The CFNN based predictive model is shown to
perform equivalently while compared with published machine learning based
regression methods. The proposed data-driven BAS algorithm is effective in
producing optimal electric power output for the CCPP.

Keywords: Beetle antennae search algorithm · Artificial neural network · Data-driven modeling · Combined cycle power plant · Surrogate based optimization

1 Introduction

Beetle Antennae Search (BAS) is a recently proposed technique, which is inspired by


the odour sensing mechanism of beetles using their long antennae [1]. The longhorn beetle family is substantially large (about 26,000 species). The antennae work as sensors with a complex mechanism. Their fundamental functions are to follow the smell of food or to sense the pheromones produced by potential mates. The beetle moves its antennae in a particular direction to sense the smell of food or mates; this movement is random within the neighborhood area and directed according to the concentration of the smell. Hence, the beetle turns right or left depending on which antenna senses the higher odour concentration. This phenomenon is depicted in Fig. 1, and the BAS algorithm can be summarized on its basis.

Fig. 1. Beetle Search procedure based on odour sensing mechanism using antennae

1.1 Data-Driven Optimization


Data-driven optimization has recently evolved as a novel category of optimization techniques. It exploits small amounts of empirical data to approximate objective functions and eliminates the need for computationally complex mathematical expressions or expensive experimental runs. It facilitates the use of traditional or existing optimization algorithms, such as exact methods and evolutionary or bio-inspired algorithms, as optimal-solution search modules. It uses different types of prediction or regression tools as surrogate models, such as Artificial Neural Network
(ANN), Response Surface Method (RSM), Radial Basis Function (RBF), Kriging
Model, Support Vector Machine (SVM), and Decision tree etc. Data-driven modeling
is also categorized as the black-box modeling when there is little or no information
available about the processes [2, 3]. These techniques are capable of estimating
functional relationships among process variables based on the sampled data obtained
using Design of Experiment (DOE) techniques [4, 5]. Accuracy of the solution
approximation would be crucial while training the surrogate models. Mean Square
Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE) could be
used as performance measures. The lower the performance metric score, the better the
accuracy of the model. Once the data-driven surrogate model is trained, an appropriate
optimization algorithm, e.g. Genetic Algorithms (GA), Particle Swarm Optimization
(PSO), Ant Colony Optimization (ACO), Bat Inspired Algorithm (BA), and African
Buffalo Optimization (ABO), etc., could be employed as the solution-search technique, yielding near-optimal results [6]. Surrogate models are fast and efficient, and therefore computationally inexpensive.

1.2 Combined Cycle Power Plant (CCPP)


A Combined Cycle Power Plant (CCPP) consists of the gas turbines (GT), the steam
turbines (ST) and the heat recovery steam generators (HRSG). In this system, gas and
steam turbines are the generators of electric power in every cycle. This electric power is
then transferred from one turbine to another [7]. Notable amount of hot exhaust is
produced along with the electrical power by gas turbine in the CCPP. This waste heat is
channelized further through a water-cooled heat exchanger to generate the steam. This
steam could be further processed through a steam turbine and a generator to obtain the
additional electric power. This system is one of the finest examples of the waste
recycling systems. Such power plants are growing in number and have recently become a topic of interest to researchers. Reference [8] considered one such CCPP and collected data from it. The CCPP layout is
portrayed in Fig. 2.
Major process variables of a CCPP are, the ambient temperature (AT), atmospheric
pressure (AP), and relative humidity (RH) for the gas turbine and the exhaust vacuum
(V) for the steam turbine. These parameters are considered as input in the CCPP
dataset. The dataset is available in the UCI Machine Learning Repository (https://
archive.ics.uci.edu/ml/datasets/combined+cycle+power+plant). The produced electrical
power from the CCPP is considered as output response. Possible ranges of the input
and response variables are provided as, AT (1.81–37.11 °C), V (25.36–81.56 cm Hg),
AP (992.89–1033.30 mbar), RH (25.56–100.16%), PE (420.26–49576 MW). The data
file consists of 9568 sample data points spread over six years (2006–2011). Table 1
portrays the statistics of the data.

Table 1. Statistics of the input and output data


AT V AP RH PE
Min 1.81 25.36 992.89 25.56 420.26
Max 37.11 81.56 1033.3 100.16 495.76
Mean 19.65123 54.3058 1013.259 73.30898 454.365
Std. Dev. 7.452473 12.70789 5.938784 14.60027 17.067
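For reference, the dataset and the statistics of Table 1 can be reproduced in a few lines of Python; the spreadsheet name below refers to the file contained in the repository download and is an assumption about the local copy:

import pandas as pd

data = pd.read_excel("Folds5x2_pp.xlsx")                   # columns: AT, V, AP, RH, PE
print(data.describe().loc[["min", "max", "mean", "std"]])  # reproduces the rows of Table 1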

2 Research Methodology

In this study, the focus is put on the Multi-Layer Perceptron (MLP) network, which is
used as the fitness evaluating function for BAS. The MLP networks are suitable for
predictive modeling because of their natural ability of finding the correlations among
the random inputs and outputs [9]. MLPs are classified in two categories, (1) Cascade
Feed-Forward Neural Network (CFNN) and (2) Feed-Forward Neural Network
(FFNN). Unlike CFNN, the FFNN does not have any direct connection between inputs
and outputs.

Fig. 2. CCPP Layout [8]



2.1 Cascade Feed-Forward Neural Network (CFNN)


The MLP variant used in this study, is known as CFNN. The CFNN architecture is
shown in Fig. 3.

Fig. 3. CFNN schematic diagram

It has some direct connections between the inputs and the outputs. It has n input neurons, m hidden-layer neurons, and output neurons. The output equation is

$$y_i = Z_i^{k}\left(\sum_{k=1}^{n} w_{kj}\, x_k\right) + Z_i^{oa}\left(\sum_{j=1}^{m} w_{ji}^{oa} \cdot Z_k^{ha}\left(\sum_{k=1}^{n} w_{jk}^{ha}\, x_k\right)\right) \qquad (1)$$

Where Zoa oa
i is denoted as activation function for ith output yi, wji is the weight from jth
ha
hidden layer neuron to ith output node, Zk is the activation function for jth hidden
layer neuron, whajk is the weight from the kth input to the jth hidden layer neuron, and xk
is the kth input signal. Zki is the activation function and wkj is the weight from the inputs
to outputs. Further, if some bias is added to the input layer, the Eq. (1) becomes,
!!
X
n X
m X
n
yi ¼ Zik  wkj xk þ Zioa bi þ woa
ji  Zkha bj þ wha
jk xk ð2Þ
k¼1 j¼1 k¼1

Where bi is the weight from the bias to the ith output layer neuron and bj is the weight
from the bias to the jth hidden layer neuron. Zki is the activation function and wkj is the
weight from the inputs to outputs. The network weight in CFNN is approximated based
on the neurons in the input layer.

2.2 Performance Measure


The Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are utilized as
the performance evaluation metrics for the trained CFNN model. The MAE and RMSE
are improved metrics which accurately measure the regression errors [10]. Let y denote the model output response, t the target response, and i the index of the data sample. The MAE and RMSE are calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i} (y_i - t_i)^2} \qquad (3)$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i} |y_i - t_i| \qquad (4)$$

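Both metrics are one-liners, e.g.:

import numpy as np

def rmse(y, t):
    return np.sqrt(np.mean((np.asarray(y) - np.asarray(t)) ** 2))   # Eq. (3)

def mae(y, t):
    return np.mean(np.abs(np.asarray(y) - np.asarray(t)))           # Eq. (4)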
2.3 BAS Algorithm


According to Ref. [11], the BAS algorithm is demonstrated as,

Algorithm 1: BAS algorithm for global minimum searching

Input: an objective function f(x_t) with variable x_t = [x1, …, xi]^T; initialize the parameters x0, d0, δ0.
Output: xbst, fbst.

while (t < Tmax) or (stopping condition) do
    Generate the unit direction vector according to Eq. (5);
    Search the variable space with the two antennae according to Eqs. (6) and (7);
    Update the state variable x_t according to Eq. (8);
    if f(x_t) < fbst then
        fbst = f(x_t), xbst = x_t;
    Update the sensing diameter d and the step size δ using Eqs. (9) and (10);
return xbst, fbst.

The BAS algorithm is a single solution based metaheuristic technique which is similar
to the Simulated Annealing (SA) algorithm. The BAS starts with a randomly generated
beetle with the position vector xt at tth time instant (t = 1, 2, …, n) and the position is
evaluated using some fitness function f which determines the smell or odour concen-
tration [12]. The beetle would decide to move further based on the smell concentration
by generating the next promising position for it to move. This next position would be
obtained in the neighborhood of the previous position by following some rules. These

rules are derived from the behavior of the beetle, which includes exploring and
exploiting behavior. The directional move is determined by,

$$\vec{b} = \frac{\mathrm{rnd}(k, 1)}{\|\mathrm{rnd}(k, 1)\|} \qquad (5)$$

where rnd is a random function and k signifies the number of input dimensions of the beetle position. Exploration is performed on the right (x_r) and left (x_l) sides, just as the beetle searches with its two antennae. The moves can be written as

$$x_r = x_t + d^{t} \cdot \vec{b} \qquad (6)$$

$$x_l = x_t - d^{t} \cdot \vec{b} \qquad (7)$$

where d is the sensing length of the antennae, which governs the exploiting ability. The value of d must be large enough to cover the solution space; this helps the algorithm escape local optima and improves the convergence speed. Secondly, to formulate the detecting behavior, the following iterative model is used:

$$x_t = x_{t-1} + \delta^{t}\, \vec{b}\, \mathrm{sign}(f(x_r) - f(x_l)) \qquad (8)$$

where δ is the step size of the exploring mechanism, which follows a decreasing function of t, and sign(·) is the sign function. The update rules for the antennae length d and the step size δ are

$$d^{t} = 0.95\, d^{t-1} + 0.01 \qquad (9)$$

$$\delta^{t} = 0.95\, \delta^{t-1} \qquad (10)$$

The proposed data-driven BAS framework is depicted in Fig. 4. It shows that the
CFNN model is used as a surrogate model to the BAS algorithm, which can evaluate
the candidate solutions effectively. The CCPP module of this framework, is used for
collecting data.
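A compact Python sketch of the search loop defined by Eqs. (5)–(10) is given below. In the data-driven variant of Fig. 4, f would be the trained CFNN surrogate rather than an analytic objective; the minus sign in the position update realizes minimization, i.e. the beetle moves towards the antenna that senses the lower objective value:

import numpy as np

def bas_minimize(f, x0, d0=1.0, delta0=0.8, t_max=500, seed=None):
    # Beetle Antennae Search following Eqs. (5)-(10), written for minimization
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d, delta = d0, delta0
    x_best, f_best = x.copy(), f(x)
    for _ in range(t_max):
        b = rng.standard_normal(x.size)
        b /= np.linalg.norm(b)                        # Eq. (5): random unit direction
        xr, xl = x + d * b, x - d * b                 # Eqs. (6)-(7): right and left antennae
        x = x - delta * b * np.sign(f(xr) - f(xl))    # Eq. (8)-type update towards the better side
        fx = f(x)
        if fx < f_best:                               # keep the best solution found so far
            x_best, f_best = x.copy(), fx
        d = 0.95 * d + 0.01                           # Eq. (9): shrink the sensing length
        delta *= 0.95                                 # Eq. (10): shrink the step size
    return x_best, f_best

For the CCPP case, x would collect the four process parameters [AT, V, AP, RH] (possibly clipped to the ranges of Table 1), and one natural choice is f(x) = −PE_predicted(x), since the electrical power output is to be maximized.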

3 Computational Results and Analysis

To validate the proposed data-driven BAS algorithm, the CCPP dataset is used from the
UCI Machine Learning Repository, which is portrayed in Subsect. 1.2. This data is
divided 70:30 into training and testing sets. Thereafter, 100 data points are randomly
picked for the validation purpose. Levenberg-Marquardt backpropagation is used for
the network training. Parameters for the CFNN are set as, learning rate = 0.1, error
goal = 1e-7, and number of epochs = 1000. The convergence property of the data-

driven BAS algorithm is demonstrated in Fig. 5 for 500 generations. The parameters of
the BAS algorithm are set as d0 = 0.001 and δ0 = 0.8. It could be observed that the
proposed algorithm achieves global optimal solution promptly. The best solution
obtained is fbst [PE = 498.2971] with the optimal parameters set = [AT = 7.9415,
V = 42.92045, AP = 1003.937, RH = 51.678] at t = 253. This solution shows better electric
power output than the best result portrayed in the dataset [AT = 5.48, V = 40.07,
AP = 1019.63, RH = 65.62, PE = 495.76].

Fig. 4. CFNN Assisted Data-Driven BAS framework

Based on the data considered in Ref. [6], input variables are divided into 4 subsets
and the CFNN based predictive model is tested on these. The obtained MAE and
RMSE scores are compared with seven different regression models published previ-
ously. Results are depicted in Table 2. It could be observed that the CFNN model is
clearly better than the published six models out of seven except the one. The CFNN
scores are very close to the best published results with low variance scores overall.

Table 2. Comparison among the CFNN prediction model and published methods [6]
AT AT-V AT-V-AP AT-V-AP-RH Mean Variance
MAE RMSE MAE RMSE MAE RMSE MAE RMSE MAE RMSE MAE RMSE
CFNN 3.92 5.08 3.24 4.25 2.98 4.01 2.92 3.89 3.27 4.31 0.21 0.29
LMS 4.28 5.43 3.91 4.97 3.62 4.58 3.62 4.57 3.86 4.89 0.10 0.17
SMO 4.28 5.43 3.91 4.97 3.62 4.58 3.62 4.56 3.86 4.89 0.10 0.17
K* 4.26 5.38 3.63 4.63 3.36 4.33 2.88 3.86 3.53 4.55 0.33 0.41
BREP 4.07 5.21 3.03 4.03 2.95 3.93 2.82 3.79 3.22 4.24 0.33 0.43
M5R 3.98 5.08 3.42 4.42 3.26 4.22 3.17 4.13 3.46 4.46 0.13 0.19
M5P 3.98 5.09 3.36 4.36 3.23 4.18 3.14 4.09 3.43 4.43 0.14 0.21
REP 4.09 5.23 3.26 4.34 3.21 4.29 3.13 4.21 3.42 4.52 0.20 0.23

Figure 6 portrays the CFNN regression plots (with R-Values) and the scatter plot of
the CFNN predictive model. The prediction result for electrical power output of a CCPP
is substantially accurate. Due to this accuracy and the high R-values obtained during CFNN
training, cross-validation is not performed. This is demonstrated based on the actual PE
values and estimated PE values (for the 100 data points obtained randomly from the
dataset). It is observed from Table 2, that the CFNN obtains (MAE = 2.919 and
RMSE = 3.895) scores for the subset of all four parameters and very close to the BREP
scores (MAE = 2.818 and RMSE = 3.787). Therefore, this could be concluded that the
CFNN based predictive model is an efficient tool for electric power prediction in the
CCPP. This could be further used as a tool to forecast the accurate power output for the
next hours or days for the CCPP. Thereafter, the BAS algorithm is depicted as an efficient
data-driven optimization tool, which could select the right set of the process parameters
and optimal level of the electric power output. This approach could be employed to
increase the efficiency of CCPP. This further proves that the BAS is capable of achieving
the near-optimal solution even when the specific objective function is not available and
the process is solely dependent on the empirical process data.

Fig. 5. Convergence plot for data-driven BAS Algorithm

Fig. 6. Regression plots and curve fitting plots (Scatter plot) of four parameter subset by CFNN

4 Conclusions

This article proposes a novel data-driven CFNN assisted BAS algorithm for optimal
power output of the CCPP. The BAS is a latest metaheuristic algorithm in the category
of the single solution based metaheuristics, which mimics the searching behavior of the
longhorn beetles. The CFNN network is used as the predictive model for output
approximation for the CCPP. The proposed technique is successfully tested on the
CCPP dataset published in the UCI Machine Learning Repository. The conclusions are: (1) the CFNN model is competitive and can produce outputs with very low MAE and RMSE scores; (2) the BAS algorithm is substantially efficient and capable of producing optimal parameter sets and outputs for the CCPP; and (3) the CFNN-assisted BAS produces accurate next-hour/day/month predictions and enhances the efficiency of the CCPP. This technique could be further extended to various engineering process modelling tasks and could be compared with other established metaheuristics in the future.

References
1. Jiang, X., Li, S.: BAS: beetle antennae search algorithm for optimization problems (2017).
arXiv:1710.10724 [cs.NE]
2. Simpson, T.W., Toropov, V., Balabanov, V., Viana, F.A.C.: Design and analysis of
computer experiments in multidisciplinary design optimization: a review of how far we have
come–or not. In: 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization
Conference (2008)
3. Beykal, B., Boukouvala, F., Floudas, C.A., Pistikopoulos, E.N.: Optimal design of energy
systems using constrained grey-box multi-objective optimization. Comput. Chem. Eng. 116,
488–502 (2018)
4. Garud, S.S., Karimi, I.A., Kraft, M.: Design of computer experiments: a review. Comput.
Chem. Eng. 106, 71–95 (2017)
5. An, Y., Lu, W., Cheng, W.: Surrogate model application to the identification of optimal
groundwater exploitation scheme based on regression kriging method-a case study of
Western Jilin Province. Int. J. Environ. Res. Public Health 12(8), 8897–8918 (2015)
6. Messac, A.: Optimization in Practice with MATLAB. Cambridge University Press, NY,
USA (2015)
7. Niu, L. X.: Multivariable generalized predictive scheme for gas turbine control in combined
cycle power plant. In: IEEE Conference on Cybernetics and Intelligent Systems (2009)
8. Tüfekci, P.: Prediction of full load electrical power output of a base load operated combined
cycle power plant using machine learning methods. Electr. Power Energy Syst. 60, 126–140
(2014)
9. Arnaiz-González, Á., Fernández-Valdivielso, A., Bustillo, A., De Lacalle, L.N.L.: Using
artificial neural networks for the prediction of dimensional error on inclined surfaces
manufactured by ball-end milling. Int. J. Adv. Manuf. Technol. 83, 847–859 (2016)
10. Willmott, C.J.: On the Validation of Models. Phys. Geogr. 2(2), 184–194 (1981)
11. Zhu, Z., Zhang, Z., Man, W., Tong, X., Qiu, J., Li, F.: A new beetle antennae search
algorithm for multi-objective energy management in microgrid. In: 13th IEEE Conference
on Industrial Electronics and Applications (ICIEA), pp. 1599–1603 (2018)
12. Wang, J., Chen, H.: BSAS: beetle swarm antennae search algorithm for optimization
problems (2018). arXiv:1807.10470 [cs.NE]
Finding Global-Optimal Gearbox Designs
for Battery Electric Vehicles

Philipp Leise(B) , Lena C. Altherr, Nicolai Simon, and Peter F. Pelz

Chair of Fluid Systems, TU Darmstadt, Otto-Berndt-Str. 2,


64287 Darmstadt, Germany
{philipp.leise,lena.altherr,peter.pelz}@fst.tu-darmstadt.de,
nicolai.simon@stud.tu-darmstadt.de

Abstract. In order to maximize the possible travel distance of battery


electric vehicles with one battery charge, it is mandatory to adjust all
components of the powertrain carefully to each other. While current vehi-
cle designs mostly simplify the powertrain rigorously and use an electric
motor in combination with a gearbox with only one fixed transmission
ratio, the use of multi-gear systems has great potential. First, a multi-
speed system is able to improve the overall energy efficiency. Secondly,
it is able to reduce the maximum momentum and therefore to reduce
the maximum current provided by the traction battery, which results
in a longer battery lifetime. In this paper, we present a systematic way
to generate multi-gear gearbox designs that—combined with a certain
electric motor—lead to the most efficient fulfillment of predefined load
scenarios and are at the same time robust to uncertainties in the load.
Therefore, we model the electric motor and the gearbox within a Mixed-
Integer Nonlinear Program, and optimize the efficiency of the mechanical
parts of the powertrain. By combining this mathematical optimization
program with an unsupervised machine learning algorithm, we are able
to derive global-optimal gearbox designs for practically relevant momen-
tum and speed requirements.

Keywords: Powertrain · Gearbox · Optimization · BEV · WLTP ·


MINLP · Gaussian mixture model · Piecewise linearization

1 Introduction

Battery electric vehicles (BEV) become more and more important. The major
drawback of BEV in comparison to cars with an internal combustion engine
is still a shorter travel distance [9]. Therefore it is important to increase the
overall efficiency of the complete powertrain, i.e. of all vehicle components used
to transform stored electric energy into kinetic energy. This includes the engine
(electric motor) and the drivetrain, consisting of the transmission (gearbox), the
drive shafts, the differential, and the final drive (drive wheels). In our paper,
we focus on the interplay between the electric motor and the gearbox. The use

of gearboxes with multiple gears is promising, as it may increase the efficiency


of the powertrain. Another major advantage of using a multi-gear system is a
reduced maximum momentum, which results in lower currents drawn from the
traction battery. Therefore, automotive experts expect the usage of multi-gear
gearboxes in BEV in the future [19].
While experience-based design principles are applied in gearbox development,
cf. e.g. [27, p. 686 ff.], we propose to extend the design process by an algorithmic
approach. In this way, we are able to consider not only known designs and slight variations thereof, but the full solution space of gearbox designs within certain degrees of freedom. Moreover, our approach allows us to automatically find a gearbox design that optimally matches a given electric motor and yields the highest efficiency. Employing Mixed-Integer Nonlinear Programming (MINLP) techniques,
cf. [3], it is possible to compute global-optimal designs using state-of-the-art
solvers, like SCIP or Baron, as shown in [5].

2 Related Work
The optimization of powertrains and their parts is a major research area. Espe-
cially the optimal design of gearboxes has been investigated in the literature.
Because of the combinatorial nature of the underlying mathematical problem,
the general optimization task is highly complex. Therefore, mostly heuristic opti-
mization methods have been applied to derive optimized solutions, cf. [15,20,21].
An approach for finding global-optimal solutions is shown in [1,8]. The authors
use a MINLP formulation to derive gearbox designs with a minimum size and
a minimum number of switching devices, as they have a major impact on the
manufacturing costs.
A concept for optimizing all relevant powertrain components (battery,
inverter, electric motor, and gearbox) is presented by [22], however no equa-
tions are shown. References [10,24] investigate the economic and dynamic per-
formance benefits of connecting the electric motor of BEVs with a two-speed
transmission, while optimizing the gear ratio using dynamic programming and
a heuristic approach, respectively. Since in both works specific load cycles are used, none of these approaches is robust against uncertainties in the load, i.e., none is able to guarantee a working system for loads that differ from those expected.
In the following, we present a new systematic methodology to derive
optimally-matched gearboxes for an electric motor. We optimize the efficiency
and dimension of a gearbox given a set of load scenarios. To ensure a work-
ing system for a whole range of load points, besides the scenarios considered
for calculating the objective value, we use additional constraints to improve the
robustness.

3 Materials and Methods


Within this section, we describe the relevant materials and methods to derive
an optimized powertrain layout for BEV. The considered part of the powertrain

consists of one electric motor combined with one multi-gear gearbox. Power-
trains with multiple motors and other components like the traction battery or
the inverter are not considered, neither are the effects of recuperation on the
efficiency of electric vehicles, as they mainly affect the battery size and not the
transmission system. We refer to [12] for details on this topic.

3.1 Generation of Load Scenarios


In order to derive load demands relevant to practice, we refer to the world-
wide harmonized light vehicles test procedure (WLTP) class 3, cf. [25], which
is applied for passenger cars. Based on the WLTP driving cycle, the momen-
tum and angular velocity requirements can be calculated using a 0-dimensional
physical model as shown in [13]. Employing a longitudinal model for passen-
ger cars, we derive the data points depicted in Fig. 1. To consider uncertainty
in the load, we model our optimization problem as a two-stage stochastic pro-
gram, in which we consider multiple load scenarios with certain probabilities
of occurrence and optimize the expected value. For a more detailed view on
scenario-based optimization, we refer to [23]. To derive these load scenarios, we
use an Expectation-Maximization clustering algorithm to fit a Gaussian Mixture
model, cf. [4], with multiple centers, and restrict the solutions to clusters, where
each Gaussian distribution has the same covariance matrix. To ensure clusters
with a desired shape, we use the Davies-Bouldin Index, cf. [6]. We set the mini-
mal number of clusters to four, and only allow a Davies-Bouldin Index above 1.
Without this restriction it is possible to have a high correlation between center
coordinates in one dimension. The final clustering with six centers is shown in
Fig. 1. Each center point corresponds to one of the six load scenarios, which are
all assigned a probability of occurrence of 1/6. For ensuring a robust design, we
additionally use the quickhull algorithm [2] to generate a convex hull of all load
points in Fig. 1. In our model, we ensure that all load points corresponding to
corner points of the convex hull are fulfilled by the powertrain.
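The clustering step can be sketched in a few lines of Python. The following is a minimal illustration, not the authors' code: it assumes a hypothetical array load_points of (Ω^W, T^W) pairs, uses scikit-learn's EM-based Gaussian mixture fit with a shared ("tied") covariance matrix, evaluates the Davies-Bouldin index, and extracts the convex hull with SciPy's quickhull implementation. The cluster-selection rule shown is only one plausible reading of the restriction described above.

```python
# Illustrative sketch (not the authors' code): cluster load points with a Gaussian
# mixture sharing one covariance matrix, check the Davies-Bouldin index, and
# extract the convex hull used for the robustness constraints.
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.metrics import davies_bouldin_score
from sklearn.mixture import GaussianMixture

# load_points: N x 2 array of (angular velocity, momentum) pairs from the WLTP-based
# longitudinal model; here random data stands in for the real demand cycle.
rng = np.random.default_rng(0)
load_points = rng.uniform([0.0, 0.0], [140.0, 1200.0], size=(300, 2))

best = None
for n_clusters in range(4, 10):  # minimum of four clusters, as described above
    gmm = GaussianMixture(n_components=n_clusters, covariance_type="tied",
                          random_state=0).fit(load_points)  # EM with one shared covariance
    labels = gmm.predict(load_points)
    dbi = davies_bouldin_score(load_points, labels)
    if dbi > 1.0 and (best is None or dbi < best[0]):  # keep only clusterings with DBI above 1
        best = (dbi, gmm)

scenarios = best[1].means_            # one load scenario per cluster center
probabilities = np.full(len(scenarios), 1.0 / len(scenarios))
hull = ConvexHull(load_points)        # quickhull; the hull corner points must be fulfilled
corner_points = load_points[hull.vertices]
print(len(scenarios), "scenarios,", len(corner_points), "hull corner points")
```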

3.2 Mathematical Gearbox Model


All sets, variables and parameters of the presented MINLP problem are shown
in Table 1. In the following, we derive the constraints of the MINLP model based
on technical restrictions. The most important measure in context of gears is the
module M which is the basis for all other gearbox dimensions. The module of
a gear wheel is given by the ratio of its diameter to its number of teeth. It can
be regarded as the unit of size that indicates how big or small a gear wheel is.
To ensure a functioning system, all gear wheels g in a gearbox must have the
same module. The possible module values are listed in standard DIN 780, cf.
[7]. In our computations we set M = 3 mm. The dimension of a specific gear
wheel combination can be calculated using the module and the number of teeth
of engine and output gear wheel, zg,1 and zg,2 :
d = M / (2 cos β) · (z_{g,1} + z_{g,2})   ∀g ∈ G.   (1)
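As a quick check of Eq. (1), the following snippet evaluates the shaft distance for the module M = 3 mm and the inclination angle β = 20° used in this paper, with illustrative teeth numbers that are not taken from the paper's results:

```python
# Worked example for Eq. (1), assuming M = 3 mm, beta = 20 deg and
# illustrative teeth numbers z1 = 17, z2 = 41 (not the paper's solution).
from math import cos, radians

M, beta = 3.0, radians(20.0)
z1, z2 = 17, 41
d = M / (2.0 * cos(beta)) * (z1 + z2)
print(f"shaft distance d = {d:.1f} mm")   # about 92.6 mm
```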

Fig. 1. Final clustering result with 6 clusters (axes: Ω^W in s⁻¹ and T^W in kg m² s⁻²). The centers are shown with filled boxes marked with an ×. To ensure a robust design, we also generate the convex hull comprising all load points, here depicted as black circles linked with straight lines.

β is the angle of inclination, which is set to 20◦ based on [27, p. 726]. The
dimension d is the same for all gears g in the gearbox, since they use the same
shafts. The considered multi-gear design uses gear-dependent transmission ratios
ig . We restrict the spread of gears by the following constraints:

ig − ig−1 ≤ S max ∀g ∈ G \ {1}, (2a)

ig−1 − ig ≤ S max ∀g ∈ G \ {1}. (2b)


The transmission ratio of each gear is equivalent to the ratio of the teeth of
the two meshing gear wheels, and should be between 0 and 6 according to [27,
p. 738]:
0 ≤ i_g = z_{g,2} / z_{g,1} ≤ 6   ∀g ∈ G.   (3a)

Table 1. MINLP model notation.

Set Description
G Set of possible gears
K Convex hull of load points in Fig. 1
Q Set of support points for linearization of motor model in Φ direction
R Set of support points for linearization of motor model in Ψ direction
S Set of simplices for linearization of motor model
T Set of load scenarios
Scalar Description Value
A Slope of domain restriction constraint −1.04314
B Intercept of domain restriction constraint 1.25974
D Normalization length in mm 100
M Module, measured in mm 3
S max Difference in gear ratio between different gears 3
T max Maximum motor momentum in N m 1500
γ Weighting factor 0.05
ηB Bearing efficiency of switchable gears 0.98
ηgG Gear efficiency of gears in mesh 0.99
ηS Sealing efficiency of switchable gears 0.99
πt Probability of occurrence of load scenario t ∈ T 1/|T |
Ω max Maximum motor velocity in s−1 367
Parameter Description
TkW Momentum input for corner point k ∈ K
TtW Momentum input of load scenario t ∈ T
Xq,r Dataset point in Ψ direction
Yq,r Dataset point in Φ direction
Zq,r Efficiency value at position (X, Y )
ΩkW Angular velocity input for corner point k ∈ K
ΩtW Angular velocity input of load scenario t ∈ T
Variable Description Domain
as,t Binary decision variable to select simplex s ∈ S in load scenario t ∈ T {0, 1}
bg,t Binary variable to choose a gear g ∈ G in a load scenario t ∈ T {0,1}
d Distance of shafts in mm {25, ..., 210}
ig Transmission for gear g ∈ G [0, 6]
t^M_{k,g} Motor momentum for corner point k ∈ K and gear g ∈ G [0, T^max]
t^M_t Motor momentum in load scenario t ∈ T [0, T^max]
zg,1 Number of teeth of engine gear wheel {17, ..., 70}
zg,2 Number of teeth of output gear wheel {17, ..., 70}
ηtM Approximated motor efficiency in load scenario t ∈ T [−1, 1]
λq,r,t Linearization variable for support point (q, r) in load scenario t ∈ T [0, 1]
Φk,g Normalized momentum for point k ∈ K and gear g ∈ G [0, 1]
Φt Normalized momentum in load scenario t ∈ T [0, 1]
Ψk,g Normalized angular velocity for point k ∈ K and gear g ∈ G [0, 1]
Ψt Normalized angular velocity in load scenario t ∈ T [0, 1]
ω^M_{k,g} Motor angular velocity for point k ∈ K and gear g ∈ G [0, Ω^max]
ωtM Motor angular velocity in load scenario t ∈ T [0, Ω max ]

To avoid interference of engaged gear wheels, the number of teeth of the smaller wheels is lower bounded by 17, cf. [27, p. 714]. Moreover, we set an upper bound
of 70:
17 ≤ zg,j ≤ 70 ∀g ∈ G, ∀j ∈ {1, 2}. (4a)
To consider the efficiency η M of the electric motor within the optimization pro-
gram, we use a generic functional description of the efficiency map of a perma-
nent magnet synchronous motor (PMSM) in Eq. (5a), cf. Sect. 3.3. The variables
Ψt , Φt ∈ [0, 1] represent the normalized motor momentum and rotational speed,
where the normalization is given by Eqs. (5b) and (5c). Equation (5d) restricts
the possible motor domain to physically relevant parts.

η^M_t = f(Ψ, Φ, t)   ∀t ∈ T   (5a)
Ψ_t = t^M_t / T^max   ∀t ∈ T   (5b)
Φ_t = ω^M_t / Ω^max   ∀t ∈ T   (5c)
Φ_t ≤ A Ψ_t + B   ∀t ∈ T   (5d)
The central part of the gearbox optimization is given by the following constraints:
t^M_t ( Σ_{g∈G} i_g b_{g,t} η^G_g ) η^B η^S = T^W_t   ∀t ∈ T,   (6a)
ω^M_t = Ω^W_t ( Σ_{g∈G} i_g b_{g,t} )   ∀t ∈ T,   (6b)
SOS1(b_{g,t} ∀g ∈ G)   ∀t ∈ T.   (6c)
The sum of gear-dependent transmission ratios ig in Eqs. (6a) and (6b) contains
the binary variable bg,t indicating the used gear in each load scenario. The special
ordered set of type 1 in Eq. (6c) ensures that only one gear is used in each
scenario. The efficiency of a pair of meshing gears is considered in Eq. (6a),
with the constant efficiency factor ηgG . Additional efficiency parameters are the
bearing efficiency η B and the sealing efficiency η S . To get a robust solution, we
add further constraints (Eqs. (7a)–(7e)) to restrict the solution space to solutions
which fulfill the most demanding loads. These most demanding loads correspond
to the convex hull K of all points of the WLTP based demand cycle shown in
Fig. 1.
Ψ_{k,g} = t^M_{k,g} / T^max   ∀k ∈ K, ∀g ∈ G   (7a)
Φ_{k,g} = ω^M_{k,g} / Ω^max   ∀k ∈ K, ∀g ∈ G   (7b)
Φ_{k,g} ≤ A Ψ_{k,g} + B   ∀k ∈ K, ∀g ∈ G   (7c)
t^M_{k,g} i_g η^G_g η^B η^S = T^W_k   ∀k ∈ K, ∀g ∈ G   (7d)
ω^M_{k,g} = Ω^W_k i_g   ∀k ∈ K, ∀g ∈ G   (7e)
As an objective, we consider the motor efficiency in each load scenario and the
dimension of the gearbox modeled by the distance d of both shafts. Both terms
are weighted against each other by a user-specific weighting factor γ = 0.05:
min  γ d/D − (1 − γ) Σ_{t∈T} π_t η^M_t.   (8)

3.3 Mathematical Motor Model


The efficiency of the considered PMSM depends on the specific load conditions.
These dependencies are represented in efficiency diagrams which are derived
from detailed simulations or measurements, cf. [14,17]. Within this paper, we
use the efficiency diagram shown in Fig. 2 (a), which is derived using a radial
basis function approximation of a real motor, yielding an estimator for the motor
efficiency. To integrate the efficiency of the motor into our optimization program,
we use a piecewise linear approximation as shown in Fig. 2 (b) and (c) for a 10×10
and a 30 × 30 grid, respectively. In our computations we used the 30 × 30 grid
which has a 1–4 direction, cf. [18]. The function ηtM = f (Ψ, Φ, t) for all t ∈ T
is modeled employing the aggregated convex combination, cf. [26], yielding the
following constraints:
Σ_{(q,r)∈Q×R} λ_{q,r,t} X_{q,r} = Ψ_t   ∀t ∈ T,   (9a)
Σ_{(q,r)∈Q×R} λ_{q,r,t} Y_{q,r} = Φ_t   ∀t ∈ T,   (9b)
Σ_{(q,r)∈Q×R} λ_{q,r,t} Z_{q,r} = η^M_t   ∀t ∈ T,   (9c)
Σ_{(q,r)∈Q×R} λ_{q,r,t} = 1   ∀t ∈ T,   (9d)
Σ_{s∈S} a_{s,t} = 1   ∀t ∈ T,   (9e)
λ_{q,r,t} ≤ Σ_{s∈S(q,r)} a_{s,t}   ∀t ∈ T.   (9f)

S(q, r) is the set of simplices that are adjacent to vertex (q, r).
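To make the aggregated convex combination concrete, the following sketch states constraints (9a)–(9f) for a single load scenario with PySCIPOpt (the toolchain used in Sect. 4). The 3 × 3 grid, the triangulation, and the efficiency values are illustrative stand-ins for the 30 × 30 grid, not data from the paper.

```python
# Minimal sketch of the aggregated convex combination (9a)-(9f) for one load
# scenario, assuming a small illustrative 3x3 grid instead of the 30x30 grid.
from pyscipopt import Model, quicksum

Q, R = range(3), range(3)
X = {(q, r): q / 2.0 for q in Q for r in R}          # normalized momentum grid
Y = {(q, r): r / 2.0 for q in Q for r in R}          # normalized speed grid
Z = {(q, r): 0.85 + 0.05 * (q + r) / 4.0 for q in Q for r in R}  # made-up efficiencies

# Each grid cell is split into two triangles (simplices); S maps simplex -> vertices.
S = {}
for q in range(2):
    for r in range(2):
        S[(q, r, "low")] = [(q, r), (q + 1, r), (q, r + 1)]
        S[(q, r, "up")] = [(q + 1, r + 1), (q + 1, r), (q, r + 1)]

m = Model("pwl-motor-efficiency")
lam = {(q, r): m.addVar(lb=0.0, ub=1.0, name=f"lam_{q}_{r}") for q in Q for r in R}
a = {s: m.addVar(vtype="B", name=f"a_{s}") for s in S}
psi = m.addVar(lb=0.0, ub=1.0, name="Psi")
phi = m.addVar(lb=0.0, ub=1.0, name="Phi")
eta = m.addVar(lb=-1.0, ub=1.0, name="eta")

m.addCons(quicksum(lam[v] * X[v] for v in lam) == psi)   # (9a)
m.addCons(quicksum(lam[v] * Y[v] for v in lam) == phi)   # (9b)
m.addCons(quicksum(lam[v] * Z[v] for v in lam) == eta)   # (9c)
m.addCons(quicksum(lam.values()) == 1)                   # (9d)
m.addCons(quicksum(a.values()) == 1)                     # (9e)
for v in lam:                                            # (9f): lambda only on the chosen simplex
    m.addCons(lam[v] <= quicksum(a[s] for s in S if v in S[s]))

m.addCons(psi == 0.4)                                    # example load point
m.setObjective(eta, "maximize")
m.optimize()
print("approximated efficiency:", m.getObjVal())
```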
Next to the approximation of the motor efficiency, uncertainty in the loads
of the gearbox output is also considered by generating different load scenarios
based on the WLTP load cycle. Thus, the derived MINLP model enables us not
only to find an optimal gearbox layout, but also an optimal control strategy for
changing gears in each load scenario, maximizing the expected efficiency.

Fig. 2. (a) Approximation of motor efficiency based on radial basis function. (b) Piecewise linear approximation with a 10×10 grid (for illustration only). (c) Piecewise linear approximation with the used 30 × 30 grid. All panels show Φ over Ψ; the color scale indicates the motor efficiency from 0.83 to 0.95.

4 Results

We use the MINLP solver SCIP 6.0, cf. [11], to optimize the model. The complete
software stack was implemented in Python 3.6.7 using PySCIPOpt 2.1.2 [16].
The computations were done on a Linux-based machine with an Intel i7-6600U
and 16 GB RAM. The gap limit was set to 0.5% and the time limit to 7200 s.
The final results are shown in Table 2. Compared to an optimized one-speed
gearbox, a multi-gear gearbox yields around 1% higher efficiency and around
8–10% decrease in the maximum momentum, cf. Table 2. The solution with 3 gears reaches the time limit; therefore, its optimality gap is slightly higher than for the other results. It is also important to mention that the performance increase due to more gears also leads to larger gearbox diameters.
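The solver configuration reported above might look as follows in PySCIPOpt; build_gearbox_model is a hypothetical placeholder for assembling the MINLP of Sect. 3 and is not part of the paper.

```python
# Sketch of the reported solver setup; build_gearbox_model() is a hypothetical
# placeholder for assembling the MINLP of Sect. 3 with PySCIPOpt.
from pyscipopt import Model

def build_gearbox_model(num_gears: int) -> Model:
    m = Model(f"gearbox_{num_gears}_gears")
    # ... variables and constraints (1)-(9f) would be added here ...
    return m

model = build_gearbox_model(num_gears=2)
model.setRealParam("limits/gap", 0.005)   # 0.5% relative gap limit
model.setRealParam("limits/time", 7200)   # 7200 s time limit
model.optimize()
print("status:", model.getStatus())
```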

Table 2. Computational results for 1, 2 and 3 gears within the gearbox.

1 Gear 2 Gears 3 Gears


Gearbox diameter in mm 85 91 99
Highest gear ratio 1.94 2.35 2.44
Maximum momentum decrease 49% 57% 59%
Mean efficiency increase 2.3% 3.6% 3.2%
Solution time in s 150 558 7200
Gap 0.48% 0.45% 2.20%
Objective value −0.83 −0.84 −0.84

5 Conclusion and Outlook


We showed an algorithmic approach for finding the global-optimal gearbox
design for BEV. By combining an unsupervised machine learning algorithm and
MINLP, we were able to derive global-optimal multi-gear designs that increase
the efficiency and decrease the maximum momentum of the powertrain for practi-
cally relevant load requirements compared to one-speed designs. To ensure robust
solutions, we considered different load scenarios which were derived based on the
WLTP load cycle. In the future, we want to increase the robustness of our solu-
tions by considering a higher number of load clusters which we derive based on
multiple load cycles besides the WLTP. Additionally, to improve solution times,
we want to investigate linearization techniques for which the number of binaries
is logarithmic in the number of interpolation points.

Funding
Funded by Deutsche Forschungsgemeinschaft (DFG, German Research Founda-
tion) – project number 57157498 – SFB 805.

References
1. Altherr, L.C., Dörig, B., Ederer, T., Pelz, P.F., Pfetsch, M.E., Wolf, J.: A mixed
integer nonlinear program for the design of mechanical transmission systems. Oper.
Res. Proc. 2016, 227–233 (2018)
2. Barber, C.B., Dobkin, D.P., Huhdanpaa, H.: The quickhull algorithm
for convex hulls. ACM Trans. Math. Softw. (TOMS) 22(4), 469–483 (1996)
3. Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Luedtke, J., Mahajan, A.: Mixed-
integer nonlinear optimization. Acta Numer. 22, 1–131 (2013)
4. Bishop, C.: Pattern Recognition and Machine Learning, vol. 1. Springer (2006)
5. Bussieck, M.R., Vigerske, S.: MINLP solver software (2010)
6. Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern
Anal. Mach. Intell. 2, 224–227 (1979)
7. DIN 780-1: Series of modules for gears; modules for spur gears (1977)
8. Dörig, B., Ederer, T., Pelz, P.F., Pfetsch, M.E., Wolf, J.: Gearbox design via mixed-
integer programming. In: Proceedings of the VII European Congress on Compu-
tational Methods in Applied Sciences and Engineering (2016)
9. Egbue, O., Long, S.: Barriers to widespread adoption of electric vehicles: an analysis
of consumer attitudes and perceptions. Energy Policy 48, 717–729 (2012)
10. Gao, B., Liang, Q., Xiang, Y., Guo, L., Chen, H.: Gear ratio optimization and
shift control of 2-speed I-AMT in electric vehicle. Mech. Syst. Signal Process. 50,
615–631 (2015)
11. Gleixner, A., Bastubbe, M., Eifler, L., Gally, T., Gamrath, G., Gottwald, R.L.,
Hendel, G., Hojny, C., Koch, T., Lübbecke, M.E., Maher, S.J., Miltenberger, M.,
Müller, B., Pfetsch, M.E., Puchert, C., Rehfeldt, D., Schlösser, F., Schubert, C.,
Serrano, F., Shinano, Y., Viernickel, J.M., Wegscheider, F., Walter, M., Witt,
J.T., Witzig, J.: The SCIP optimization suite 6.0. Tech. rep. (2018). (Optimization
Online)

12. Grunditz, E.A., Thiringer, T.: Characterizing BEV powertrain energy consump-
tion, efficiency, and range during official and drive cycles from Gothenburg, Sweden.
IEEE Trans. Veh. Technol. 65(6), 3964–3980 (2016)
13. Guzzella, L., Sciarretta, A., et al.: Vehicle Propulsion Systems, vol. 1. Springer
(2007)
14. Hadj, N.B., Abdelmoula, R., Chaieb, M., Neji, R.: Permanent magnet motor effi-
ciency map calculation and small electric vehicle consumption optimization. J.
Electr. Syst. 14(2), (2018)
15. Hofstetter, M., Lechleitner, D., Hirz, M., Gintzel, M., Schmidhofer, A.: Multi-
objective gearbox design optimization for xEV-axle drives under consideration of
package restrictions. Forsch. Im Ing. 82(4), 361–370 (2018)
16. Maher, S., Miltenberger, M., Pedroso, J.P., Rehfeldt, D., Schwarz, R., Serrano, F.:
PySCIPOpt: mathematical programming in python with the SCIP optimization
suite. In: International Congress on Mathematical Software, pp. 301–307. Springer
(2016)
17. McDonald, R.: Electric motor modeling for conceptual aircraft design. In: 51st
AIAA Aerospace Sciences Meeting including the New Horizons Forum and
Aerospace Exposition, p. 941 (2013)
18. Misener, R., Floudas, C.: Piecewise-linear approximations of multidimensional
functions. J. Optim. Theory Appl. 145(1), 120–147 (2010)
19. Rinderknecht, S., Meier, T.: Electric power train configurations and their transmis-
sion systems. In: International Symposium on Power Electronics Electrical Drives
Automation and Motion (SPEEDAM), pp. 1564–1568. IEEE (2010)
20. Salomon, S., Avigad, G., Purshouse, R.C., Fleming, P.J.: Gearbox design for uncer-
tain load requirements using active robust optimization. Eng. Optim. 48(4), 652–
671 (2016)
21. Savsani, V., Rao, R., Vakharia, D.: Optimal weight design of a gear train using
particle swarm optimization and simulated annealing algorithms. Mech. Mach.
Theory 45(3), 531–541 (2010)
22. Schönknecht, A., Babik, A., Rill, V.: Electric powertrain system design of BEV and
HEV applying a multi objective optimization methodology. Transp. Res. Procedia
14, 3611–3620 (2016)
23. Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on stochastic programming:
modeling and theory. SIAM (2009)
24. Tan, S., Yang, J., Zhao, X., Hai, T., Zhang, W.: Gear ratio optimization of a
multi-speed transmission for electric dump truck operating on the structure route.
Energies 11(6), 1324 (2018)
25. Tutuianu, M., Marotta, A., Steven, H., Ericsson, E., Haniu, T., Ichikawa, N., Ishii,
H.: Development of a worldwide harmonized light duty driving test
cycle (WLTC). Draft Technical Report, DHC subgroup, GRPE-67-03 (2013)
26. Vielma, J.P., Ahmed, S., Nemhauser, G.: Mixed-integer models for nonsepara-
ble piecewise-linear optimization: unifying framework and extensions. Oper. Res.
58(2), 303–315 (2010)
27. Wittel, H., Muhs, D., Jannasch, D., Voßiek, J.: Roloff/Matek Maschinenelemente,
vol. 21. Vieweg + Teubner Verlag (2013)
Location Optimization of Gas Power Plants by a Z-Number Data Envelopment Analysis

Farnoosh Fakhari, R. Tavakkoli-Moghaddam, M. Tohidifard, and Seyed Farid Ghaderi

School of Industrial Engineering, College of Engineering, University of Tehran, Tehran, Iran
{farnoosh.fakhari,tavakoli,maryam.tohidi72,ghaderi}@ut.ac.ir


Abstract. Electricity demand has been ever increasing with the growth and development of Iran. Demand for electricity comes not only from industrial consumers but also from residential consumers. Supplying the required electric power frequently calls for establishing new power plants. Considering the impact of power plant location on production costs, energy transmission costs, environmental issues, etc., the importance of selecting an optimal location to establish a power plant is clear. Because gas power plants account for a major portion of thermal electricity production, establishing such power plants requires specific attention in Iran. Hence, location-allocation
to a gas power plant in Iran is considered in the presented study. 25 cities are
studied for location-allocation to a gas power plant and the optimum location
should be selected by applying a Z-number data envelopment analysis (Z-DEA).
The proposed approach considers most important and effective indices including
the pollution rate, land cost, economic rate, natural risks, distance from the
electricity distribution network, distance from gas supply station, proximity to
water, population and labor force rate, topographic feasibility, electricity gen-
eration amount and land feasibility. Finally, the fuzzy DEA (F-DEA) model is
also applied to validate the obtained results.

Keywords: Location optimization · Gas power plants · Z-number data envelopment analysis · Fuzzy numbers · Uncertainty

1 Introduction

The population growth, increase of electricity consumption per capita, development of


industrial and agricultural sectors, etc. have been leading to a consistent incremental
demand in this kind of energy in Iran. Power plants as generation resources of electric
power are the most significant parts of transmission networks, and development of such
networks requires establishing new power plants and expanding the existing power
plants. Power plant construction and expansion operations are underlying projects and

Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-3-030-21803-4_92) contains supplementary material.


involve abundant social, economic, political, and environmental consequences; therefore, it is necessary to carry out comprehensive studies before constructing a power plant.
Two types of fuel are used in gas power plants: gas oil and natural gas. To generate each megawatt of electricity in such power plants, approximately 55 L of gas oil and 313 cubic meters of natural gas are used. Compared to other thermal power plants, gas power plants have some noticeable advantages, such as rapid installation, lower price, and fast launch. A basic step in establishing a gas power plant is to determine an appropriate location for its construction. Lior (2012) stated that
selecting a location for thermal power plants affects the amount of energy generation, the productivity of the power plant, production and transmission costs, economic development, and the environment. Moreover, energy and consumption resources influence the quality of the environment and of other vital resources, such as water and food (Lior 2012). Selecting an appropriate location requires considering a variety of criteria and factors. In location allocation, the aim is to compare parameters on the same scale. In the present study, 25 cities in Iran are evaluated as potential locations
to allocate to the gas power plant. To rank the potential locations, the Z-number data
envelopment analysis (Z-DEA) model is applied as a novel competent model in a
highly uncertain condition.
Based on the previous studies and experts’ opinions, 11 significant and effective
factors in establishing a gas power plant are detected. These indices are categorized into
three groups of techno-economic, social and environmental that are described below.
• Proximity to water: Gas power plants require a noticeable amount of water for their
operation. The consumed water amount depends on several elements, namely the
cooling tower, cooling system, weather condition, the age of the power plant,
maintenance condition, etc.
• Natural risks: Location to construct a power plant shall be selected in a way not to
be in the path of seasonal floods and storms, seismic faults, active volcanoes,
tsunamis, etc., as far as possible.
• Topographic feasibility: The land should be level and free from deep and high
terrains if possible; otherwise, power plant construction will face several difficulties. Hence, when choosing the land, it is better to look for low-slope and fairly even surfaces (Azadeh et al. 2014).
• Electricity generation amount: The proximity of the power plant place to the zones
with higher needs of electricity leads to decrease in wastes and economic savings.
• Distance from electricity distribution network: One of the main parameters in
electrical energy waste is the length of transmission lines, hence the closer the
location to the electrical distribution network, the better (Azadeh et al. 2014).
• Land feasibility (hardness/toughness): To construct the power plant on land whose underlying rock layers are sufficiently stable, it is essential to study the geological strata of the area in advance (Azadeh et al. 2014).
• Pollution rate: Owing to urban expansion and rising air pollutants, most big industrial cities face polluted air, which is harmful and dangerous to the residents’ health.
928 F. Fakhari et al.

• Population and labor force rate: Population centers are a part of main electricity
consumption and proximity of the selected location to them means the proximity of
electricity generation and consumption centers together.
• Land cost: In some areas, the cost of establishing a power plant is higher than the
other areas, also constructing a gas power plant requires a considerable measure of
land (Azadeh et al. 2014).
• Economic rate: The economic growth rate is a ratio, in terms of percentage, which indicates the incremental value produced by the economy of a country in a period (usually a year) relative to the previous period.
• Distance from a gas supply station: the shorter this distance is, the faster and easier
will be the fuel delivery.
The structure of the present study is as follows. The next section reviews the latest
related literature. In Sect. 3, problem-solving stages are described along with the Z-
DEA method. In Sect. 4, the results of the Z-DEA and F-DEA models plus the relevant statistical methods are presented. Finally, the conclusions are discussed in Sect. 5.

2 Literature Review

Concerning the variety of electricity generating power plants, there are some studies
done to evaluate them and compare their efficiencies. Some of them are explained
below. Lam and Shiu (2004) measured the efficiency and productivity of electrical
industry in China, applying the DEA and Malmquist index, considering the generated
electricity in each power plant in megawatt hour (MWh) as the output variable and the
nominal capacity (MW), fuel and labor (person) as input variables. Azadeh et al. (2008)
presented a hierarchy approach based on the DEA and performed location allocation to
solar power plants in different cities and areas of Iran. Chatzimouratidis and Pilavachi
(2009) evaluated 10 types of electricity generation power plants with due attention to
technical, economic and sustainability criteria using the analytic hierarchy pro-
cess (AHP) method. The results in the mentioned study showed that renewable ener-
gies were indicated as the best solution for the future as they do not need any fuel and
thus there will not be any fuel cost to generate them. Among those nine power plants,
hydropower, geothermal and wind power plants were respectively at the highest ranks.
Ren (2010) presented a multi-objective model for location-allocation to construct a
thermal power plant with two objectives of minimizing the cost and maximizing the
efficiency. Choudhary and Shankar (2012) performed a study on location-allocation to
a thermal gas power plant in India aiming to minimize socio-economic, environmental
and infrastructure costs and maximize the electricity generation productivity. They
applied the fuzzy AHP and TOPSIS methods to evaluate and select the optimum
location for thermal power plants.
Chatzimouratidis and Pilavachi (2012) proceeded to study and evaluate the electric
generating power plants in Greece from various aspects using the AHP method. Based
on their findings, geothermal power plants, wind power plants, biomass power plants,
nuclear power plants, combined cycle power plants, gas power plants, coal/lignite
power plants and oil power plants were respectively the first preferences of the elec-
trical production in Greece. Asayesh and Raad (2014) conducted a performance

assessment of 26 gas stations in two northern cities of Iran and identified efficient and
inefficient stations. They applied the DEA method for each gas station as a system of
four input factors and three output factors.
Azadeh et al. (2014) determined the optimum location among all alternatives to establish a wind power station. 25 cities, considering five districts within each city, were studied, and finally the most efficient city and district were selected by identifying input and output factors and applying the fuzzy DEA. El-Azab and Amin (2015) reviewed the
solar energy’s current status in the Middle East and North Africa. Also, they proposed an
algorithm for optimizing solar plants site selection. Jahangiri et al. (2016) used a GIS-based
method to determine the best location for wind-solar plants based on the data collected from
400 meteorological stations in the Middle East. Lee et al. (2017) used a hybrid multiple-
criteria decision-making approach for photovoltaic solar plant location selection. Rezaei
et al. (2018) used an MCDM method to determine the best location for the construction of a
wind-solar hybrid plant in the Fars province, Iran. The results show that Eghlid is the best
option for the construction of a wind-solar plant. According to the obtained results, the cities
of Firuzabad, Estahban, Safashahr, Bavanat, Izadkhast and Arsanjan hold the next ranks in
terms of suitability for the construction of solar-wind hybrid plants.

3 Model Description with the Z-DEA

Zadeh (2011) introduced the concept of the Z-number, which can express experts’ information as a linguistic variable. This variable is an ordered pair (E, F), where the first component E is a fuzzy constraint and F is defined as the reliability of E. The proposed model is an integrated model based on the Z-number that not only retains the DEA properties but is also capable of considering uncertainties in decision-making units (DMUs) along with their relevant reliabilities.
In this model, input and output values are given in the form of Z-numbers. The values ẼA_oh are related to the o-th output of the h-th DMU, where Ã_oh denotes their reliability in the form of triangular fuzzy numbers. Equations (1)–(4) show the CCR model based on Z-numbers, and Eqs. (5)–(8) are the dual form of Eq. (1) (Azadeh and Kokabi 2016).
Indicators:
H Indicators of DMUs
L Indicators of inputs
O Indicators of outputs
X Number of DMUs
Y Number of inputs
W Number of outputs
DMU n n-th DMU
DMU 0 Target DMU (m = 0)

Parameters:
Z̃A_lh   Z-number value of input l related to DMU h
ẼA_lh   Fuzzy value of input l related to DMU h
F̃A_lh   Fuzzy reliability value of input l related to DMU h
Z̃A_oh   Z-number value of output o related to DMU h

Variables:
ah Weight variables in the proposed model to obtain the efficiency
X0 Objective value of the (efficiency) DEA model

Min X_0   (1)
s.t.
Σ_{h=1}^{a} α_h Z̃X_lh ≤ X_0 Z̃X_l0,   l = 1, ..., y   (2)
Σ_{h=1}^{a} α_h Z̃Y_oh ≥ Z̃Y_o0,   o = 1, ..., w   (3)
α_h ≥ 0,   h = 1, ..., a   (4)

Max X_0 = Σ_{o=1}^{w} v_o Z̃Y_o0   (5)
s.t.
Σ_{l=1}^{y} u_l Z̃X_l0 = 1   (6)
Σ_{o=1}^{w} v_o Z̃Y_oh − Σ_{l=1}^{y} u_l Z̃X_lh ≤ 0,   h = 1, ..., a   (7)
v_o, u_l ≥ 0,   o = 1, ..., w,  l = 1, ..., y   (8)

The above models are nonlinear. To make them linear, a defuzzification method is first used, which yields a set of membership functions of the reliability amounts, F̃ = {(X, μ_F̃(X)) | X ∈ [0, 1]}, where μ_F̃(X) is the membership function of the reliability amount. Equation (9) applies the center of gravity (COG) method (Azadeh and Kokabi 2016):

U = ∫ X μ_F̃(X) dX / ∫ μ_F̃(X) dX   (9)

Assuming that the reliability amounts of the DMUs have triangular membership functions, we have:

U = (e + f + d) / 3   (10)
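As a small illustration of the COG step, the following snippet evaluates Eq. (10) for a triangular reliability value, using the “usually” parameters listed later in Table 1 of this paper:

```python
# Sketch of the center-of-gravity (COG) defuzzification of a triangular
# reliability value, Eq. (10): U = (e + f + d) / 3.
def cog_triangular(e: float, f: float, d: float) -> float:
    """COG of a triangular membership function with parameters (e, f, d)."""
    return (e + f + d) / 3.0

# Example with the "usually" reliability from Table 1 of this paper.
weight = cog_triangular(0.65, 0.75, 0.85)
print(weight)  # 0.75
```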
Equation (11) transforms the input and output amounts of the DMUs into the weighted Z-number with a non-normal triangular membership function:

Ẽ^h(X) = h Ẽ(X),   x ∈ X,
s.t.
μ_{Ẽ^h}(X) = h μ_{Ẽ}(X),   x ∈ X.   (11)

For more details, readers are referred to Azadeh and Kokabi (2016).
The fuzzy programming of the Z-CCR model is presented in Expressions (12–15).
Equations (16–21) are the dual model of the Z-CCR.
Max t_p = Σ_{o=1}^{w} v_o (y^l_op, y^m_op, y^u_op)   (12)
s.t.
Σ_{l=1}^{q} u_l (x^l_lp, x^m_lp, x^u_lp) = (1, 1, 1)   (13)
Σ_{o=1}^{w} v_o (y^l_oh, y^m_oh, y^u_oh) − Σ_{l=1}^{q} u_l (x^l_lh, x^m_lh, x^u_lh) ≤ 0,   h = 1, ..., a   (14)
v_o, u_l ≥ 0,   o = 1, ..., w,  l = 1, ..., q   (15)

Max β_p = Σ_{o=1}^{w} ȳ_op   (16)
s.t.
Σ_{l=1}^{q} x̄_lp = 1   (17)
Σ_{o=1}^{w} ȳ_oh − Σ_{l=1}^{q} x̄_lh ≤ 0,   h = 1, ..., a   (18)
u_l (h x^m_lh + (1 − h) x^l_lh) ≤ x̄_lh ≤ u_l (h x^m_lh + (1 − h) x^u_lh),   h = 1, ..., a,  l = 1, ..., q   (19)
v_o (h y^m_oh + (1 − h) y^l_oh) ≤ ȳ_oh ≤ v_o (h y^m_oh + (1 − h) y^u_oh),   h = 1, ..., a,  o = 1, ..., w   (20)
v_o, u_l ≥ 0,   o = 1, ..., w,  l = 1, ..., q   (21)
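Purely as an illustration of the crisp building block underlying Eqs. (16)–(21), the following Python sketch solves an input-oriented CCR multiplier model for a made-up data set with SciPy. The actual Z-DEA computations of this paper are carried out in Matlab (Sect. 4.3); this code is not the authors’ implementation, and the data it uses are fictitious.

```python
# Illustrative sketch only: an input-oriented CCR multiplier model for crisp data,
# the building block that the Z-CCR model reduces to at a fixed alpha-cut.
# The small data set below is made up and not taken from the paper.
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 4.0], [3.0, 2.0], [4.0, 5.0]])  # inputs,  one row per DMU
Y = np.array([[1.0], [1.5], [2.0]])                  # outputs, one row per DMU

def ccr_efficiency(p: int) -> float:
    n, q = X.shape
    w = Y.shape[1]
    # decision vector: [v_1..v_w, u_1..u_q]; maximize v' y_p  ->  minimize -v' y_p
    c = np.concatenate([-Y[p], np.zeros(q)])
    # v' y_h - u' x_h <= 0 for every DMU h
    A_ub = np.hstack([Y, -X])
    b_ub = np.zeros(n)
    # normalization: u' x_p = 1
    A_eq = np.concatenate([np.zeros(w), X[p]]).reshape(1, -1)
    b_eq = np.array([1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (w + q), method="highs")
    return -res.fun

for p in range(len(X)):
    print(f"DMU {p}: efficiency = {ccr_efficiency(p):.3f}")
```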

4 Experimental Results and Discussion

In this research, the information related to the applied indices was collected from the Statistical Center of Iran, NIGC, and Tavanir. The Statistical Center of Iran was established to create a centralized statistical system aiming to provide accurate and comprehensive statistics in different economic and social fields to meet the needs of scientific and research planning in Iran. In this section, the procedure of the study, along with the results of the Z-DEA approach for locating a gas power plant in Iran, is explained in detail.

4.1 Determining Effective and Important Criteria


In this subsection, effective factors with techno-economic, social and environmental
aspects are determined based on the recent literature and experts’ ideas.

4.2 Specifying Input and Output Variables


Input variables are those for which a smaller amount is more desirable, which in this case are: pollution rate, land cost, economic rate, natural risks, distance from the electricity distribution network, and distance from gas supply stations. Output variables are those for which a larger amount is more desirable, including: proximity to water, population and labor force rate, topographic feasibility, electricity generation amount, and land feasibility.

4.3 Determining the Efficiency of Each Index with F-DEA and Z-DEA
Models
According to Table 1, a specific reliability is assigned to every input and output variable based on its interval. The reliability amounts are determined by experts and stated by three linguistic variables: “sure”, “usually”, and “likely” (Azadeh and Kokabi 2016). Figure 1 shows the fuzzy sets of the linguistic reliability values. The Z-DEA model is then implemented in Matlab v.2014.

Table 1. Classification of reliability values given by experts (Azadeh and Kokabi 2016)
Z = (E, F) Interval data Linguistic variable Membership functions parameters
[15,20] Sure [0.8, 1, 1]
[10,15) Usually [0.65, 0.75, 0.85]
[1,10) Likely [0.5, 0.6, 0.7]
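The assignment of reliabilities in Table 1 can be sketched as a simple lookup; the function below is illustrative only and follows the intervals and membership parameters of the table:

```python
# Sketch of the reliability assignment in Table 1: map an interval datum to a
# linguistic reliability and its triangular membership parameters.
def reliability(value: float):
    if 15 <= value <= 20:
        return "sure", (0.8, 1.0, 1.0)
    if 10 <= value < 15:
        return "usually", (0.65, 0.75, 0.85)
    if 1 <= value < 10:
        return "likely", (0.5, 0.6, 0.7)
    raise ValueError("value outside the intervals of Table 1")

print(reliability(12))  # ('usually', (0.65, 0.75, 0.85))
```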

In the Z-DEA and F-DEA models, it is required to consider different α-cuts due to the high uncertainty in the model. In this paper, the efficiency of each DMU (i.e., city) is measured at 14 different α-cuts, which are 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99, and 1.

Fig. 1. Fuzzy sets of the linguistic reliability values “likely”, “usually”, and “sure” (membership µ over the reliability axis from 0.5 to 1) (Azadeh and Kokabi 2016)

4.4 Determining the Optimum Alpha Using the Noise Analysis


To find the optimum alpha, the statistical method of noise analysis is applied to both methods. In this paper, the maximum mean correlation is 0.443, which belongs to α = 0.01; thus, the optimum α-cut is 0.01.

4.5 Sensitivity Analysis


To determine the most important index for establishing a gas power plant, sensitivity
analysis is used in this study. Each factor is eliminated once and the efficiency is
measured in the absence of that factor. This process is repeated for all factors. The
results of the sensitivity analysis are presented in Tables S1 and S2 (supplementary).
The paired t-test is implemented in Minitab v.17.3.1 after the sensitivity analysis. Then,
the impact of each factor is identified and all the factors have positive impacts.

4.6 Factor Weighting


After the sensitivity analysis, the weight of each index is calculated by:

W_i = |h − h_i| / Σ_i |h − h_i|   (22)

where h is the full average efficiency and h_i is the average efficiency of the i-th factor (Fig. 2).
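Equation (22) can be evaluated directly once the average efficiencies with and without each factor are known; the sketch below uses made-up efficiency values for illustration:

```python
# Sketch of the factor weighting in Eq. (22); the efficiency values are made up.
import numpy as np

full_avg_efficiency = 0.82                         # h: average efficiency with all factors
avg_without_factor = np.array([0.78, 0.81, 0.60])  # h_i: average efficiency per eliminated factor

deviations = np.abs(full_avg_efficiency - avg_without_factor)
weights = deviations / deviations.sum()            # W_i = |h - h_i| / sum_i |h - h_i|
print(weights.round(3))                            # weights sum to 1
```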

4.7 Identifying the Optimum Location for Establishing a Gas Power Plant
In this step, the efficiency rank with respect to the optimum alpha is calculated for both models (F-DEA and Z-DEA). The city with the first rank is selected as the optimum location to establish a gas power plant in Iran. As shown in Table 2, the city of Hamedan is selected for establishing the gas power plant.

Fig. 2. Weight of each factor (the largest weights belong to the electricity generation amount, 30.86%, and the distance from the power distribution network, 23.94%; the smallest to the population and labor force rate, 0.13%)

4.8 Comparing the Z-DEA and F-DEA Models


In this section, the Z-DEA and F-DEA models are compared. To do the comparison,
the mean efficiency relating to the optimum alpha (0.01) is calculated in both models.
The mean efficiency in the Z-DEA model (5.7706) is greater than that in the F-DEA model (2.1705); therefore, the Z-DEA model is more appropriate.

Table 2. Correlation between rank efficiency of the Z-DEA and F-DEA


DMU  Rank F-DEA (α = 0.01)  Rank Z-DEA (α = 0.01)   DMU  Rank F-DEA (α = 0.01)  Rank Z-DEA (α = 0.01)
Ardabil 17 15   Kermanshah 13 11
East Azerbaijan 23 22   South Khorasan 14 14
West Azerbaijan 25 25   Razavi Khorasan 15 21
Bushehr 4 4   Khuzestan 16 18
Chaharmahal and Bakhtiari 20 20   Kurdistan 21 16
Isfahan 24 24   Lorestan 19 13
Fars 7 5   Markazi 6 6
Gilan 3 2   Semnan 22 17
Hamedan 1 1   Sistan 8 9
Hormozgan 18 23   Tehran 12 19
Ilam 11 8   Yazd 5 7
Karaj 10 12   Zanjan 2 3
Kerman 9 10   Spearman correlation = 0.910

5 Conclusions

Providing the required electricity obligates the development of the electric power industry and the subsequent establishment of new power plants. Allocating a proper location to construct a power plant greatly impacts its efficiency, energy generation trend, etc. The main goal of the present study is an appropriate location-allocation for a gas power plant using the novel Z-DEA method for the first time. In this paper, 11 indices were introduced. To evaluate and rank the cities as well as weight the indices, the proposed model was the Z-DEA. After efficiency measurements with the Z-DEA for 14 α-cuts, the optimum alpha was determined as 0.01 by applying the noise analysis and statistical tests. Index weighting and determining the preferences of indices were performed with the sensitivity analysis; according to the obtained results, the generated electricity index, weighting 30.68%, was the most important index in location-allocation of the power plant, while the population and labor force rate index, weighting 0.13%, was identified as the least important index. Hamedan, Gilan and Zanjan were ranked respectively as the most proper locations for allocating the gas power plant.

References
Aras, H., Erdoğmuş, Ş., Koç, E.: Multi-criteria selection for a wind observation station location
using analytic hierarchy process. Renew. Energy 29(8), 1383–1392 (2004)
Asayesh, R., Raad, Z.F.: Evaluation of the relative efficiency of gas station by data envelopment
analysis. Int. J. Data Envel. Anal. Oper. Res. 12–15 (2014)
Azadeh, A., Ghaderi, S., Maghsoudi, A.: Location optimization of solar plants by an integrated
hierarchical DEA PCA approach. Energy Policy 36(10), 3993–4004 (2008)
Azadeh, A., Kokabi, R.: Z-number DEA: a new possibilistic DEA in the context of Z-numbers.
Adv. Eng. Inform. 30(3), 604–617 (2016)
Azadeh, A., Rahimi-Golkhandan, A., Moghaddam, M.: Location optimization of wind power
generation–transmission systems under uncertainty using hierarchical fuzzy DEA: a case
study. Renew. Sustain. Energy Rev. 30, 877–885 (2014)
Chatzimouratidis, A.I., Pilavachi, P.A.: Technological, economic and sustainability evaluation of
power plants using the Analytic Hierarchy Process. Energy Policy 37(3), 778–787 (2009)
Chatzimouratidis, A.I., Pilavachi, P.A.: Decision support systems for power plants impact on the
living standard. Energy Convers. Manag. 64, 182–198 (2012)
Choudhary, D., Shankar, R.: An STEEP-fuzzy AHP-TOPSIS framework for evaluation and
selection of thermal power plant location: a case study from India. Energy 42(1), 510–521
(2012)
El-Azab, R., Amin, A.: Optimal solar plant site selection. In: Paper presented at the SoutheastCon
2015 (2015)
Jahangiri, M., Ghaderi, R., Haghani, A., Nematollahi, O.: Finding the best locations for
establishment of solar-wind power stations in Middle-East using GIS: A review. Renew.
Sustain. Energy Rev. 66, 38–52 (2016)
Lam, P.-L., Shiu, A.: Efficiency and productivity of China’s thermal power generation. Rev. Ind.
Organ. 24(1), 73–93 (2004)
Lee, A.H., Kang, H.-Y., Liou, Y.-J.: A hybrid multiple-criteria decision-making approach for
photovoltaic solar plant location selection. Sustainability 9(2), 184 (2017)

Lior, N.: Sustainable energy development: the present (2011) situation and possible paths to the
future. Energy 43(1), 174–191 (2012)
Ren, F.: Optimal site selection for thermal power plant based on rough sets and multi-objective
programming. In: Paper presented at the E-Product E-Service and E-Entertainment (ICEEE),
2010 International Conference (2010)
Rezaei, M., Mostafaeipour, A., Qolipour, M., Tavakkoli-Moghaddam, R.: Investigation of the
optimal location design of a hybrid wind-solar plant: A case study. Int. J. Hydrogen Energy
43(1), 100–114 (2018)
Zadeh, L.A.: A note on Z-numbers. Inf. Sci. 181(14), 2923–2932 (2011)
Optimization of Power Plant Operation
via Stochastic Programming with Recourse

Tomoki Fukuba¹, Takayuki Shiina¹, Ken-ichi Tokoro², and Tetsuya Sato¹
¹ Waseda University, 3-4-1 Ohkubo, Shinjuku, Tokyo 169-855, Japan
tshiina@waseda.jp
² Central Research Institute of Electric Power Industry, 2-6-1 Nagasaka, Yokosuka, Kanagawa 240-0196, Japan

Abstract. A stochastic programming model of the operation of energy plants with the introduction of photovoltaic generation and a storage battery is developed. The uncertainty of the output of the photovoltaic generation is represented by a set of discrete scenarios, and the expected value of the operation cost is minimized. The effectiveness of the stochastic programming model is shown by comparing it with the deterministic model. As an economic evaluation, the recovery period for the initial investment in photovoltaic generation and the storage battery is also shown.

Keywords: Stochastic programming · Optimization · Energy plant · Operational planning · Photovoltaic generation · Unit commitment problem

1 Introduction

Because of prevalent environmental problems, the need to spread the use of renewable energy is an urgent concern worldwide. In the Paris Agreement, which entered into force in 2016, the long-term goal is to keep the increase in the global average temperature within 2 °C above pre-industrial levels. To achieve this, efforts are under way all over the world to adopt smart communities. Furthermore, in Japan, the reexamination of energy costs has been dealt with as a problem of management engineering due to the liberalization of the electricity market. Based on these developments, the introduction of renewable energy as a new type of energy supply in large-scale facilities such as factories and shopping centers is being studied. However, as the output of renewable energy is unstable, decision making under uncertain conditions is required at the time of introduction.
In this research, an optimization model for operation planning using stochastic programming was developed by introducing photovoltaic power generation as a renewable energy source into a factory energy plant. We show that modeling by stochastic programming yields a more realistic operation plan than the conventional deterministic mixed-integer programming method, and an economic evaluation of the resulting plan was carried out.
Figure 1 shows the outline of the basic model which can quantitatively evaluate the
energy cost etc. of the smart community. An industrial model of energy consumption at
the factory in the basic model is modeled based on a benchmark problem seeking an

optimum operation plan of a factory energy plant, presented by Suzuki and Okamoto [1] and Inui and Tokoro [2].

Fig. 1. Energy plant with photovoltaic generation.

2 Introduction Model for Photovoltaic Generation

The optimization model for photovoltaic generation extends the benchmark problem by
including photovoltaic generation and a storage battery. To consider the uncertainty of
photovoltaic power generation, stochastic programming is applied.
The benchmark problem is of an energy plant that purchases electricity and gas as
shown in the dotted frame in Fig. 1, and generates electricity, heat, and steam to meet
the demand. As equipment, there are a gas turbine, a boiler, two kinds of refrigerators,
and a thermal storage tank. The objective of the benchmark problem is to establish an
operation plan that minimizes the cost of purchasing electricity and gas while satisfying
the constraints on equipment and energy balance. The decision variables consist of variables related to the amount of purchase and generation of energy, and variables concerning the start and stop of each device.
The photovoltaic generation introduction model includes MW-class photovoltaic
generation equipment so that the generated electricity flows to the demand and turbo
refrigerators. The photovoltaic generation of power is for in-house power consumption
and for sale. We assumed that the storage battery can store only the electric power of
photovoltaic power generation, and that some of the charged electricity will be lost by
the time of discharge. Figure 1 shows the energy flow of the energy plant when
photovoltaic generation and storage batteries are introduced.
The uncertainty of photovoltaic generation output is expressed using a set of
deterministic scenarios. Because the uncertainty of the output of photovoltaic gener-
ation also affects the entire energy flow, decision variables related to the purchase
amount of energy and generation amount are also defined for each scenario. As a result,
the number of decision variables increases according to the number of scenarios. When
the number of scenarios is 30, the number of decision variables increases from 192 to
5826, considering the introduction of photovoltaic power generation.

The objective of the photovoltaic generation introduction model is to establish an


operation plan that minimizes the expected value of electricity and gas purchase cost
while satisfying the restrictions on equipment and storage battery and energy balance.

3 Formulation of Photovoltaic Generation Introduction


Model

Table 1 shows the definitions of the symbols used for formulating the photovoltaic
power generation introduction model:

Table 1. Notation for the model.


Parameter
N_t, N_s: Number of turbo refrigerators and of steam absorption refrigerators
I: Number of time zones
α_ge: Coefficient of the input-output relation of gas and electric power in the gas turbine
α_gs: Coefficient of the input-output relation of gas and steam in the gas turbine
α_b: Coefficient of the input-output relation of gas and steam in the boiler
α_{t,j}: Coefficient of the input-output relation of power and heat in turbo refrigerator j
α_{s,j}, β_{s,j}, γ_{s,j}: Coefficients of the input-output relation of steam and heat at steam absorption refrigerator j
Q^min_{t,j}, Q^max_{t,j}: Lower and upper limits of the heat production of turbo refrigerator j
Q^min_{s,j}, Q^max_{s,j}: Lower and upper limits of the heat production of steam absorption refrigerator j
E^min_g, E^max_g: Lower and upper limits of the power generation of the gas turbine
S^min_b, S^max_b: Lower and upper limits of the boiler steam production
Q^min_ts: Lower limit of the heat storage of the thermal storage tank
Q^max1_ts: Upper limit of the heat storage of the thermal storage tank in the first to the (I − 1)-th time zone
Q^max2_ts: Upper limit of the heat storage of the thermal storage tank in the I-th time zone
Q^init_ts: Initial heat storage of the thermal storage tank
Q^loss: Amount of heat loss in the thermal storage tank
L_{t,j}, L_{s,j}: Minimum startup/stop time of turbo refrigerator j and steam absorption refrigerator j
L_g, L_b: Minimum startup/stop time of the gas turbine and the boiler
C^i_Er, C^i_Fr: Purchase cost of electricity and gas in time i
Ē^i_L, Q̄^i_L, S̄^i_L: Demand of electricity, heat, and steam in time i
Ẽ^i_rm, S̃^i_rm: Remaining amount of electric energy and steam in time i
K: Number of scenarios
Pr^k: Probability of scenario k
E^{i,k}_pv: Electric energy of photovoltaic power generation in time i under scenario k
z^init_sb: Initial storage of the storage battery
E^max_sb: Capacity of the storage battery
α: Charge and discharge efficiency
Decision Variable
x^{i,k}_{t,j}, x^{i,k}_{s,j}: Heat production of turbo refrigerator j and steam absorption refrigerator j in time i under scenario k
x^{i,k}_g, x^{i,k}_b: Gas consumption of the gas turbine and the boiler in time i under scenario k
y^i_{t,j}, y^i_{s,j}: State of turbo refrigerator j and steam absorption refrigerator j in time i (1 for running, 0 for stopped)
y^i_g, y^i_b: State of the gas turbine and the boiler in time i (1 for running, 0 for stopped)
z^{i,k}_in: Electric energy charged from the photovoltaic generation to the storage battery in time i under scenario k
z^{i,k}_out: Discharge of the storage battery in time i under scenario k
z^{i,k}_pv: Electric energy directly consumed from photovoltaic power generation in time i under scenario k
z^{i,k}_sb: Storage of the storage battery in time i under scenario k
Function
Q^{i,k}_ts(x^{i,k}_t, x^{i,k}_s, Q^{i−1,k}_ts, Q̄^i_L): Heat storage of the thermal storage tank in time i under scenario k
E^{i,k}_r(x^{i,k}_g, x^{i,k}_t, z^{i,k}_pv, z^{i,k}_out, Ē^i_L, Ẽ^i_rm): Purchase of electric power in time i under scenario k
S^{i,k}_rm(x^{i,k}_g, x^{i,k}_b, x^{i,k}_s, S̄^i_L): Remaining amount of steam in time i under scenario k
f_ge(x^{i,k}_g): Power generation of the gas turbine in time i under scenario k
f_gs(x^{i,k}_g): Steam generation of the gas turbine in time i under scenario k
f_b(x^{i,k}_b): Steam generation of the boiler in time i under scenario k
f_{t,j}(x^{i,k}_{t,j}): Power input of turbo refrigerator j in time i under scenario k
f_{s,j}(x^{i,k}_{s,j}): Steam input of steam absorption refrigerator j in time i under scenario k

In this model, x^{i,k}_t and x^{i,k}_s, ∀i (i = 1, ..., I), ∀k (k = 1, ..., K), are defined as x^{i,k}_t = (x^{i,k}_{t,1}, ..., x^{i,k}_{t,N_t})^T and x^{i,k}_s = (x^{i,k}_{s,1}, ..., x^{i,k}_{s,N_s})^T.
The formulation of the photovoltaic generation introduction model is shown below:
min  Σ_{k=1}^{K} Pr^k Σ_{i=1}^{I} { C^i_Er E^{i,k}_r(x^{i,k}_g, x^{i,k}_t, z^{i,k}_pv, z^{i,k}_out, Ē^i_L, Ẽ^i_rm) + C^i_Fr (x^{i,k}_g + x^{i,k}_b) }   (1)
s.t.
Q^min_ts ≤ Q^{i,k}_ts(x^{i,k}_t, x^{i,k}_s, Q^{i−1,k}_ts, Q̄^i_L) ≤ Q^max1_ts,   i = 1, ..., I − 1, k = 1, ..., K   (2)
Q^min_ts ≤ Q^{i,k}_ts(x^{i,k}_t, x^{i,k}_s, Q^{i−1,k}_ts, Q̄^i_L) ≤ Q^max2_ts,   i = I, k = 1, ..., K   (3)
S^{i,k}_rm(x^{i,k}_g, x^{i,k}_b, x^{i,k}_s, S̄^i_L) = S̃^i_rm,   i = 1, ..., I, k = 1, ..., K   (4)
Q^min_{t,j} y^i_{t,j} ≤ x^{i,k}_{t,j} ≤ Q^max_{t,j} y^i_{t,j},   i = 1, ..., I, j = 1, ..., N_t, k = 1, ..., K   (5)
Q^min_{s,j} y^i_{s,j} ≤ x^{i,k}_{s,j} ≤ Q^max_{s,j} y^i_{s,j},   i = 1, ..., I, j = 1, ..., N_s, k = 1, ..., K   (6)
E^min_g y^i_g ≤ f_ge(x^{i,k}_g) ≤ E^max_g y^i_g,   i = 1, ..., I, k = 1, ..., K   (7)
S^min_b y^i_b ≤ f_b(x^{i,k}_b) ≤ S^max_b y^i_b,   i = 1, ..., I, k = 1, ..., K   (8)
y^i_{t,j} − y^{i−1}_{t,j} ≤ y^s_{t,j},   s = i + 1, ..., min{i + L_{t,j} − 1, I}, i = 2, ..., I, j = 1, ..., N_t   (9)
y^{i−1}_{t,j} − y^i_{t,j} ≤ 1 − y^s_{t,j},   s = i + 1, ..., min{i + L_{t,j} − 1, I}, i = 2, ..., I, j = 1, ..., N_t   (10)
y^i_{s,j} − y^{i−1}_{s,j} ≤ y^s_{s,j},   s = i + 1, ..., min{i + L_{s,j} − 1, I}, i = 2, ..., I, j = 1, ..., N_s   (11)
y^{i−1}_{s,j} − y^i_{s,j} ≤ 1 − y^s_{s,j},   s = i + 1, ..., min{i + L_{s,j} − 1, I}, i = 2, ..., I, j = 1, ..., N_s   (12)
y^i_g − y^{i−1}_g ≤ y^s_g,   s = i + 1, ..., min{i + L_g − 1, I}, i = 2, ..., I   (13)
y^{i−1}_g − y^i_g ≤ 1 − y^s_g,   s = i + 1, ..., min{i + L_g − 1, I}, i = 2, ..., I   (14)
y^i_b − y^{i−1}_b ≤ y^s_b,   s = i + 1, ..., min{i + L_b − 1, I}, i = 2, ..., I   (15)
y^{i−1}_b − y^i_b ≤ 1 − y^s_b,   s = i + 1, ..., min{i + L_b − 1, I}, i = 2, ..., I   (16)
z^{i−1,k}_sb + α z^{i,k}_in = z^{i,k}_sb + z^{i,k}_out,   i = 1, ..., I, k = 1, ..., K,   z^{0,k}_sb = z^{I,k}_sb = z^init_sb   (17)
z^{i,k}_sb ≤ E^max_sb,   i = 1, ..., I, k = 1, ..., K   (18)
E^{i,k}_pv = z^{i,k}_pv + z^{i,k}_in,   i = 1, ..., I, k = 1, ..., K   (19)
x^{i,k}_{t,j} ≥ 0,   i = 1, ..., I, j = 1, ..., N_t, k = 1, ..., K   (20)
x^{i,k}_{s,j} ≥ 0,   i = 1, ..., I, j = 1, ..., N_s, k = 1, ..., K   (21)
x^{i,k}_g, x^{i,k}_b, z^{i,k}_in, z^{i,k}_out, z^{i,k}_pv ≥ 0,   i = 1, ..., I, k = 1, ..., K   (22)
z^{i,k}_sb ≥ 0,   i = 1, ..., I − 1, k = 1, ..., K   (23)
y^i_{t,j} ∈ {0, 1},   i = 1, ..., I, j = 1, ..., N_t   (24)
y^i_{s,j} ∈ {0, 1},   i = 1, ..., I, j = 1, ..., N_s   (25)
y^i_g, y^i_b ∈ {0, 1},   i = 1, ..., I   (26)

where
Q^{i,k}_ts(x^{i,k}_t, x^{i,k}_s, Q^{i−1,k}_ts, Q̄^i_L) = Σ_{j=1}^{N_t} x^{i,k}_{t,j} + Σ_{j=1}^{N_s} x^{i,k}_{s,j} + Q^{i−1,k}_ts − Q̄^i_L − Q^loss,   i = 1, ..., I, k = 1, ..., K,   Q^{0,k}_ts = Q^init_ts   (27)
E^{i,k}_r(x^{i,k}_g, x^{i,k}_t, z^{i,k}_pv, z^{i,k}_out, Ē^i_L, Ẽ^i_rm) = Σ_{j=1}^{N_t} f_{t,j}(x^{i,k}_{t,j}) − f_ge(x^{i,k}_g) + Ē^i_L + Ẽ^i_rm − z^{i,k}_pv − z^{i,k}_out ≥ 0,   i = 1, ..., I, k = 1, ..., K   (28)
S^{i,k}_rm(x^{i,k}_g, x^{i,k}_b, x^{i,k}_s, S̄^i_L) = f_gs(x^{i,k}_g) + f_b(x^{i,k}_b) − Σ_{j=1}^{N_s} f_{s,j}(x^{i,k}_{s,j}) − S̄^i_L,   i = 1, ..., I, k = 1, ..., K   (29)
f_ge(x^{i,k}_g) = α_ge x^{i,k}_g,   i = 1, ..., I, k = 1, ..., K   (30)
f_gs(x^{i,k}_g) = α_gs x^{i,k}_g,   i = 1, ..., I, k = 1, ..., K   (31)
f_b(x^{i,k}_b) = α_b x^{i,k}_b,   i = 1, ..., I, k = 1, ..., K   (32)
f_{t,j}(x^{i,k}_{t,j}) = α_{t,j} x^{i,k}_{t,j},   i = 1, ..., I, j = 1, ..., N_t, k = 1, ..., K   (33)
f_{s,j}(x^{i,k}_{s,j}) = x^{i,k}_{s,j} / { α_{s,j} (x^{i,k}_{s,j})² + β_{s,j} x^{i,k}_{s,j} + γ_{s,j} },   i = 1, ..., I, j = 1, ..., N_s, k = 1, ..., K   (34)
Objective function (1) represents the minimization of the expected value of the
purchase cost of electricity and gas. Constraints (2) and (3) represent the capacity
constraints of the thermal storage tank. Inequality (2) represents the case of the first
time zone to the (I − 1) time zone, and (3) represents the case of the I time zone.
Equation (4) is a constraint on the remaining amount of steam. Inequalities (5)–(8)
represent the capacity constraints of the energy production amounts of two types of
refrigerators, gas turbines, and boilers. Inequalities (9)–(16) are constraints on on/off
decision of each device, and they are represented by a linear inequality based on the
unit commitment problem [3]. For these restrictions, two constraints are used for each
unit, and the first one means that once the unit starts, it must keep its operating state for
a certain period of time. The second one represents the case of stop. Constraint (17) is
related to the storage amount of the storage battery. To consider that a certain amount
of electric power is lost at the time of charge and discharge, $z^{i,k}_{in}$ is multiplied by
efficiency a. The initial charge amount and the storage amount in time zone I are the
same. Constraint (18) represents the capacity of the storage battery.
Equation (19) shows that the power generated by the photovoltaic generation is
divided into the power to satisfy the demand and the power charged in the storage battery.
Constraints (20)–(26) represent non-negative constraints and 0–1 constraints of decision
variables.
Function (27) represents the heat storage amount of the thermal storage tank.
Function (28) represents the purchase amount of the electricity. However, as it does not
consider power selling, a non-negativity constraint is imposed. Function (29) represents the remaining amount of steam. Functions (30)–(34) represent the relational expressions
of the input and output quantities of the gas turbine, the boiler, and the two types of
refrigerators. Function (34) represents the relationship between the input and output
amounts of the steam absorption refrigerator. This function is expressed in the form of
non-convex nonlinear constraints to account for a practical operating plan. Because
these types of constraints are non-convex, they are difficult to deal with.

4 Piecewise Linear Approximation of Nonlinear Constraint Equation

This research model includes the nonlinear constraint Eq. (34) expressing the relationship
between the input and output quantities of the steam absorption refrigerator. To treat the
model as a mixed integer programming problem, (34) is linearized by piecewise linear approximation [4].

      
Let $\hat{f}_{s,j}\!\left(x^{i,k}_{s,j}\right)$ denote the approximation of $f_{s,j}\!\left(x^{i,k}_{s,j}\right)$, and let $\left(x_{s,j,l}, f_{s,j}(x_{s,j,l})\right)$, for all $i, j, k$ ($i = 1,\dots,I$, $j = 1,\dots,N_s$, $k = 1,\dots,K$), be the split points of the function. The approximation $\hat{f}_{s,j}\!\left(x^{i,k}_{s,j}\right)$ is given by Eqs. (36) and (37).
Constraints (38) and (39) represent the SOS2 condition requiring that at most two adjacent $\lambda^{i,k}_{j,l}$ are positive. Because the piecewise linear approximation introduces additional decision variables, the number of decision variables further increases, resulting in a large-scale mixed integer programming problem.

$$x^{i,k}_{s,j} = \sum_{l=1}^{p_j} \lambda^{i,k}_{j,l}\, x_{s,j,l} \tag{35}$$

$$\hat{f}_{s,j}\!\left(x^{i,k}_{s,j}\right) = \sum_{l=1}^{p_j} \lambda^{i,k}_{j,l}\, f_{s,j}\!\left(x_{s,j,l}\right) \tag{36}$$

$$\sum_{l=1}^{p_j} \lambda^{i,k}_{j,l} = 1, \quad \lambda^{i,k}_{j,l} \ge 0, \quad l = 1,\dots,p_j \tag{37}$$

$$\left.\begin{aligned} \lambda^{i,k}_{j,1} &\le \mu^{i,k}_{j,1}\\ \lambda^{i,k}_{j,2} &\le \mu^{i,k}_{j,1} + \mu^{i,k}_{j,2}\\ &\;\;\vdots\\ \lambda^{i,k}_{j,p_j} &\le \mu^{i,k}_{j,p_j-1} + \mu^{i,k}_{j,p_j} \end{aligned}\right\} \tag{38}$$

$$\sum_{l=1}^{p_j} \mu^{i,k}_{j,l} = 1, \quad 0 \le \mu^{i,k}_{j,l} \in \mathbb{Z}, \quad l = 1,\dots,p_j \tag{39}$$

The exact algorithm using the piecewise linear approximation is as follows. At each iteration, we increase the number of split points and improve the accuracy of the piecewise linear approximation. This makes it possible to solve the resulting large-scale mixed integer programming problem.
Piecewise Linear Approximation Algorithm

Step 0: Given an initial number of split points and a tolerance $\varepsilon$.
Step 1: Set the initial split points.
Step 2: Solve the problem with the piecewise linear approximation of constraint Eq. (34).
Step 3: If $\left|\hat{f}_{s,j}\!\left(\hat{x}^{i,k}_{s,j}\right) - f_{s,j}\!\left(\hat{x}^{i,k}_{s,j}\right)\right| > \varepsilon$ at the optimal solution $\hat{x}^{i,k}_{s,j}$, add $\left(\hat{x}^{i,k}_{s,j}, f_{s,j}\!\left(\hat{x}^{i,k}_{s,j}\right)\right)$ to the set of split points.
Step 4: If split points were added, return to Step 2. If no points were added, stop.
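To make the loop concrete, the following Python sketch mirrors Steps 0–4 under stated assumptions: `solve_pwl_milp` is a hypothetical placeholder for the MILP solve with the current piecewise linear approximation of (34), and the coefficients a, b, c belong to a single refrigerator. It is an illustration, not the authors' implementation.

```python
import bisect

def f_sj(x, a, b, c):
    """Non-convex input-output curve of the steam absorption refrigerator, Eq. (34)."""
    return x / (a * x**2 + b * x + c)

def pwl_value(x, points, a, b, c):
    """Evaluate the piecewise linear interpolant of f_sj defined by the split points."""
    xs = sorted(points)
    if x <= xs[0]:
        return f_sj(xs[0], a, b, c)
    if x >= xs[-1]:
        return f_sj(xs[-1], a, b, c)
    r = bisect.bisect_right(xs, x)
    x0, x1 = xs[r - 1], xs[r]
    t = (x - x0) / (x1 - x0)
    return (1.0 - t) * f_sj(x0, a, b, c) + t * f_sj(x1, a, b, c)

def refine_split_points(solve_pwl_milp, points, a, b, c, eps=1e-6, max_iter=20):
    """Steps 1-4: solve, check the approximation error at the optimum, add split points."""
    for _ in range(max_iter):
        x_hat = solve_pwl_milp(points)           # Step 2: placeholder MILP solve
        new_pts = [x for x in x_hat
                   if abs(pwl_value(x, points, a, b, c) - f_sj(x, a, b, c)) > eps]
        if not new_pts:                          # Step 4: nothing added, stop
            return points
        points = sorted(set(points) | set(new_pts))   # Step 3: enlarge the split set
    return points
```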

5 Evaluation of Solution by Stochastic Programming

For evaluating the solution of the stochastic programming method, we use the value
VSS (value of stochastic solution) [5] of the solution of the stochastic programming
problem.
Denoting the random variable by $\xi$, we define the optimization problem for a realization $\xi$ of the random variable as follows:

$$\min_x\; z(x, \xi) = c^T x + \min_y\left\{q^T y \;\middle|\; W y = h - T x,\; y \ge 0\right\} \tag{40}$$

$$\text{s.t.}\quad A x = b,\; x \ge 0 \tag{41}$$

The optimal objective function value RP (recourse problem) of the stochastic programming problem is defined as follows:

$$RP = \min_x\, E_{\xi}\, z(x, \xi) \tag{42}$$

We define the optimal objective function value ADP (average deterministic problem) of the deterministic problem in which the random variable $\xi$ is replaced by its mean value $\bar{\xi}$, and let the optimal solution of this problem be $x(\bar{\xi})$.

$$ADP = \min_x\, z\!\left(x, \bar{\xi}\right) \tag{43}$$

The objective function value $RP(\bar{\xi})$ obtained when $x(\bar{\xi})$ is applied to the stochastic programming problem is defined as follows:

$$RP\!\left(\bar{\xi}\right) = E_{\xi}\, z\!\left(x(\bar{\xi}), \xi\right) \tag{44}$$

VSS is defined as follows:

$$VSS = RP\!\left(\bar{\xi}\right) - RP \tag{45}$$

Because the solution $x(\bar{\xi})$ used in computing $RP(\bar{\xi})$ is a feasible solution to the problem for obtaining RP, the following relation holds:

$$RP \le RP\!\left(\bar{\xi}\right), \quad VSS \ge 0 \tag{46}$$

The problem for obtaining RP is the stochastic programming model formulated as (1)–(34). On the other hand, the problem for finding ADP is a deterministic model in which the output of the photovoltaic generation is fixed at its average value. $RP(\bar{\xi})$ then measures the cost incurred when the deterministic solution of the ADP problem is applied under uncertainty.
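The following Python sketch illustrates how RP, ADP and VSS relate. The functions `solve_stochastic`, `solve_deterministic` and `evaluate_first_stage` are hypothetical placeholders for the corresponding model solves, so this is only an outline of the computation, not the AMPL/Gurobi model used in the experiments.

```python
import numpy as np

def value_of_stochastic_solution(scenarios, probs,
                                 solve_stochastic, solve_deterministic,
                                 evaluate_first_stage):
    # RP: optimal expected cost of the stochastic (recourse) problem, Eq. (42).
    rp, _ = solve_stochastic(scenarios, probs)

    # ADP: deterministic problem with the random PV output replaced by its mean, Eq. (43).
    mean_scenario = np.average(scenarios, axis=0, weights=probs)
    _, x_bar = solve_deterministic(mean_scenario)

    # RP(mean): expected cost when the deterministic first-stage decision x_bar
    # is fixed and each scenario is evaluated with optimal recourse, Eq. (44).
    rp_mean = sum(p * evaluate_first_stage(x_bar, s)
                  for p, s in zip(probs, scenarios))

    return rp_mean - rp   # VSS >= 0 by (46)
```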

Based on the benchmark problem data, we conducted numerical experiments to


obtain a daily operation plan. However, only one steam absorption refrigerator is used.
Demand for power, heat, and steam during the daytime is approximately 20 [MWh], 20 [GJ], and 10 [t], respectively. The capacity of photovoltaic power generation is set to 4 [MW], and four storage battery capacities are compared: 0.5, 1, 1.5, and 2 [MWh].
The initial storage amount $z^{init}_{sb}$ of the storage battery was 30% of the storage battery capacity, and the charge/discharge efficiency $a$ was set to 0.8.
A scenario set representing the uncertainty of photovoltaic power generation is created on a monthly basis from the horizontal total solar insolation in Tokyo for the average year in the database (METPV-11) [6] of NEDO (New Energy and Industrial Technology Development Organization).
According to the Agency for Natural Resources and Energy of the Ministry of Economy, Trade and Industry [7, 8], the annual power generation of photovoltaic power is approximately 1100 [kWh] per 1 [kW] of capacity. To satisfy the relationship between capacity and annual power generation, the scenarios were generated by multiplying the horizontal total solar insolation by a constant. The number of scenarios is the number of days in the month, and the probability of each scenario is 1/(the number of days in the month). We used AMPL as the modeling language and Gurobi 7.5.0 as the solver, with the Gurobi parameter MIPGap set to $10^{-7}$. For the piecewise linear approximation, the initial number of split points was set to 256 and $\varepsilon = 10^{-6}$.
First, we compare RP and $RP(\bar{\xi})$ to evaluate the stochastic programming model. Table 2 shows a comparison of RP and $RP(\bar{\xi})$ in the case where the storage battery capacity is set to 1 [MWh]. The maximum value of VSS is 2,007 [yen], obtained in November, equivalent to 2.3 [%] of the cost reduction due to the introduction of photovoltaic power. The operating cost per day before the introduction of photovoltaic generation was 4,042,763 [yen].
Next, the recovery period for the initial investment cost is calculated. According to
the Ministry of Economy, Trade and Industry [7, 8], the initial investment cost of
photovoltaic generation with a capacity of 1 [MW] or more is 27.5 [yen/kW], and the cost of the NAS battery used as a large storage battery is 4 [$10^4$ yen/kWh]. Based on the internal rate of
return [9], the recovery period for the initial investment cost is calculated by finding
n as shown in (47) given the discount rate r. In this formula, the difference in cost is the
difference in the annual cost before and after installing photovoltaic power generation.

X
n
difference of cost
initial investment cost ¼ ð47Þ
i¼1 ð1 þ r Þi
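A minimal sketch of solving (47) for n is given below. The function and the example figures are illustrative placeholders, not values from this study, and the sketch returns an integer number of years rather than the fractional periods reported in Table 3.

```python
def recovery_period(investment, annual_saving, r=0.01, max_years=60):
    """Smallest n such that the discounted annual savings cover the initial investment, Eq. (47)."""
    npv = 0.0
    for n in range(1, max_years + 1):
        npv += annual_saving / (1 + r) ** n
        if npv >= investment:
            return n
    return None  # not recovered within max_years

# Example call with made-up numbers:
# recovery_period(investment=1.5e8, annual_saving=5.5e6, r=0.01)
```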

The calculation results for a discount rate of r = 1 [%] are listed in Table 3. Comparing the recovery periods, the initial investment cost is recovered earlier with the present stochastic programming model than with the deterministic model. Regarding the storage battery capacity, the larger the capacity, the longer the recovery period.

Table 2. RP and $RP(\bar{\xi})$ ($E^{\max}_{sb} = 1$ [MWh]).

Month | RP [yen] | $RP(\bar{\xi})$ [yen] | Month | RP [yen] | $RP(\bar{\xi})$ [yen]
1 | 3,946,209 | 3,947,316 | 7 | 3,872,237 | 3,873,733
2 | 3,921,607 | 3,921,607 | 8 | 3,880,117 | 3,881,212
3 | 3,906,488 | 3,908,086 | 9 | 3,925,601 | 3,925,601
4 | 3,881,487 | 3,882,983 | 10 | 3,938,757 | 3,938,757
5 | 3,868,814 | 3,870,060 | 11 | 3,953,877 | 3,955,884
6 | 3,896,726 | 3,898,624 | 12 | 3,956,763 | 3,956,763

Table 3. Recovery period for initial investment (r = 1 [%]).

Storage battery capacity | Stochastic programming | Deterministic model
0.5 [MWh] | 27.02 | 27.19
1 [MWh] | 27.50 | 27.74
1.5 [MWh] | 28.00 | 28.30
2 [MWh] | 28.52 | 28.78

6 Concluding Remarks

In this research, we extended the benchmark problem on energy plant operation of large-scale facilities to a problem including photovoltaic power generation and a storage battery, and showed that modeling by the stochastic programming method is useful. A new model for the introduction of large-capacity photovoltaic power generation was developed, which makes it possible to evaluate the introduction of photovoltaic power generation for the purpose of self-generation at large facilities.

References
1. Suzuki, R., Okamoto, T.: An introduction of the energy plant operational planning problem: a
formulation and solutions. In: IEEJ International Workshop on Sensing, Actuation, and
Motion Control (2015)
2. Inui, N., Tokoro, K.: Finding the feasible solution and lower bound of the energy plant
operational planning problem by an MILP formulation. In: IEEJ International Workshop on
Sensing, Actuation, and Motion Control (2015)
3. Shiina, T., Birge, J.R.: Stochastic unit commitment problem. Int. Trans. Oper. Res. 11(1), 19–32 (2004)
4. Nemhauser, G., Wolsey, L.A.: Integer and Combinatorial Optimization. Wiley (1989)
5. Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer, New York
(1997)
6. NEDO New Energy and Industrial Technology Development Organization. http://www.nedo.go.jp/library/nissharyou.html. Accessed 03 Nov 2017
7. Ministry of Economy, Trade and Industry. http://www.enecho.meti.go.jp/category/saving_
and_new/ohisama_power/common/pdf/guideline-2013.pdf. Accessed 18 Dec 2017
8. Ministry of Economy, Trade and Industry. http://www.enecho.meti.go.jp/committee/council/
basic_problem_committee/028/pdf/28sankou2-2.pdf. Accessed 18 Dec 2017
9. Luenberger, D.G.: Investment Science. Oxford University Press (1998)
Randomized-Variants Lower Bounds
for Gas Turbines Aircraft Engines

Mahdi Jemmali(B) , Loai Kayed B. Melhim , and Mafawez Alharbi

Department of Computer Science and Information, College of Science,


Majmaah University, Al-Majmaah 11952, Saudi Arabia
mah jem 2004@yahoo.fr, m.jemmali@mu.edu.sa

Abstract. This paper focuses on the problem of developing a heuristic


model for identical aircraft gas turbine engines maintenance interven-
tions. Each turbine has some parts which require replacement (changing the used part for a new one) at well-determined periods. The maintenance
problem of identical aircraft gas turbine engines will be addressed in this
research, in order to maximize aircraft operation time without affecting
engine maintenance schedule. Maintaining the turbine is performed by
using new or refurbished parts, at specific predetermined periods. These
parts (new or refurbished) have a predetermined lifespan; this research
discusses how to replace a sequence of turbine parts in the turbine maintenance process in order to maximize aircraft operating time. In this research, four heuristics were developed to achieve the proposed goal. Analysis of the obtained results showed that heuristic R4 obtained the best Tumin value.

Keywords: Heuristic · Scheduling · Randomization algorithms ·


Parallel machines

1 Introduction

Cost management in the aviation sector is a critical factor, considering the tight
profit margins and the instability of economic performance. Due to the specificity of the aviation sector, normal errors can be catastrophic and lead to huge losses. The aircraft maintenance sector is one of the most important and most expensive air transport sectors after direct operational cost. Within aircraft maintenance, engine maintenance represents the highest cost and has the greatest effect on aircraft operations and on the continuity of the companies owning these aircraft; this is where the scheduling of engine maintenance operations comes in, to avoid errors and to ensure the longest working period, the lowest downtime, and the highest financial return. The main goal of this research is to maximize aircraft operation time without affecting the engine maintenance schedule. This goal will be

The authors would like to thank the Deanship of Scientific Research at Majmaah
University for supporting this work.
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 949–956, 2020.
https://doi.org/10.1007/978-3-030-21803-4_94

formulated as a maximization of the minimum completion time problem, which is solved by heuristics developed in this work.
Cost reduction based on a modular engine maintenance concept is applied to the Air Force F100-PW-100 engine maintenance in [1]. A case study of the proposed problem was developed in [2]. In the latter work, two models were developed: the first is based on a linear programming formulation, while the second relies on routines that solve the problem approximately by dividing the original problem into different subproblems.
The problem of maximizing the minimum (in general) was first addressed through an approximation scheme in [9]. In [6], the authors proposed an on-line ordinal assignment problem with two objectives: the first was minimizing the lp norm of the makespan, while the other was maximizing the minimum machine completion time. The authors of [3] proposed the first optimal solution using the branch and bound method, utilizing tight lower and upper bounds obtained by developing several algorithmic features. The literature on this kind of problem is extensive. For a comprehensive survey related to the problem of minimizing the maximum, we refer the reader to the work presented by Mokotoff [5]. The researchers in [7] search for optimal solutions to the makespan minimization problem on identical parallel machines; they derive dominance rules that, according to their experiments, outperform existing approaches. The authors of [8] also propose an exact branch-and-bound algorithm that enhances effective dominance criteria for small ratios of n to m, based on structural patterns of optimal schedules.
This paper is organized as follows. Section 2 presents a description of the studied problem. In Sect. 3, we present the proposed heuristics giving the new lower bounds. An experimental study is detailed in Sect. 4. Finally, a conclusion is given in Sect. 5.

2 Problem Description
The maximization of the minimum completion time problem is described as follows. Consider a set P that contains a fixed number of spare parts Spn, which have to be assigned to a deterministic number of turbines Tun. Each turbine is indexed by i and denoted by Tui. The lifespan of each part j is denoted by lpj. Each turbine requires some parts to be replaced, and each turbine holds at most one part at a time. The available parts have no release date, i.e., parts required for the maintenance process are ready for immediate delivery (delivery processing time = 0). This problem focuses on maximizing the minimum turbine operating time, denoted by Tumin. We denote by Clj the cumulative lifespan of part j.
Example 1. Let Spn = 5 and Tun = 2. Table 1 displays the lifespan lpj for each part j.

Table 1. Lifespan-parts distribution for Example 1.

j 1 2 3 4 5
lpj 3 5 2 11 4

We assign all parts to the turbines by applying some algorithm. The schedule given by this assignment method is illustrated in Fig. 1. It is easy to see that turbine 1 receives parts 3, 1 and 4, while turbine 2 receives parts 5 and 2.

[Fig. 1. Parts-turbines dispatching for Example 1: parts 3, 1 and 4 on turbine 1; parts 5 and 2 on turbine 2.]

Based on Fig. 1, turbine 1 has a total lifespan of 16, whereas turbine 2 has a total lifespan of 9. The maximum operating time is Tumax = 16 and the minimum operating time is Tumin = 9. The objective is to maximize the minimum operating time Tumin, so we have to seek a more efficient algorithm that gives a minimum operating time greater than 9.
Using the standard three-field notation of [4], this problem can be denoted
as P ||Cmin .

3 Proposed Lower Bounds

Our study is based on comparing the LPT dispatching rule given in [3] with some proposed lower bounds. In this paper we propose four lower bounds, which are based on randomizing the choice of the part to be scheduled on a turbine.

3.1 Non-increasing Lifespan Order Heuristic (LP T )

We order all parts in non-increasing order of their lifespans. Then, we assign the part having the greatest lifespan to the most available turbine.
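A minimal Python sketch of this rule is given below, assuming a simple min-heap over turbine loads; it is an illustration, not the authors' C++ implementation.

```python
import heapq

def lpt(lifespans, n_turbines):
    """Assign parts, sorted by non-increasing lifespan, to the most available turbine."""
    loads = [(0, t) for t in range(n_turbines)]   # (current operating time, turbine id)
    heapq.heapify(loads)
    assignment = [[] for _ in range(n_turbines)]
    for j, lp in sorted(enumerate(lifespans), key=lambda e: -e[1]):
        load, t = heapq.heappop(loads)             # most available turbine
        assignment[t].append(j)
        heapq.heappush(loads, (load + lp, t))
    return assignment, min(load for load, _ in loads)   # second value is Tu_min

# On Example 1, lpt([3, 5, 2, 11, 4], 2) yields Tu_min = 12 with this sketch,
# improving on the value 9 of the arbitrary assignment in Fig. 1.
```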

3.2 Randomized Based Heuristics Rk

For this type of lower bound, the choice of the part to be scheduled on the turbine having the minimum total operating time is made with a probabilistic method. The lower bounds are based on a probabilistic choice among the k largest parts, with k ∈ {2, 3, 4, 5} for the lower bounds R1, R2, R3 and R4, respectively. The selected part is chosen among the k first parts having the largest lifespans with probability β. This probability is fixed as follows:

– We randomly choose a number r in [1, k]. The selected part will be the rth largest unscheduled part. We schedule the selected part on the most available turbine.
– Denote by Up the number of unassigned parts. If Up < k, then r is chosen randomly in [1, Up].

For a fixed k, we fix the number of iterations to 1000. The randomized parts-turbines algorithm is given as follows:

Algorithm 1. Randomized parts-turbines algorithm: RP T (k)


1: Set it = 1.
2: Set Pk = P.
3: Choose randomly r between [1, k].
4: Schedule the rth largest part, denoted by Lp, on the most available turbine.
5: Pk = Pk \ Lp; if Pk ≠ {} goto 3.
6: Calculate Tu^{it}_{min}.
7: it = it + 1.
8: if it ≤ 1000 goto 2.
9: Stop, return RPT(k) = max_{1 ≤ it ≤ 1000} Tu^{it}_{min}.

Algorithm 1 gives a result for a fixed k, but not the result of the proposed heuristic. Indeed, the proposed heuristic needs the iterations from 2 to k. The following algorithm calculates the value obtained by the proposed heuristic Rk for a fixed k.

Algorithm 2. Randomized based heuristics algorithm Rk


1: for j = 2 to k do
2: Calculate RPT(j).
3: end for
4: Rk = max_{2 ≤ j ≤ k} RPT(j).
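The following Python sketch illustrates Algorithms 1 and 2 under the same assumptions; the function and variable names are ours, and the random choice is a uniform draw over the first min(k, Up) largest unscheduled parts.

```python
import random

def rpt(lifespans, n_turbines, k, iterations=1000, rng=random):
    """RPT(k): repeat the randomized dispatch and keep the best Tu_min found."""
    parts = sorted(lifespans, reverse=True)
    best = 0
    for _ in range(iterations):
        remaining = parts[:]                          # unscheduled parts, largest first
        loads = [0] * n_turbines
        while remaining:
            r = rng.randrange(min(k, len(remaining)))  # pick among the k (or Up) largest
            lp = remaining.pop(r)
            i = loads.index(min(loads))                # most available turbine
            loads[i] += lp
        best = max(best, min(loads))
    return best

def heuristic_r(lifespans, n_turbines, k):
    """R_k = max over j in {2, ..., k} of RPT(j), as in Algorithm 2."""
    return max(rpt(lifespans, n_turbines, j) for j in range(2, k + 1))
```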

4 Experimental Results
This section presents an analysis of the results obtained by the developed heuristics. These heuristics were coded with Microsoft Visual C++ (Version 2013) and executed on an Intel(R) Core(TM) i5-3337U CPU @ 1.8 GHz with 8 GB RAM, running Windows 10 (64-bit).
The proposed lower bounds were tested on a set of instances generated as described in [3]. The lifespan lpj was generated according to different probability distributions, each of which represents a class. The classes are:
– Class 1: lpj is generated from the discrete uniform distribution U [1, 100].
– Class 2: lpj is generated from the discrete uniform distribution U [20, 100].
– Class 3: lpj is generated from the discrete uniform distribution U [50, 100].
– Class 4: lpj is generated from the normal distribution N [50 − 100].
– Class 5: lpj is generated from the normal distribution N [20 − 100].
The generated instances were obtained by the choice of Spn, Tun and Class. The pair (Spn, Tun) has many possibilities and was fixed to the values displayed in Table 2.

Table 2. Choice of the pair (Spn , T un )

Spn T un
10 2,3,5
25 2,3,5,10,15
50 2,3,5,10,15
100 3,5,10,15
250 3,5,10,15

Based on Table 2 we have a total of 1050 instances.


We denote by:
– LB the best (maximum) value obtained after executing all heuristics.
– L the studied lower bound.
– GAP = (LB − L)/L × 100.
– Time the time spent executing the heuristic on the corresponding instance, in seconds; "–" denotes a time of less than 0.001 s.
– Perc the percentage of all 1050 instances for which LB = L.
This study focuses on comparing the proposed lower bounds with the LPT values given in the literature [3].
Table 3 shows the overall Perc of the lower bounds for each heuristic, together with
the corresponding average time.

Table 3. Overall view of lower bounds comparison

LP T R1 R2 R3 R4
P erc T ime P erc T ime P erc T ime P erc T ime P erc T ime
26.8% - 60.3% 0.010 71.3% 0.019 79.4% 0.028 93.0% 0.038

Table 4. Behavior of Gap and T ime according to Spn

Spn LP T R1 R2 R3 R4
Gap T ime Gap T ime Gap T ime Gap T ime Gap T ime
10 1.97 - 0.60 0.001 0.14 0.002 0.03 0.002 0.01 0.003
25 1.57 - 0.58 0.002 0.28 0.004 0.14 0.006 0.04 0.008
50 0.87 - 0.23 0.004 0.10 0.007 0.04 0.011 0.04 0.014
100 0.32 - 0.17 0.009 0.10 0.018 0.06 0.027 0.01 0.036
250 0.07 - 0.03 0.034 0.02 0.066 0.01 0.099 0.00 0.131

Table 3 shows that the heuristic achieving the best lower bound is R4, with Perc = 93% in an average time of 0.038 s, compared to the LPT heuristic which reaches only 26.8%.
The performance measure based on the number of spare parts Spn is given in Table 4. The results show that the performance of the developed heuristics varies with Spn; for example, for heuristics LPT and R1 the Gap decreases as Spn increases, while this is not systematically the case for heuristics R2, R3 and R4. In general, for all heuristics the best Gap value was obtained when Spn = 250. The table also shows that heuristic R4 obtained the best performance measure when Spn = 250, and heuristic LPT obtained the worst when Spn = 10.

Table 5. Behavior of Gap and T ime according to T un

T un LP T R1 R2 R3 R4
Gap T ime Gap T ime Gap T ime Gap T ime Gap T ime
2 0.91 - 0.10 0.002 0.01 0.003 0.00 0.005 0.00 0.006
3 1.51 - 0.64 0.009 0.22 0.016 0.05 0.024 0.01 0.032
5 0.49 - 0.05 0.009 0.03 0.018 0.02 0.026 0.02 0.035
10 1.31 - 0.54 0.013 0.29 0.026 0.15 0.038 0.04 0.051
15 0.44 - 0.18 0.015 0.10 0.029 0.08 0.043 0.04 0.057

Table 5 shows the results of the performance measure based on the number of turbines Tun. The worst performance measures were obtained for heuristic LPT when Tun = 3, and the best performance measures were obtained for heuristics

Table 6. Behavior of Gap and T ime according to Class

Class LP T R1 R2 R3 R4
Gap T ime Gap T ime Gap T ime Gap T ime Gap T ime
1 0.59 - 0.08 0.010 0.05 0.019 0.03 0.029 0.02 0.037
2 0.85 - 0.24 0.010 0.11 0.019 0.06 0.028 0.03 0.038
3 0.99 - 0.57 0.010 0.22 0.018 0.09 0.028 0.01 0.037
4 1.13 - 0.21 0.010 0.08 0.019 0.05 0.028 0.03 0.037
5 1.11 - 0.49 0.010 0.22 0.019 0.08 0.028 0.02 0.038

R4 and R3 when Tun = 2. It can also be noted that increasing Tun does not significantly affect the performance of the developed heuristics; for example, the performance measure of R4 varies only from 0.00 to 0.04 when the number of turbines Tun increases from 2 to 15. Heuristic R4 again yields the best performance in this test.

Table 7. Gap details for all lower bounds

Spn T un LP T R1 R2 R3 R4
10 2 1.24 0.01 0.00 0.00 0.00
3 4.39 1.80 0.43 0.08 0.02
5 0.29 0.00 0.00 0.00 0.00
25 2 1.42 0.30 0.02 0.00 0.00
3 1.53 0.67 0.34 0.10 0.01
5 1.55 0.17 0.10 0.09 0.08
10 3.29 1.67 0.91 0.46 0.06
15 0.07 0.07 0.00 0.05 0.05
50 2 0.07 0.00 0.00 0.00 0.00
3 1.31 0.52 0.21 0.03 0.00
5 0.46 0.07 0.03 0.02 0.01
10 1.42 0.27 0.14 0.07 0.08
15 1.08 0.29 0.14 0.09 0.09
100 3 0.24 0.17 0.10 0.04 0.00
5 0.11 0.03 0.01 0.01 0.01
10 0.43 0.19 0.10 0.05 0.01
15 0.51 0.29 0.21 0.14 0.01
250 3 0.06 0.05 0.03 0.02 0.00
5 0.03 0.01 0.00 0.00 0.00
10 0.08 0.03 0.02 0.01 0.00
15 0.09 0.05 0.03 0.02 0.00

Table 6 explores the performance measure based on Class. The performance of heuristics R1, R2, R3 and R4 was much better than that of LPT. Increasing the value of Class does not significantly affect the performance measure for most of the heuristics, and, as in the previous tests, heuristic R4 achieved the best results.
For more details about the gap values, we display Table 7. This table shows that the maximum gap is 4.39, obtained when Spn = 10 and Tun = 3.

5 Conclusion

We addressed the problem of maximizing the minimum completion maintenance time in the case of identical turbines. Several heuristics were developed and tested on 1050 instances of part lifespans. The heuristics are based on randomized methods with several variants. The results showed that heuristic R4 has the best performance measure on the used instances.

Acknowledgement. The authors would like to thank the Deanship of Scientific


Research at Majmaah University for supporting this work.

References
1. Edmunds, D.B.: Modular engine maintenance concept considerations for aircraft
turbine engines. Aircr. Eng. Aerosp. Technol. 50(1), 14–17 (1978)
2. Gharbi, A.: Scheduling maintenance actions for gas turbines aircraft engines. Con-
straints 10, 4 (2014)
3. Haouari, M., Jemmali, M.: Maximizing the minimum completion time on parallel
machines. 4OR 6(4), 375–392 (2008)
4. Lawler, E.L., Lenstra, J.K., Kan, A.H.R., Shmoys, D.B.: Sequencing and scheduling:
algorithms and complexity. Handb. Oper. Res. Manag. Sci. 4, 445–522 (1993)
5. Mokotoff, E.: Parallel machine scheduling problems: a survey. Asia-Pac. J. Oper.
Res. 18(2), 193 (2001)
6. Tan, Z., He, Y., Epstein, L.: Optimal on-line algorithms for the uniform machine
scheduling problem with ordinal data. Inf. Comput. 196(1), 57–70 (2005)
7. Walter, R., Lawrinenko, A.: Effective solution space limitation for the identical par-
allel machine scheduling problem. Technical report, Working Paper (2014). https://
fhg.de/WrkngPprRW
8. Walter, R., Wirth, M., Lawrinenko, A.: Improved approaches to the exact solution
of the machine covering problem. J. Sched. 20(2), 147–164 (2017)
9. Woeginger, G.J.: A polynomial-time approximation scheme for maximizing the min-
imum machine completion time. Oper. Res. Lett. 20(4), 149–154 (1997)
Robust Design of Pumping Stations
in Water Distribution Networks

Gratien Bonvin(B) , Sophie Demassey, and Welington de Oliveira

Center for Applied Mathematics, Mines ParisTech, PSL Research University,


Sophia Antipolis, France
{gratien.bonvin,sophie.demassey,
welington.oliveira}@mines-paristech.fr

Abstract. Restricted to gravity-fed networks, most water network


design models minimize investment costs under a static peak water
demand scenario. In networks equipped with pumping stations, design
models should also account for operation costs incurred by the pump
energy consumption that depends on dynamic demand and tariff. Eval-
uating the lifetime operation costs amounts to solving a large-scale non-
convex combinatorial optimization problem for each considered design.
In this paper, we address the pressurized water network design problem
with a joint optimization of the pump investment and operation costs
through a stabilized Benders’ decomposition. To reduce the complexity
of the operational subproblem, we decompose the scheduling horizon in
representative days, and relax the discrete and non-convex components
of the hydraulic model. We also evaluate the design robustness on stress-
day scenarios and derive feasibility cuts using a dominance argument.
Experiments on a typical rural branched water distribution network with
one year of historical data show the accuracy of our approximations and
the significant savings expected from the optimal pump resizing.

Keywords: Pressurized water network design ·


Stabilized Benders’ decomposition ·
Mixed integer nonlinear programming

1 Introduction
While the lifetime of pipes and water tanks usually reaches 100 years, the mean
lifetime of a pump is closer to 20 years. Operators of water networks must then
periodically proceed to the rehabilitation of pumping stations with the character-
istics of the other network assets already fixed. The problem is complex because,
besides the strategical level–which pump combination to install?–it requires to
investigate the operational level–how to operate the installed pumps?–to evalu-
ate and minimize the lifetime costs over the set of pump combinations. More-
over, evaluating the minimum operation costs brings into play, together, dynamic
water demand and energy tariff profiles, discrete pump scheduling decisions, non-
convex hydraulic laws, and uncertain long-term forecasts. Actually, solving the
© Springer Nature Switzerland AG 2020
H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 957–967, 2020.
https://doi.org/10.1007/978-3-030-21803-4_95

operational subproblem, known as pump scheduling problem, deterministically


on a daily horizon is already considered challenging (see e.g. [7]). Optimiza-
tion methods dedicated to the design of water distribution networks have been
proposed for more than four decades. A common approach is to combine an
evolutionary algorithm with a hydraulic simulator (see e.g. [9,10] and references
therein). The approach deals accurately with the short-term dynamic of the net-
work operation, but it is inherently a heuristic and provides no performance
guarantee. Mathematical programming approaches handle the hydraulics explic-
itly as non-linear constraints to address the pipe layout design problem [2,4,12].
They provide guarantee, but neglect the dynamic by evaluating the feasibility
of operating the network on a static worst-case water demand scenario.
In this paper, we address the pumping station design problem precisely,
with an optimization approach combining the strengths of both previous strate-
gies. While the overall approach is generic, the operational level is presented
in the context of the FRD network, a branched network that is described in
details in [3]. The approach is built on a decomposable formulation of the prob-
lem including the operation scheduling subproblem as a large-scale non-convex
mixed-integer non-linear program with uncertain data. We apply a stabilized
Benders’ decomposition [1] to solve the problem and propose different approx-
imations for the subproblem to reduce its complexity while maintaining some
performance guarantees. First, we decompose the multi-year scheduling horizon
in a restricted set of seasonal representative days. This common assumption
in long-term planning models fits well with the daily periodicity of drinking
water distribution, and allows to separate the scheduling subproblem in inde-
pendent daily subproblems (once a pump configuration is given). As optimizing
iteratively each subproblem would remain too time consuming, we propose to
relax the integrality constraints and to convexify the hydraulic constraints. This
results in convex continuous non-linear programs providing under-estimates of
the operation costs and dual information to derive Benders cuts. Second, we
handle the long-term uncertainties by enforcing the robustness of the solution
on hypothetical stress days characterized, e.g., by a high water demand and the
outage of one pump. By disregarding optimality but forcing feasibility on stress
days we can, for the class of networks in consideration, aggregate identical pumps
and, therefore, reduce the size and complexity of the subproblem. Furthermore,
we exhibit a dominance relation between pump combinations that allows to
generate multiple feasibility cuts from one infeasible solution. Our experiments
on the FRD network show the accuracy of our approximations: the impact of
the horizon decomposition is negligible while the continuous convex relaxation
induces a deviation of the optimum lower than 4.3%. Finally, the annual savings
(in terms of investment and operation costs) expected from the rehabilitation
are estimated to 32%.
The paper is organized as follows: Sect. 2 presents a formulation of the
problem. Section 3 describes the stabilized Benders’ decomposition and Sect. 4
the management of infeasibility under the dominance argument. Computational
results are provided in Sect. 5 and some conclusions in Sect. 6.

2 Optimal Design of a Pumping Station


This section defines our design model, by first describing how the network oper-
ates in our application case. This simpler operation model detailed in [3] is a
specialization of standard models to the network characteristics [6].

2.1 Operation of a Branched Network


The water distribution network FRD [3] can be represented as a directed graph
G = (J, L). The water flows from a source r ∈ J to elevated tanks j ∈ JT ⊆ J
through, successively: fixed speed pumps k ∈ K ⊆ L set in parallel at the
pumping station s ∈ J, directed pipes l ∈ LP ⊆ L connected by junctions
j ∈ JJ ⊆ J, and pressure reducing valves l ∈ LV ⊆ L. The dynamic state of the
JT ×T
system over a given time horizon T is driven by the water demand D ∈ R+
at the tanks and is governed by complex hydraulic laws of conservation of flow
and pressure through the network. A standard model is defined as

PTK = {(x, q, h) ∈ {0, 1}K×T × RL×T


+ × RJ×T
+ | (1) − (9)},

with x the on/off state of the pumps, q the flow through the arcs, and h the
head (sum of pressure and elevation in m) at the nodes, and:

 
$$\sum_{ij\in L} q_{ijt} = \sum_{ji\in L} q_{jit}, \quad t\in\mathcal{T},\; j\in J_J \tag{1}$$

$$\sum_{ij\in L} q_{ijt} = S_j\,(h_{jt} - h_{jt-1}) + D_{jt}, \quad t\in\mathcal{T},\; j\in J_T \tag{2}$$

$$h_{j0} = h_{jT} = H_j^0, \quad j\in J_T \tag{3}$$

$$H_j^{\min} \le h_{jt} \le H_j^{\max}, \quad t\in\mathcal{T},\; j\in J_T \tag{4}$$

$$h_{it} - h_{jt} \ge 0, \quad t\in\mathcal{T},\; ij\in L_V \tag{5}$$

$$\alpha^{\min} \le h_{st} - h_{rt} \le \alpha^{\max}, \quad t\in\mathcal{T} \tag{6}$$

$$Q^{\min}_{\kappa_k}\, x_{kt} \le q_{kt} \le Q^{\max}_{\kappa_k}\, x_{kt}, \quad t\in\mathcal{T},\; k\in K \tag{7}$$

$$h_{it} - h_{jt} \ge \Phi_{ij}(q_{ijt}), \quad t\in\mathcal{T},\; ij\in L_P \tag{8}$$

$$h_{st} - h_{rt} \le \Psi_{\kappa_k t}(q_{kt}) + M(1 - x_{kt}), \quad t\in\mathcal{T},\; k\in K. \tag{9}$$

In this model, the time horizon is discretized T = {1, . . . , T } with a reso-


lution of typically 1 hr or 2 hrs in which the system is assumed to operate in
steady state. Flows qlt and demands Djt are given in volume (in m3 ) for the
duration of a time step t ∈ T . Constraints (1) and (2) enforce the conservation
of flow at junctions and tanks. In (2), a tank j ∈ JT is assumed to be a vertical
cylinder of area Sj (in m2 ). Constraints (3) fix the initial and final volumes and
Constraints (4) the security limits with 0 ≤ Hjmin ≤ Hj0 ≤ Hjmax (in m3 ). Con-
straints (5) model the pressure reduction at the valves. Constraints (6) enforce
bounds 0 < αmin ≤ αmax on the head increase to limit leaks for instance. Let

$\kappa_k$ denote the class of pump $k\in K$. Constraints (7) limit the pump operation range, given $0 < Q^{\min}_\kappa \le Q^{\max}_\kappa$ (in m³), and bind flow values and pump activation states. Constraints (8) model the head loss due to friction in pipes. For each pipe $ij\in L_P$, the head loss-flow coupling function $\Phi_{ij}$ can be accurately approximated by a quadratic function $\Phi_{ij}(q) = A_{ij} q + B_{ij} q^2$ with $A_{ij} \ge 0$, $B_{ij} \ge 0$. Constraints (9) synchronize (given a large enough M value) the head increase of the active pumps. Function $\Psi_{\kappa t}$ depends on the manufacture characteristics $\kappa$ and on the ageing t of the pump. It can be accurately fitted from operating points as a quadratic function $\Psi_{\kappa t}(q) = \alpha_{\kappa t} - \beta_{\kappa t} q^2$, with $\alpha_{\kappa t} \ge 0$ and $\beta_{\kappa t} \ge 0$. We highlight that the head-flow coupling constraints (8) and (9) are actually equalities in the original (thus non-convex) formulation of the pump scheduling problem, but it is shown in [3] that the optimality gap of this relaxation is small.
Finally, the financial cost of operation plan (x, q, h) is mainly incurred by the
purchase of the electricity consumed by pumping, namely:
 
$$C_T^K(x, q, h) = \sum_{t\in\mathcal{T}}\sum_{k\in K} C_t\, \Gamma_{\kappa_k t}(x_{kt}, q_{kt}), \tag{10}$$
with Ct ≥ 0 the actualized electricity price on period t and Γκt the power
consumption function for each active pump of class κ and ageing t defined as a
linear fit Γκt (xkt , qkt ) = λκt xkt + μκt qkt .
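As an illustration of these fits, the short Python sketch below estimates the coefficients of Ψ and Γ by least squares from a few operating points. The sample data are placeholders, not measurements from the FRD network or the KSB catalog.

```python
import numpy as np

# Placeholder operating points for one pump class (flow, pump head, electric power).
q = np.array([20.0, 40.0, 60.0, 80.0])         # flow (m^3 per time step)
head = np.array([120.0, 117.0, 112.0, 105.0])   # head increase (m)
power = np.array([45.0, 60.0, 75.0, 90.0])      # power (kW)

# Pump curve Psi(q) = alpha - beta * q^2: linear least squares in the basis (1, -q^2).
A = np.vstack([np.ones_like(q), -q**2]).T
alpha, beta = np.linalg.lstsq(A, head, rcond=None)[0]

# Power curve Gamma(x, q) = lambda * x + mu * q, fitted with the pump switched on (x = 1).
B = np.vstack([np.ones_like(q), q]).T
lam, mu = np.linalg.lstsq(B, power, rcond=None)[0]

# Pipe head loss Phi(q) = A_ij * q + B_ij * q^2 (coefficients are nonnegative in the model).
def head_loss(flow, a_ij, b_ij):
    return a_ij * flow + b_ij * flow**2
```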

2.2 Robust Design and Relaxed Operation

In the considered water network design problem, only the set K of pumps must
be sized in a way to satisfy the future water demand D and to minimize the
global cost–sum of the actualized investment and operation costs–over a life
span T of typically 20 years. The number of pumps in K is limited by the
capacity N ∈ N∗ of the pumping station. Each pump is selected from a given
set of candidate classes κ ∈ K and acquired new at time t = 0 at a fixed
investment cost Iκ ≥ 0. We assume that the maximal efficiency and the head
increase at constant flow for all pumps decrease by 1% each year, according to the empirical ageing model of [5]. Under this hypothesis, we assume in the definition of $\Psi_{\kappa t}$ that $\alpha_{\kappa t} = \left(1 - \frac{v_t}{100}\right)\alpha_{\kappa 0}$ and $\beta_{\kappa t} = \left(1 - \frac{v_t}{100}\right)\beta_{\kappa 0}$ for any time
t ∈ T with vt the duration from time 0 to t in years. Also, because the power
consumption can alternatively be formulated as the product of the flow and the
head increase divided by the efficiency, we assume that it does not change in
time, i.e. λκt = λκ and μκt = μκ for t ∈ T in the definition of Γκt .
Assuming that the water demand (Dt )t∈T and the electricity price (Ct )t∈T
are given, the optimal design problem is to find a set K of at most N pumps
of classes in K and an operation plan (x, q, h) ∈ PTK of minimum investment
and operation costs. As the life span extends into years while the operation
model requires a time resolution in minutes or hours, this model is not practical
due to its complexity, its dimension and the stochasticity of the water demand.
To address these issues, we formulate a robust variant of the problem where
the time horizon is decomposed into a given set of regular (resp. peak) days

d ∈ DR (resp. d ∈ DP ). Each day d ∈ D = DR ∪ DP is characterized by


water demand and electricity tariff profiles over the daily time horizon T d which
are representative for a number Ld of days over the life span T . Thanks to
Constraints (3) (which forces the tank’s water level at the end of the day to be
equal to that of the beginning of the day) the subproblem of determining the
minimum cost operation plan (x, q, h) ∈ PTK associated to a pump combination
K is decomposed into independent daily subproblems:

$$\min_{(x,q,h)\in P_T^K} C_T^K(x, q, h) = \sum_{d\in D} L_d \min_{(x^d,q^d,h^d)\in P_{T^d}^K} C_{T^d}^K(x^d, q^d, h^d). \tag{11}$$

Because optimizing over $P_{T^d}^K$ remains challenging, we propose to relax some operational constraints which are not structural for the long-term horizon. We consider a different relaxation $R_{T^d}^K$ depending on whether d is a regular or a peak day. The purpose of peak days is to enforce the robustness of the solutions by simulating stress operation conditions, including a high demand. We check the feasibility of the pump combinations under these extreme conditions but, if feasible, we neglect the associated operation costs as these conditions are considered exceptional. More specifically, we consider the sum in (11) only over the index set of regular days $D_R$, and we use the peak days to check feasibility (of the pump configuration), as detailed in Sect. 3 below. On such stress days we also ignore the minimum flow constraints in (7) by setting $Q^{\min}_\kappa = 0$. Under these assumptions, all the pumps in a class can be assumed to operate equally and the size of model $R_{T^d}^K$ can be reduced accordingly, by aggregating the variables per pump class and by relaxing the integrality constraints on the aggregated state variables for each class $\kappa\in K$ such that $\alpha_{\kappa 0} \ge \alpha^{\max}$.
To allow a quick computation of the operation costs of a pump combination over the regular days, as well as the use of dual information in the overall solution algorithm described in the next section, we define $R_{T^d}^K$ as the continuous relaxation of $P_{T^d}^K$ for any regular day $d\in D_R$, i.e., the feasible sets $P_{T^d}^K$ in (11) are replaced with $R_{T^d}^K$.

2.3 Pump Investment Variables, Constraints and Costs


As the considered network has only one pumping station, a pump combination
is uniquely determined by the number of pumps of each class it contains. An
alternative representation, with more symmetries but that is better suited to
Benders cut generation, is given by yκn ∈ {0, 1} for κ ∈ K and n ∈ {1, . . . , N }
such that yκn = 1 if and only if the combination has at least n pumps of class
κ. For a fixed ordering of the pumps, each combination corresponds to exactly
one binary vector y such that

$$\sum_{\kappa\in K}\sum_{n=1}^{N} y_{\kappa n} \le N, \qquad y_{\kappa n} \le y_{\kappa\, n-1}, \quad \text{for all } \kappa\in K \text{ and } n = 2,\dots,N. \tag{12}$$

Let $N_\kappa(y) = \sum_{n=1}^{N} y_{\kappa n}$ be the number of pumps of class $\kappa$ in the combination y. The investment cost of acquiring $N_\kappa(y)$ pumps of class $\kappa$ is $I_\kappa N_\kappa(y)$, with given

Iκ ≥ 0. Having defined all the elements related to the variables of investment


and operation, we are now in position to state the considered formulation for
the robust design optimization problem:
⎧   K
⎪ min
⎨ y,x,q,h κ∈K Iκ Nκ (y) +
d d d
d∈DR Ld CTd (x , q , h )

s.t. y binary satisfying (12) and xd ≤ y ∀d ∈ DR ∪ DP (13)



⎩ K
(x , q , h ) ∈ RTd
d d d
∀d ∈ DR ∪ DP .
The constraint $x^d \le y$ means that a pump can be operated on day d only if the configuration y accounts for its installation.

3 Benders’ Decomposition
Problem (13) has a decomposable structure: we can separate the investment
variables y from the operational ones (x, q, h) to rewrite (13) as

$$\min_y\; f(y) \quad \text{s.t. } y \text{ binary satisfying (12) and } c(y) = 0. \tag{14}$$

In this formulation, the objective function f is




⎨ xdmin CTKd (xd , q d , hd )
  ,q d ,hd
f (y) = Iκ Nκ (y) + Ld s.t.
⎪ (xd , q d , hd ) ∈ RK Td ∀d ∈ DR (15)

κ∈K d∈DR xd ≤ y ∀d ∈ DR .

Peak days are handled by the constraint function c:



$$c(y) = \begin{cases} 0 & \text{if there exist } (x^d, q^d, h^d) \in R_{T^d}^K \text{ s.t. } x^d \le y \;\; \forall d\in D_R\cup D_P\\ \infty & \text{otherwise.}\end{cases}$$

In the process of solving (14), both f and c are approximated by cuts.


Benders cuts. Note that function f is convex but nonsmooth. Therefore, optimality cuts for (14) are nothing but linearizations of f, computed by making use of subgradients. Let $\tilde{y}$ be a given combination of pumps. Then a subgradient $\tilde{s}$ of f at $\tilde{y}$ is a vector of the same dimension as $\tilde{y}$, constructed from the dual variables associated with the constraints $x^d \le y$, for all $d\in D_R$, as described in [1, Lemma 1]. The optimality cut is thus given by $f(\tilde{y}) + \langle\tilde{s}, \cdot - \tilde{y}\rangle$ and satisfies (due to convexity of f) $f(\tilde{y}) + \langle\tilde{s}, y - \tilde{y}\rangle \le f(y)$ for all y.
Feasibility cuts. Suppose that $\tilde{y}$ is infeasible for problem (14). In order to exclude such a point from the set of candidate solutions for (14), we consider the following linear constraint: $\sum_{\kappa\in K} y_{\kappa, N_\kappa(\tilde{y})+1} \ge 1$, with the convention $y_{\kappa, N+1} = 0$.

3.1 Stabilized Cutting-Plane Algorithm


Let $\ell$ be an iteration counter and $\underline{z}^\ell$ (resp. $\bar{z}^\ell$) be the lower (resp. upper) bound available at iteration $\ell$ for the optimal value of (14). We initialize $\underline{z}^0 = 0$ and $\bar{z}^0 = \infty$ in the stabilized Benders' decomposition of [1] to generate a sequence

$\{y^\ell\}$ of trial points for (14). We split such a sequence according to the feasibility of its elements: $O^\ell = \{\iota\in\{0,\dots,\ell\} : c(y^\iota) = 0\}$ and $F^\ell = \{0,\dots,\ell\}\setminus O^\ell$. Note that $F^\ell$ gathers (up to iteration $\ell$) the points that have been proved infeasible for problem (14). By setting $\ell = 0$, starting with a vector $y^0$ and defining the incumbent point $\hat{y}^0 = y^0$, our variant of [1, Alg. 4] defines trial points by iteratively solving the following MILP:

$$y^{\ell+1} \in \left\{\begin{aligned}
\arg\min_y\;& \tfrac{1}{2}\langle 1 - \hat{y}^\ell,\, y\rangle\\
\text{s.t.}\;& y \text{ binary satisfying (12)}\\
& f(y^\iota) + \langle s^\iota, y - y^\iota\rangle \le (\underline{z}^\ell + \bar{z}^\ell)/2 \quad \forall \iota\in O^\ell\\
& \textstyle\sum_{\kappa\in K} y_{\kappa, N_\kappa(y^\iota)+1} \ge 1 \quad \forall \iota\in F^\ell.
\end{aligned}\right. \tag{16}$$

If this master program is infeasible, then $\underline{z}^{\ell+1} = (\underline{z}^\ell + \bar{z}^\ell)/2$ is a valid lower bound for (14). Otherwise, we set $\underline{z}^{\ell+1} = \underline{z}^\ell$ and check the feasibility of $y^{\ell+1}$: if it is infeasible we set $F^{\ell+1} = F^\ell \cup \{\ell+1\}$ (a new feasibility cut is added). On the other hand, if $c(y^{\ell+1}) = 0$ we then update $O^{\ell+1} = O^\ell \cup \{\ell+1\}$ (a new Benders cut is added), compute $f(y^{\ell+1})$ and a subgradient $s^{\ell+1}$, and set $\bar{z}^{\ell+1} = \min\{\bar{z}^\ell, f(y^{\ell+1})\}$. If $\bar{z}^\ell > f(y^{\ell+1})$ then the algorithm updates the incumbent point, $\hat{y}^{\ell+1} = y^{\ell+1}$; otherwise it remains unchanged. In any case, the algorithm updates $\ell = \ell+1$ and repeats this process until the relative gap $(\bar{z}^\ell - \underline{z}^\ell)/\bar{z}^\ell \le \text{tol}$, for a given tolerance $\text{tol} > 0$. We refer to [1] for further details on the algorithm.
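The following Python sketch outlines this loop in condensed form. Here `solve_master`, `eval_f_and_subgradient` and `is_feasible` are placeholders for the MILP master (16), the regular-day subproblems and the peak-day feasibility check, respectively; the control flow is our simplified reading of [1, Alg. 4], not the authors' code.

```python
def stabilized_benders(y0, solve_master, eval_f_and_subgradient, is_feasible,
                       tol=1e-3, max_iter=200):
    lb, ub = 0.0, float("inf")
    opt_cuts, feas_cuts = [], []          # data backing the sets O^l and F^l
    # Evaluate the starting point and use it as the first incumbent.
    if is_feasible(y0):
        f0, s0 = eval_f_and_subgradient(y0)
        opt_cuts.append((y0, f0, s0))
        ub = f0
    else:
        feas_cuts.append(y0)
    y_hat = y0
    for _ in range(max_iter):
        if ub < float("inf") and (ub - lb) / ub <= tol:
            break
        level = (lb + ub) / 2 if ub < float("inf") else float("inf")
        y = solve_master(y_hat, opt_cuts, feas_cuts, level)   # MILP (16), placeholder
        if y is None:                      # master infeasible: the level is a valid lower bound
            lb = level
            continue
        if not is_feasible(y):             # peak-day check failed: add a feasibility cut
            feas_cuts.append(y)
            continue
        f_y, s_y = eval_f_and_subgradient(y)    # regular-day subproblems: Benders cut
        opt_cuts.append((y, f_y, s_y))
        if f_y < ub:
            ub, y_hat = f_y, y             # new incumbent
    return y_hat, lb, ub
```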

4 Infeasibility, Dominance and Fault Tolerance

In this section, we show how to accelerate the convergence of the solution algo-
rithm by using a dominance relation on the set of combinations to generate more
than one feasibility cut at a time, including in a preprocessing step. To simplify
the presentation, we fairly assume that a combination which is infeasible for a
regular day is also infeasible for a peak day. We finally integrate to the definition
of peak days a model of robustness to pump failures.

4.1 Dominance

By definition, the maximal flow a pump combination y can deliver for a given head increase $\alpha = h_s - h_r$ corresponds to activating all $N_\kappa(y)$ pumps of each class $\kappa\in K$ with $\alpha_{\kappa 0} > \alpha$. It is thus equal to $Q_y(\alpha) = \sum_{\kappa\in K} N_\kappa(y)\sqrt{\max\!\left(0, \frac{\alpha_{\kappa 0} - \alpha}{\beta_{\kappa}}\right)}$. We say that a combination y dominates y′, and we write $y' \preceq y$, if $Q_{y'} \le Q_y$ on the allowed operation range $[\alpha^{\min}, \alpha^{\max}]$. The following proposition shows that all combinations dominated by an infeasible combination are infeasible too.

Proposition 1. For any combinations y and y′ and peak day $d\in D_P$, if $y' \preceq y$ and y is infeasible for (14), so is y′.

Proof. For any combination y, the function $Q_y$ is strictly decreasing in the head increase, so $A_y = \{(q, \alpha)\in\mathbb{R}^2 \mid 0\le q\le Q_y(\alpha^{\min}),\ \alpha^{\min}\le\alpha\le Q_y^{-1}(q)\}$

identifies the set of pairs of total flow and head increase which can be operated by y at any time t (because we relaxed the lower bound $Q^{\min}_\kappa$ on peak days). Then for any point in $R^d(y') = \{(x', q', h')\in R_{T^d}^K \text{ s.t. } x^d\le y'\ \forall d\in D_P\}$, which is built on a sequence of $T_d$ elements of $A_{y'}\subseteq A_y$, there exists a corresponding point in $R^d(y)$.

4.2 Generation of Feasibility Cuts


Suppose that, in the cutting-plane algorithm, a candidate combination $y^{\ell+1}$ is proved to be infeasible on some peak day. Proposition 1 implies that several feasibility cuts can be generated at the same iteration instead of a single one: it amounts to searching for combinations that are dominated by $y^{\ell+1}$. Dominance clearly defines a partial order relation on the set of combinations. The dominance of y′ by y can be checked by proving that $Q_y - Q_{y'} \ge 0$ at each point of non-differentiability $\{\alpha^{\min}, \alpha^{\max}\}\cup\{\alpha_\kappa \mid \kappa\in K\}$. However, the class of dominated combinations can reasonably be computed only for small values of $|K|$ or N. We propose to compute it in a heuristic way by observing that, if $p_\kappa$ denotes the power of a pump $\kappa$ at its maximal efficiency, then the reference power $P(y) = \sum_{\kappa\in K} N_\kappa(y)\, p_\kappa$ of y is likely greater than the reference power of the combinations it dominates. In our implementation, we first compute and sort the list C of combinations of at most N pumps by increasing order of reference power. When y is proved to be infeasible, we iterate on the combinations with lower reference power, by decreasing order, and check if they are dominated. Each dominated combination is then used to enlarge the set $F^\ell$ in (16) indexing feasibility cuts.
As a preprocessing step for the algorithm, to accelerate its convergence, we also propose to initialize $F^0$ by greedily evaluating the list C. We iteratively pick a combination $y\in C$ of median value P(y) and check the feasibility of the continuous relaxation of $R_{T^d}^K$ on peak days $d\in D_P$. If y is infeasible, then we update $F^0$ as previously. Otherwise, we remove all y′ from C with $P(y') \ge P(y)$ such that $y\preceq y'$, as they are likely feasible too.
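A sketch of this dominance test is given below. The square-root expression for $Q_y$ follows from the pump curve $\Psi(q) = \alpha - \beta q^2$ and should be read as our interpretation of the formula above; the data structures (a dict mapping pump class to number of pumps) are assumptions of the sketch.

```python
import math

def max_flow(y, alpha0, beta0, alpha):
    """Q_y(alpha): maximal total flow deliverable by combination y at head increase alpha."""
    return sum(n * math.sqrt(max(0.0, (alpha0[c] - alpha) / beta0[c]))
               for c, n in y.items())

def dominates(y, y_prime, alpha0, beta0, alpha_min, alpha_max):
    """True if y' is dominated by y, i.e. Q_{y'} <= Q_y on [alpha_min, alpha_max].
    The comparison is checked at the points of non-differentiability, as in Sect. 4.2."""
    breakpoints = {alpha_min, alpha_max} | {a for a in alpha0.values()
                                            if alpha_min <= a <= alpha_max}
    return all(max_flow(y_prime, alpha0, beta0, a) <= max_flow(y, alpha0, beta0, a)
               for a in breakpoints)
```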

4.3 Robustness to a Pump Outage


Finally, we propose to add another degree of robustness to the design problem
by simulating the inoperability of one pump on peak days (due to possible failure
or maintenance). To that purpose, we redefine that a combination y is feasible
for (14) if all variations of y defined by setting off one pump (of any type) on
peak days are feasible. This new assumption only impacts (and speeds up) the
algorithms described above to initialize and update the index set F  of feasibility
cuts.

5 Numerical Assessments on a Real-Life Instance


In this section, we consider the rural drinking water distribution network FRD
investigated in [3]. The resulting design problem (13) was constructed with data
from the historical demand profiles of the year 2013 (see [3] for more details). We

assume that all the 6 existing pumps should be replaced by at most N = 6 new
ones selected in the KSB manufacturer catalog [8]. We have preselected a set K
of 19 classes of pumps compatible with the network allowed range of pressure
[αmin = 91, αmax = 140]. For each class κ ∈ K, we interpolated curves Ψκ and Γκ

from the catalog data and estimated the investment cost $I_\kappa = 4\,400\left(\frac{p_\kappa}{74.6}\right)^{0.67} + 19\,300\left(\frac{p_\kappa}{52}\right)^{0.77}$ following ([11], Table 9-50). Considering a planning horizon of
T =20 years, we built a set DR of 12 regular days, each representative of a week
day or a week-end in six 2-months periods (January-February, March-April,...)
of any year. For each regular day $d\in D_R$, we define $L_d = L_d^0 \sum_{l=1}^{20} (1+\tau)^{1-l}$ with $L_d^0$ the number of days represented by d in 2013 and $\tau = 5\%$ the discount
rate. Demand and tariff profiles Dd and C d for a regular day d are computed in
average over the L0d represented days of 2013. The unique peak day is built from
the day in 2013 with the highest instantaneous demand and by initializing the
tanks at their minimum level H min . We fix the time resolution (duration of the
time steps) to 2 h for regular days and to 4 h for peak days.
The computations were performed on a Xeon E5-2650V4 2.2 GHz with 254
GB RAM. The algorithms were implemented in Python (with tol = 10−3 ), and
the master and slave programs solved with Gurobi 7.0.2.

5.1 Numerical Results


The preprocessing algorithm identified 1,902 infeasible combinations of at most
5 pumps (among 42,504) in 275 s, with 83% of the time spent comparing combinations. This resulted in an initial set $F^0$ of 20,757 infeasible combinations of at most 6 pumps, which represents 96% of $F^\ell$ at the last iteration $\ell$ of the sta-
bilized cutting-plane algorithm. This suggests that the preprocessing algorithm
separates feasible and infeasible combinations with good accuracy.
The cutting-plane algorithm ran in 732 s and 37 iterations (28 infeasible
candidates). In average, the time of one iteration can be roughly decomposed in
12 s for solving the master program, 6 s for the twelve regular slave programs
(which could be reduced by parallelization), and 3 s for the peak slave program
and to compute the dominated infeasible combinations.
To evaluate the quality of our approximations, we computed the non-relaxed
operation cost of the optimal combination y ∗ when optimizing on the original
operation scheduling model PTK (using the heuristic in [3]) on the 12 regular
days of 2013 and on the 365 days of year 2013. By applying the discount rate,
we get the operation costs over the 20 years: 460,388 euros in the former case
and 460,247 in the latter case. This negligible deviation suggests that the chosen
number of representative days is enough. By adding the investment costs on the
12-day case, we get a lifetime cost of 598,748 euros. Compared to the final lower
bound $\underline{z} = 570{,}898$, it gives an optimality gap of 4.3%, which confirms that the
continuous relaxation Rd offers a good estimate of the operation cost.
Finally, we compared the operation of network FRD before and after resizing
the pumping station according to the optimal combination y ∗ . First, the number
of pumps is reduced from 6 to 5 pumps and the average reference power from 127

to 66 kW. The purchase cost of the new combination is then half the present
value of the installed combination. Second, the operation costs for year 2013,
when estimated with PTK , are reduced by 24.3%, a reduction that is mainly
driven by a better usage of the pumps (+18.9% in average efficiency), which shows a better adequacy to the demand.

6 Conclusion

In this paper, we tackled the water network design problem in the context of
networks equipped with pumping stations, where the operation costs must be
estimated dynamically in addition to the investment costs. We formulated differ-
ent approximations to estimate the operation costs to be embedded in a stabi-
lized Benders’ decomposition approach. We also ensured a certain robustness of
the solutions to stress operation conditions and derived dominance arguments to
accelerate the cutting-plane algorithm. Tested on a realistic instance, the approximations turned out to be accurate and the algorithm fast. While some fea-
tures of our implementation are specific to the FRD network, and more generally
to a class of branched network defined in [3], the method can be generalized to a
variety of water networks after identifying an accurate continuous relaxation for
the operation problem. The potential higher complexity and larger optimality
gap should however be evaluated in practice.

References
1. van Ackooij, W., Frangioni, A., de Oliveira, W.: Inexact stabilized benders’ decom-
position approaches with application to chance-constrained problems with finite
support. Comput. Optim. Appl. 65(3), 637–669 (2016)
2. Alperovits, E., Shamir, U.: Design of optimal water distribution systems. Water
Resour. Res. 13(6), 885–900 (1977)
3. Bonvin, G., Demassey, S., Le Pape, C., Maı̈zi, N., Mazauric, V., Samperio, A.: A
convex mathematical program for pump scheduling in a class of branched water
networks. Appl. Energy 185, 1702–1711 (2017)
4. Bragalli, C., D’Ambrosio, C., Lee, J., Lodi, A., Toth, P.: On the optimal design
of water distribution networks: a practical MINLP approach. Optim. Eng. 13(2),
219–246 (2012)
5. Bunn, S.: Ageing pump efficiency: the hidden cost thief? In: Distribution Systems
Symposium & Exposition (2009)
6. Burgschweiger, J., Gnädig, B., Steinbach, M.: Optimization models for operative
planning in drinking water networks. Optim. Eng. 10(1), 43–73 (2009)
7. D’Ambrosio, C., Lodi, A., Wiese, S., Bragalli, C.: Mathematical programming tech-
niques in water network optimization. Eur. J. Oper. Res. 243(3), 774–788 (2015)
8. KSB: Multitec : High-pressure pumps in ring-section design-booklet with perfor-
mance curves. https://shop.ksb.com/
9. Marques, J., Cunha, M., Savić, D.A.: Multi-objective optimization of water distri-
bution systems based on a real options approach. Environ. Model. Softw. 63, 1–13
(2015)

10. Murphy, L., Dandy, G., Simpson, A.: Optimum design and operation of pumped
water distribution systems. In: Conference on Hydraulics in Civil Engineering
(1994)
11. Perry, R., Green, D., Maloney, J.: Perry’s Chemical Engineers’ Handbook, 7th edn.
McGraw-Hill (1997)
12. Raghunathan, A.U.: Global optimization of nonlinear network design. SIAM J. Optim. 23(1), 268–295 (2013)
Engineering Systems
Application of PLS Technique to Optimization
of the Formulation of a Geo-Eco-Material

S. Imanzadeh1,2(&), Armelle Jarno2, and S. Taibi2


1
Normandie Univ., INSA Rouen Normandie, LMN, 76000 Rouen, France
saber.imanzadeh@insa-rouen.fr
2
Normandie Univ., UNIHAVRE, CNRS, LOMC, 76600 Le Havre, France
{jarnoa,Said.Taibi}@univ-lehavre.fr

Abstract. Earthen construction is one of the most familiar construction method


practiced since the ancient times. It has been successfully used around the world.
The raw earth material as a geo-eco-material is one of the most abundant, basic
building materials. It is low technology, straightforwardly worked with simple
tools. It requires very little energy to manufacture and has a low environmental impact, especially when the material is sourced on the construction site. Nowadays,
building construction with raw earth materials needs remarkable mechanical
performance. For this, a raw earth treatment with binders is one of the tech-
niques used to make better its strength and durability. This paper presents the
use of Design of Experiments through D-optimal mixture design as a tool to
optimize a raw earth concrete formulation to achieve a desirable compressive
strength. A multivariate statistical regression technique of PLS, Partial Least
Square projections to latent structures, was chosen to evaluate the design.
This PLS technique was selected because of the complicated experimental
design data along with different constraints on model. The obtained results
illustrate that PLS technique can be a useful tool to improve and optimize a raw
earth concrete formulation.

Keywords: Geo-eco-material  Unconfined compressive strength  Design of


experiment  PLS technique  Optimization  Response surface method

1 Introduction

Since the earliest civilizations settled on the continent, earth has been used as a major
building material. Raw earth material is attracting renewed interest thanks to its green
characteristics with very low energy for its transformation into building material
compared to conventional building materials like concrete. It is recyclable and it
requires only simple technology. Building with raw earth, which is a local material, is a
solution to avoid energy-intensive transport [1].
The principal properties of raw earth material are the mechanical strength, the
shrinkage and swelling, the hygrothermal properties and the cracking [2]. The raw material
alone is often not adequate to achieve the performance needed for a building material, and
different stabilizers are used to improve its properties [3]. Among them, when the mechanical
strength of a raw earth material is concerned, stabilizers like lime, gypsum and cement
can be used. Some studies are concerned with the influence of these binders on the raw
earth properties in order to produce an improved construction material [4]. As an
example, Zak et al. [1] studied the effect of reinforcement by natural fibers, gypsum and
cement on compressive strength of unfired earth bricks materials. They showed that the
mixing of earth material with gypsum has no favorable effect on the compressive
strength.
A new concrete made of raw earth was recently developed by a French firm from
Normandy called Cematerre, working with the University of Le Havre Normandie. Its
originality and advantage lie in its ability to be cast in place like a conventional concrete.
This raw earth concrete as an Eco-geo-material is composed of four components: silt,
lime, cement and water. The principal goal of this paper is to optimize the formulation
of raw earth concrete to improve the mechanical strength.
The Design of Experiments, DOE, can be a good approach to search for such a
formulation depending on the proportions of the mixture components [5]. To satisfy this goal,
combinations of four-constituent mixtures were formulated using the D-optimal mix-
ture design. The experimental domain was determined according to some constraints.
Several series of laboratory tests were performed to establish model formulations tar-
geting the sought compressive strength after 90-day curing-time. Thereafter, the
derived model was validated. A multivariate statistical regression method of PLS,
Partial Least Square projections to latent structures, was chosen to examine the design
[6, 7]. This PLS technique was chosen because of the complicated experimental design
data with different constraints on the model. The Partial Least Square projections on
latent structures were calculated to estimate the coefficients of the model. Finally, the
results were analyzed to improve and optimize the formulations of raw earth concrete
materials using the loading plot of the PLS technique and the response trace plot.

2 Design of Experiment with PLS Technique

Design of experiments, DOE, is a practical approach for exploring complex problems.


It is the design of any task that aims to describe and explain the variation of information
under conditions hypothesized to reflect that variation. The term is usually associated with
experiments in which the design introduces conditions that directly influence the variation [8].
Based on statistical analysis, it gives the maximum amount of relevant information with a
chosen number of experiments. Different kinds of DoE exist, such as full factorial,
fractional, composite and D-optimal designs [9].
Several classical statistical methods exist to examine the design of experiments.
An example is Multiple Linear Regression (MLR) [10], a commonly used regression
tool. As a predictive analysis, Multiple Linear Regression is applied to describe the
relationship between one continuous dependent variable and two or more independent
variables by fitting a linear equation to measured values. However, there are some
drawbacks with the MLR technique: (i) it has problems with missing data; if some data
element is lacking, that row and that column must be eliminated from the modeling;
(ii) it is not adapted to complicated experimental design data with different constraints
on the model. Taking into account these drawbacks, a multivariate
statistical regression method of PLS, Partial Least Square projections to latent struc-
tures, is a better choice than MLR technique.
The PLS technique was originally introduced in chemometrics and economics [11].
The pioneering work dealing with PLS was performed by Wold in the field of
econometrics [6]. It has been successfully employed in other scientific areas, such as
medicine, bioinformatics, computer vision and civil engineering. A PLS model represents
an approximation of an underlying complicated reality. The aim of the PLS technique is to
predict a set of response variables from a set of predictor variables through latent
variables, and it commonly proceeds through an iterative process. The PLS technique has two
basic advantages: it produces low-rank approximations of the data that are aligned
with the response, and it then uses low-rank approximations of both the input data and
the response data to estimate the final regression model. The PLS technique is divided
into two variants: linear and nonlinear PLS. A major limitation of linear PLS is that many
problems display nonlinear characteristics; in such cases the nonlinear PLS
is appropriate [6].
Associated with the PLS technique, tools such as scores and loadings can be
used to interpret the model. Within the PLS modelling framework, particular
attention is given to the plotting of model parameters like scores and loadings, since
such plots are constructive and useful for model verification and interpretation [12].
The analysis with the PLS technique provides several advantages [7]: (i) there is no
limitation from the number of experiments and the degrees of freedom of the model;
(ii) if there are several response variables, their covariance is taken into account in the model;
(iii) in addition to the results given by the effect or response surface analysis, the PLS
technique provides tools to detect outliers. As each experiment (except the center one)
is often not replicated, it is difficult to locate an outlier result with classical methods.
The PLS technique has been largely explained in the literature [13] and its use to
mixture data was detailed by Kettaneh-Wold [14].
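As a rough illustration of how such a PLS regression could be set up in practice (the study itself relies on the MODDE© software; the snippet below is only a minimal sketch with a made-up dataset and the scikit-learn PLSRegression class, not the authors' implementation):

# Minimal sketch of a PLS regression on mixture data (hypothetical example).
# The design matrix X contains the mixture proportions and their pairwise
# products (the quadratic terms of Eq. 1); y contains the measured UCS values.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X_prop = rng.dirichlet(alpha=[6, 1, 1, 2], size=21)      # silt, lime, cement, water (sums to 1)
y = 3.0 + 20.0 * X_prop[:, 2] + rng.normal(0, 0.1, 21)   # fake UCS response, for illustration only

# Expand to quadratic mixture terms: x_i and x_i * x_j (i <= j)
pairs = [(i, j) for i in range(4) for j in range(i, 4)]
X = np.column_stack([X_prop] + [X_prop[:, i] * X_prop[:, j] for i, j in pairs])

pls = PLSRegression(n_components=3)   # three PLS components, as in this study
pls.fit(X, y)
print("R2 on the training data:", pls.score(X, y))
print("Loadings (x-weights) shape:", pls.x_weights_.shape)

The fitted x-weights correspond to the quantities plotted in the loading plot discussed later.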

3 Eco-Geo-Materials

Eco-geo-materials analyzed herein are composed of four constituents: silt, lime, cement
and water. The soil material used for the raw earth concrete is natural silt. Silt is a granular
material of a size between sand and clay, whose mineral origin is quartz and feldspar
[15]. It was selected because it is a locally available material, abundant on the con-
struction site. For this natural silt, the effective diameter, the Hazen uniformity factor
and the curvature factor are respectively equal to 32 µm, 4.37 and 0.94. The Atterberg
limits are respectively 20% for the liquid limit and 6% for the plasticity index [16].
Lime and Portland cement are used as the two binders to produce the raw earth concrete. The
lime used comes from the Proviacal® DD range. The cement used is CEM I 52.5 N,
conforming to the NF EN197-1, NF P15-318 and NF EN196-10 standards. It is a
Portland cement, composed of 95 to 100% of clinker, with an unconfined compressive
strength (UCS) determined at 28 days of 52.5 MPa (lowest value), with an ordinary
short-term strength class (2 or 7 days). More information about these two binders’
properties is given in detail in Eid 2017 [17]. For preparing the raw earth concrete,
potable laboratory tap water was used.

The mixing procedure was performed in a laboratory mixer with a capacity of four
liters. Then, the molds of 100 mm in height and 50 mm in diameter were filled by
vibration for two minutes with a vibrating table. Thereafter, the specimens were stored
for a 90-day curing time in a controlled laboratory environment. Laboratory-prepared raw
earth concrete specimens with different mix proportions were examined by conducting
Unconfined Compressive Strength (UCS) tests. The specimens were sheared along an
unconfined compressive strength path according to the NF P94-420 and NF P94-425
French standards. The unconfined compressive strength test is performed to obtain the
stress-strain curve.

4 Raw Earth Concrete Mixture Design

A quadratic polynomial model was chosen for the four constituents to obtain the model that
best fits the experimental measurements and then to predict the unconfined compressive
strength (UCS) of a raw earth concrete for any mixture of constituents. The quadratic
model considers the binary interaction influence for all possible pairs of components (Eq. 1).

UCS = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i,j=1}^{n} a_{ij} x_i x_j \quad (\text{Quadratic model}) \qquad (1)

where UCS is the response, in other words, the unconfined compressive strength of a
raw earth concrete in MPa. In Eq. 1, a0 is a constant, n is the number of constituents, xi
and xj are the quantities of the constituents in weight percent, ai are the regression
coefficients for the linear terms, and aij are the regression coefficients for the binary
interaction parameters.
The Analysis Of Variance (ANOVA) involves several diagnostic tools to confirm
the validity of the models [18, 19]. The first two diagnostic tools are the coefficients
R² and R²adj, giving information on the ability of the model to fit the
measured values. Furthermore, the Q² coefficient evaluates the model validity, that is,
its ability to predict new data. Afterwards, the first and second F-Tests should
be evaluated. The first F-Test is the regression model significance test: it compares
the regression variance to the residual variance. For the second F-Test, the residual error
is split into two parts: the lack of fit due to imperfection of the model, and the pure error
estimated from the replicate data [20].
Three constraints must be considered:
(1) Fundamental constraint where the sum of the components of the mixture is equal
to 100% in weight for all the mixes of the design,
(2) Economic and ecological mixture constraints: the raw earth concrete should be
designed to offer the convenient mechanical properties to be used as a con-
struction building material. In addition, it should be non-energy-intensive. For this
reason, the amount of these two binders, cement and lime, should be limited.
Therefore, the maximum amounts of cement and lime are limited to 16% and 12%,
respectively, and a condition on the maximum binder proportion was fixed: Cement
% + Lime % < 16%. In agreement with the above-mentioned constraints, the mixing
range selected for each of the constituents is presented in Table 1.
(3) Workability constraint: workability has an important effect on the mechanical
properties. Properties like cohesion, plasticity and consistency can affect the
workability of a raw earth concrete. In this research paper, an S3 level of consis-
tency, calibrated by the standard slump test, was used to ensure a fluidity close to
that of a very plastic concrete, in conformity with the standard NF EN 206-1.

Table 1. Mixing range


xi Lower limit (%) Upper limit (%)
x1: Silt 42 76
x2: Lime 0 12
x3: Cement 4 16
x4: Water 20 30

In this study, the experimental region is constrained by the three above-mentioned
conditions to an irregular 3D polyhedron. Thereafter, a computer-aided D-optimal
design is adopted to create the set of experiments [5]. The 21 formulations
(Table 2) were performed in the random run order proposed by the MODDE© software,
with three repetitions including three center points.
Once tests were completed, a multivariate statistical regression method of PLS,
Partial Least Square projections to latent structures, was applied to examine the design.
The PLS technique was appropriate because of the complicated experimental design data
along with different constraints on the model. The Partial Least Square projections on
latent structures were calculated to determine the coefficients of the model.

5 Model Validation

The quadratic polynomial model was first fitted to the measured experimental values, followed
by a thorough examination of R², R²adj and Q² to find the most adequate model representing
the measured experimental data. The values of R², R²adj and Q² are respectively 0.979,
0.976 and 0.978 for the 90-day curing time. For this curing time, the values of R² and R²adj
can be considered good. In addition, this model shows a very good predictive rele-
vance (Q² > 0.9). Furthermore, the chosen model passed the first and second F-Tests.
Thus, the selected model can be accepted as valid. Once the best-fitting model was chosen
for this curing time, an equation for the prediction of the Unconfined Compressive Strength
(UCS) was obtained for raw earth concrete formulations. The regression coefficients of the
model are shown in Table 3.

Table 2. Experimental design of the D-optimal design for a raw earth concrete (xi is the
proportion of the different mixture constituents)
Formulation Silt (x1) Lime (x2) Cement (x3) Water (x4)
1 0.7283 0.0000 0.0400 0.2317
2 0.5784 0.1200 0.0400 0.2616
3 0.5784 0.1200 0.0400 0.2616
4 0.6088 0.0000 0.1600 0.2312
5 0.6088 0.0000 0.1600 0.2312
6 0.7381 0.0000 0.0400 0.2219
7 0.7381 0.0000 0.0400 0.2219
8 0.5882 0.1200 0.0400 0.2518
9 0.6185 0.0000 0.1600 0.2215
10 0.6185 0.0000 0.1600 0.2215
11 0.6784 0.0400 0.0400 0.2416
12 0.6284 0.0800 0.0400 0.2516
13 0.6382 0.0800 0.0400 0.2418
14 0.5886 0.0800 0.0800 0.2514
15 0.5987 0.0400 0.1200 0.2413
16 0.5983 0.0800 0.0800 0.2417
17 0.6084 0.0400 0.1200 0.2316
18 0.6483 0.0400 0.0800 0.2317
19 0.6434 0.0400 0.0800 0.2366
20 0.6434 0.0400 0.0800 0.2366
21 0.6434 0.0400 0.0800 0.2366

Table 3. Model coefficients for the unconfined compressive strength of raw earth concrete
Coefficients Quadratic model (curing time: 90-day)
a0 18.19
a1 −11.52
a2 −286.01
a3 326.28
a4 6.38
a12 210.62
a13 −106.54
a14 −84.12
a23 −355.60
a24 660.80
a34 −765.81
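To make the use of Eq. 1 with the coefficients of Table 3 more concrete, a small sketch of a UCS evaluation is given below. It assumes that the x_i are the mass fractions listed in Table 2 (the text expresses them in weight percent, so the units are an interpretation); this is only an illustrative reading of the published model, not the authors' code.

# Sketch: evaluating the quadratic mixture model of Eq. 1 with the 90-day
# coefficients of Table 3. The x_i are taken here as the mass fractions of
# Table 2 (silt, lime, cement, water); this unit reading is an assumption.
import numpy as np

a0 = 18.19
a_lin = np.array([-11.52, -286.01, 326.28, 6.38])           # a1..a4
a_int = {(0, 1): 210.62, (0, 2): -106.54, (0, 3): -84.12,    # a12, a13, a14
         (1, 2): -355.60, (1, 3): 660.80, (2, 3): -765.81}   # a23, a24, a34

def ucs(x):
    """Predicted unconfined compressive strength (MPa) for a mixture x."""
    x = np.asarray(x, dtype=float)
    value = a0 + a_lin @ x
    for (i, j), aij in a_int.items():
        value += aij * x[i] * x[j]
    return value

# Center point of the design (formulation 21 in Table 2): silt, lime, cement, water
print(round(ucs([0.6434, 0.04, 0.08, 0.2366]), 2))  # about 4.7 MPa, close to the 4.75 MPa target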

6 Results and Discussion


6.1 Loading Plot of PLS Technique
Scores and loading plots are derived from the model fitted with PLS regression.
Three PLS components were used in this study. In the following, the loading plot was
used to study the influence of the different terms on UCS response. The plot of X
(constituents) and Y (response) weights (w and c, respectively) illustrates how the X
variable influences the Y variable, and the correlation structure between X and Y
variables. It gives a representation of how the X and Y variables combine in the
projections, and finally informs on the relation between the X and Y variables.
This plot simplifies the analysis of how the response varies in relation to con-
stituents and the existing interactions between them. Furthermore, this plot gives a
direct indication of which constituent or interaction between constituents contributes
significantly to the unconfined compressive strength response.
Let us interpret how the four constituents affect UCS response. First, a base line is
drawn between UCS and the plot origin (0, 0). Second, all constituents and the
interactions between constituents are projected orthogonally onto this line. The relative
position of the projections to the origin informs on the significance of the different
terms in the model and on their positive or negative correlation with the studied
response. The more the orthogonal projection of the terms deviates from the origin (0,
0), the more they influence the unconfined compressive strength. Figure 1 illustrates
both the constituent weights (w) and the response weight (c) for the first and second
significant PLS components (wc[1], wc[2]). One sees how the constituents relate to
the response. This figure shows that the cement, silt-lime and silt-water interaction terms are
positively related to UCS since they are projected in the same quadrant as UCS.

Fig. 1. Interpretation of the unconfined compressive strength response, UCS, by a PLS loading plot

It is found that the distance of the orthogonal projection to the origin for the cement constituent is
longer than for these interaction terms. Cement is clearly the main factor
that contributes to increasing UCS. Indeed, when the amount of cement increases, the
unconfined compressive strength also increases. The positive effect of the silt-lime
interaction on UCS is slightly stronger than that of the silt-water interaction. On the contrary,
lime and water constituents and both lime-cement and cement-water interactions act
with a negative impact on the unconfined compressive strength. Finally, the silt component
and the lime-water and silt-cement interactions have little impact on UCS. They can be
neglected because their values of Variable Importance for Projection
(VIP) are less than 0.8 [6].

6.2 Response Surface Plot of UCS-Optimization


In this section, the goal is to find the best formulation of the material fulfilling a
threshold of acceptable unconfined compressive strength (UCS) around 4.75 MPa for
90-day curing time, better addressing ecologic and economic constraints. Response
surface plots and projections of response surface plots are used for this phase of
optimization. It was first decided to reinforce the economic constraint by reducing the
cement and lime proportions until the condition on UCS was fulfilled. The response surface
plot for 8% of cement is illustrated in Fig. 2. As can be noted, the response domain is
limited because of the applied constraints (see Sect. 4). Iso-values of the unconfined
compressive strength vary from 4.5 to 5 MPa. Thus, different mixtures can meet the
resistance criterion within a few percent for a threshold UCS equal to
4.75 ± 0.25 MPa.

Fig. 2. Response surface contour plot of unconfined compressive strength for 90-day curing
time (cement = 8%)

As shown in Fig. 2, all the mixtures available in the constrained experimental
domain satisfy this requirement. It is then possible to optimize the eco-geo-material
formulation based on the surface plots. Minimizing the binder proportion (lime + cement)
is one objective in the search for eco-friendly materials. In addition, another important
criterion is the minimization of the water content to reduce shrinkage risks. If the
formulations of materials A and B spotted in Fig. 2 are compared, it can be seen that
for formulation B the binder content is significantly reduced, by about 17%, with a
decrease of the water content of about 5% and an increase of the silt content of
about 5% (Table 4).
Consequently, the best formulation available in this constrained experimental
domain, according to the defined criteria, is formulation C, corresponding to zero
lime, with binder and water amounts reduced by about 33.3% and 7%, respectively,
and with an increase of the silt content of about 9% (Table 4).

Table 4. Three formulations of raw earth concrete for 8% of cement proportion

Formulation  Water (%)  Silt (%)  Lime (%)  Binders (Lime + Cement) (%)
A            24         64        4         12
B            23         67        2         10
C            22.3       69.8      0         8

7 Conclusions

In this research study, it was shown that the Design of Experiments method is an opti-
mization tool adapted to examining the unconfined compressive strength of a raw earth
material. The design tests were determined by considering three different constraints:
ecological, economic and workability. A multivariate statistical regression technique,
PLS, was selected to evaluate the design. This PLS technique was applied because of
the complicated experimental design data together with different constraints on the model.
Thanks to the loading plot of the PLS technique, the negative or positive roles of each
constituent and of the existing interactions between constituents were pointed out. For
example, the model showed the unfavorable role of lime on the UCS response; on the
other hand, the favorable role of the silt-lime interaction on the UCS response was
illustrated. This could be explained by the fact that, when the percentage of lime exceeds
a certain threshold, it has an insignificant effect on UCS and behaves like an addition of
fine particles in the raw earth material, which reduces UCS. However, the silt-lime interaction
effect on UCS is favorable because the lime reacts in two steps: first, the cationic
exchange within the silt particles changes the granulometry of the silt, and thereafter the
pozzolanic reactions produce a cementation between the silt particles, thus increasing the
UCS. The eco-geo-material formulation was optimized based on the surface plots. In the
presence of complicated experimental design data with different constraints on the model,
PLS is a good alternative to the more classical multiple linear regression for predicting
a suitable mixture composition.

References
1. Zak, P., Ashour, T., Korjenic, A., Korjenic, S., Wu, W.: The influence of natural
reinforcement fibers, gypsum and cement on compressive strength of earth bricks materials.
Constr. Build. Mater. 106, 179–188 (2016)
2. Ashour, T., Korjenic, A., Korjenic, S., Wu, W.: Thermal conductivity of unfired earth bricks
reinforced by agricultural wastes with cement and gypsum. Energy Build. 104, 139–146
(2015)
3. Carmen Jimenez Delgado, M., Canas Guerrero, I.: Earth building in Spain. Constr. Build.
Mater. 20, 679–690 (2006)
4. Al-Mukhtar, M., Lasledj, A., Alcover, J.F.: Lime consumption of different clayey soils.
Appl. Clay Sci. 95, 133–145 (2014)
5. Herrero, A., Ortiz, M.C., Sarabia, L.A.: D-optimal experimental design coupled with parallel
factor analysis 2 decomposition a tool in determination of triazines in oranges by
programmed temperature vaporization-gas chromatography-mass spectrometry when using
dispersive-solid phase extraction. J. Chromatogr. A 1288, 111–126 (2013)
6. Wold, S., Kettaneh-Wold, N., Skagerberg, B.: Nonlinear PLS modelling. Chemometr. Intell.
Lab. Syst. 7, 53–65 (1989)
7. Eriksson, L., Byrne, T., Johansson, E., Trygg, J., Vikstrom, C.: Multi - and megavariate data
analysis, basic principal and applications, 3rd edn. Umetrics Academy (2013)
8. Myers, R.H., Montgomery, D.C., Anderson-Cook, C.M.: Response surface methodology:
process and product optimization using designed experiments, 4th edn. Wiley, New York
(2016)
9. Eriksson, L., Johansson, E., Kettaneh-Wold, N., Wikström, C., Wold, S.: Design of
Experiments: Principles and Applications. Umetrics AB, Umeå Learnways AB, Stockholm
(2000)
10. Geladi, P., Kowalski, B.R.: Partial least-squares regression: a tutorial. Anal. Chim. Acta 185,
1–17 (1986)
11. Wold, H.: Path Models with Latent Variables: The NIPALS Approach. Academic Press,
New York (1975)
12. Hoskuldsson, A.: Prediction Methods in Science and Technology. Thor Publishing,
Denmark (1996)
13. Li, B., Morris, A.J., Martin, E.B.: Generalized partial least squares regression based on the
penalized minimum norm projection. Chemom. Intell. Lab. Syst. 72, 21–26 (2004)
14. Kettaneh-Wold, N.: Analysis of mixture data with partial least squares. Chemometr. Intell.
Lab. Syst. 14, 57–69 (1992)
15. Assallay, A.M., Rogers, C.F., Smalley, I.J., Jefferson, I.F.: Silt. Earth-Sci. Rev. 45, 20–30
(1998)
16. Imanzadeh, S., Hibouche, A., Jarno, A., Taibi, S.: Formulating and optimizing the
compressive strength of a raw earth concrete by mixture design. Constr. Build. Mater. 163,
149–159 (2018)
17. Eid, J.: New construction material based on raw earth: cracking mechanisms, corrosion
phenomena and physico-chemical interactions. Eur. J. Environ. Civ. Eng. 8189, 1–16 (2017)
18. Box, G.E., Stuart Hunter, J., Hunter, W.G.: Statistics for experimenters: design, innovation,
and discovery, 2nd edn. Wiley (2005)
19. Fisher, R.A.: The Design of Experiments. Hafner, Libraries Australia, New York (1971)
20. Goupy, J.: Plans d’expériences : les mélanges. DUNOD, Paris (2001)
Databases Coupling for Morphed-Mesh
Simulations and Application on Fan
Optimal Design

Zebin Zhang1(B), Martin Buisson2, Pascal Ferrand2, and Manuel Henner3

1 Zhengzhou University, Zhengzhou 450001, China
zebin.zhang@zzu.edu.cn
2 Ecole Centrale de Lyon, 69134 Ecully, France
{martin.buisson,pascal.ferrand}@ec-lyon.fr
3 Valeo Thermal Systems, 78320 La Verrière, France
manuel.henner@valeo.com

Abstract. Aerodynamic databases collected either by experimental or numerical
approaches are relatively “local” in a large-scale design space, surrounding the
reference configurations or operating conditions. However, the exploration of the
design space requires knowledge of the “dark” space where little data is available.
Therefore, the coupling of “remote” databases is necessary. Databases were generated
by performing CFD (Computational Fluid Dynamics) simulations with meshes morphed
from different geometrical configurations. Then an ordinary least square method was
used to obtain derivatives from the databases. A direct co-Kriging method was used to
interpolate those derivative-integrated databases. Derivability studies were carried out
on the two main sub-models, the regression model and the correlation model, and
appropriate models were proposed for each. Referring to 2 geometries and 2 operating
conditions, 4 second-order integrated databases were generated for an automotive
engine cooling fan. Progressive database coupling shows the advantage of the proposed
approach. Optimizations have been performed to improve the fan performance at
different operating conditions.

Keywords: Database coupling · Co-Kriging · Optimal design · CFD

1 Introduction
With the improvement of measurement facilities such as high-performance
computing, data collected becomes more and more complex and informative,
characterized by increasing dimensionality and larger sample size [1,5], seriously
challenging our ability to keep pace with the need to precisely model the systems
we seek to design.
In design optimization, it is common practice to extract design infor-
mation through a Design of Experiments (DoE) based modeling process [4,14].
Supported by NSFC No.51575498.

However, the number of samples grows dramatically with the dimensions and
the size of the design space. Usually it is too computationally costly to obtain
enough information for the entire design space. Consequently, optimizations are
restricted to the neighborhoods of some reference configurations or operating
conditions. Optimization results are often incrementally “improved” designs
instead of the global optima. Nevertheless, data used for one optimization can be
relevant for another, and both can be used for the exploration of the design
space [18].
Inspired by the idea of data clustering, this work draws attention to the
coupling of existing databases which have been used for different optimizations.
Links between these disperse data can be approximated, which helps to capture
the global properties of the design space. Unlike the DoE-based modeling
approaches, only a few databases are used, each of which relies on either
a new geometrical configuration or a new operating condition. The coupling of
databases is performed in 2 steps:
(1) Data smoothing: Data is collected from a number of discrete measurements,
which are not arbitrarily distributed but roughly centered at a reference
configuration in the design space. A smoothing process is applied to extract
information from the database, so that continuous extrapolation can be
performed away from the reference configuration;
(2) Database coupling: Extrapolation from one single database suffers from
truncation error, so the region effectively covered by one database is limited to a
certain radius. For a large-scale design space with disperse reference configu-
rations, the coupling of several databases is necessary.
The first operation basically creates a “local” model with a limited acting range; a
mesh-morphing technique is employed for this. The second step makes use of the
“local” models to establish a global one by using the direct co-Kriging method.
This paper begins with a description of the generation of the databases by using
the mesh-morphing technique; then the direct co-Kriging method is used to couple the
databases. This methodology is employed for the aerodynamic shape optimiza-
databases. This methodology is employed for the aerodynamic shape optimiza-
tion of an automotive engine cooling fan that has been optimized in a previous
study [22].

2 Database Generation
For simulation-based aerodynamic shape optimization, design parameters can
be categorized into geometrical parameters and physics parameters. Variation of
a geometrical parameter results in different geometrical configurations, which
requires a remeshing process. Variation of a physics parameter results in different
operating conditions, which can simply be handled through modification of the
boundary conditions.
A mesh deformation method is applied for the variation of geometrical param-
eter. It relies on a morphing technique which calculates the mesh (nodes) dis-
placements with a Radial Basis Function (RBF) approach [15,16]. The mesh
deformation is an RBF-type propagation from the displacements of some user-
defined control points, which are usually positioned on the boundaries of the compu-
tational domain. Such a deformation is illustrated on a 2-D surface mesh in
Fig. 1a, where the dots refer to the control points. Displacements are attributed
to the inner control points, while the outer ones are deliberately fixed. The displace-
ments of the mesh nodes are calculated following a radial basis function defined a
priori. This technique is employed to obtain the deformed mesh of an automotive
engine cooling fan blade (Fig. 1b).

Fig. 1. (a). Mesh morphing driven by control points (Ref. [16]) (b). An illustration of
surface mesh deformation
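For readers unfamiliar with the RBF propagation step, a minimal sketch is given below. It uses SciPy's RBFInterpolator on a made-up set of control points; the actual morphing tool, control-point layout and kernel used by the authors are not specified here, so everything in the snippet is illustrative only.

# Minimal sketch of RBF-based mesh morphing (illustrative, not the authors' tool).
# Known displacements are prescribed at a few control points; the RBF interpolant
# then propagates these displacements to all mesh nodes.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(1)
control_pts = rng.uniform(0.0, 1.0, size=(12, 2))        # hypothetical control points (2-D)
control_disp = np.zeros((12, 2))
control_disp[:4] = [0.05, 0.0]                            # inner points moved, outer ones fixed

# One vector-valued interpolant for both displacement components (thin-plate-spline kernel here)
morph = RBFInterpolator(control_pts, control_disp, kernel="thin_plate_spline")

mesh_nodes = rng.uniform(0.0, 1.0, size=(500, 2))         # stand-in for the surface mesh nodes
deformed_nodes = mesh_nodes + morph(mesh_nodes)           # propagate displacements to all nodes
print(deformed_nodes.shape)                               # (500, 2)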

One single reference mesh is used for one geometrical configuration. Deriva-
tives can be deduced from the results of some new CFD simulations based on
deformed meshes. Compared with the classical parametrization method, which
parametrizes the geometrical configuration directly and is followed by a re-meshing
process for each new configuration, the mesh deformation approach deals with
one single reference mesh, which more likely conserves the total number of mesh
elements and ensures a similar discretization of the computational domains.
Figure 1b compares the original surface mesh with a deformed mesh following
a variation of sweep angle. The variation of parameters follows a selected finite
difference scheme as shown, where the round point in the center represents the
reference mesh, and the diamond-shaped points represent the deformed meshes.
CFD simulations are performed based on the morphed meshes and modified
boundary conditions. Simulation results are used to calculate the derivatives.
The derivative calculation follows the same ordinary least square method
as used in Ref. [22]: first, the data are normalized and the variations are calculated;
second, a second-order polynomial regression is applied to calculate the first-order
and second-order derivatives; finally, not only the simulation results, but also the
previously calculated derivatives are used to calculate the cross derivatives. By
doing this, data used to calculate the derivatives are reused to calculate the cross
derivatives which can improve the accuracy. Notice that points are allocated in
this study with a given finite difference scheme, which is not obligatory for the
calculation of derivatives. The same regression method can be applied for a DoE
style distribution.
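A minimal sketch of how first- and second-order derivatives can be recovered from such evaluations by ordinary least squares is given below; it uses a one-parameter toy case with a hypothetical objective, not the multi-parameter scheme or data of the paper.

# Sketch: estimating first- and second-order derivatives at a reference point by
# fitting a second-order polynomial (ordinary least squares) to nearby evaluations.
import numpy as np

p_ref = 0.0
dp = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])        # parameter variations around the reference
f = np.cos(p_ref + dp)                            # stand-in for CFD results (hypothetical objective)

# Regression matrix for f(p_ref + dp) ~ f0 + f1*dp + 0.5*f2*dp^2
A = np.column_stack([np.ones_like(dp), dp, 0.5 * dp**2])
coef, *_ = np.linalg.lstsq(A, f, rcond=None)
f0, f1, f2 = coef
print(f1, f2)   # close to -sin(0) = 0 and -cos(0) = -1 for this toy objective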
Once the derivative database is computed, a model can be built for each
objective using a multi-parameter high-order Taylor-series expansion:

F(P + \Delta P) = F(P) + F^{(1)} \Delta P + \dots + \frac{F^{(n)}}{n!} \Delta P^{n} + o(\Delta P^{n+1}) \qquad (1)
where F(P + ΔP) is the objective reconstructed in terms of a variation ΔP of
the parameter P. In Eq. (1), the truncation error is of the magnitude of ΔP^(n+1).
In the previous work [22], one simple polynomial model is used directly for the
optimization. In this work, several models are created and coupled by using the
co-Kriging method presented in the next section.

3 Design Space Exploration with Assistant Information

Much work has been done to improve the model reliability, either from local or
global points of view [3,10]. These research efforts follow two main paths. First, by
applying different training strategies, one tries to improve the local precision of
the model, especially in the neighborhoods of the optima, the so-called “design space
exploitation”. This is particularly useful, and at times indispensable, for large-
scale optimization. Second, one introduces assistant information such as deriva-
tives into the original dataset which serves to build the surrogate model. The
most typical development concerns the integration of the adjoint method and
the gradient-enhanced co-Kriging model in the aerodynamic shape optimiza-
tion [6]. One remarkable advantage of the adjoint method lies in its independence
from the number of design variables, making this approach an appropriate choice for
high-dimensional problems. However, the gradients obtained from this method
depend on the given objective functions because reverse-mode differentiation is used;
consequently, adding new objectives requires a new adjoint-state evaluation.
Furthermore, for the sake of the reliability of the surrogate model, although the
gradients can improve the model accuracy, they are not sufficient for modeling
multimodal objective functions with a limited number of evaluations.
High-order derivatives, if applicable, should naturally be considered [9,17]. The
databases used in this study contain derivatives up to second order for the pur-
pose of illustration [21]; databases with higher order are also applicable.
The Kriging method models an unknown function by a stochastic process [7,
8,13], represented by a mean function and a covariance function. It is often the
best linear unbiased prediction from some given evaluations, i.e., the prediction
error is minimized. Two sub-models are needed for this stochastic process:

(1) A regression model which can be considered as the mean function of all the
possible prediction functions subject to the existing evaluations;
(2) A correlation model which reflects the spatial correlations between the points
of the design site. The Gaussian model is the most frequently used; it
assumes a priori the weak-sense stationarity (WSS) of the design site.
Adapted models should be applied to realize the interpolation of derivative-integrated
data. Both sub-models are analyzed and appropriate models are selected respectively.

3.1 Regression Model

For the regression model, many works presume a constant average of the Kriging
process [12,20]. For a co-Kriging process [9,19], the differentiation of the regres-
sion model is required; once differentiated, a constant regression model cannot
reflect the differences of the high-order terms at different design points. Zhao has
assessed different regression models such as the Hermite polynomial, the trigono-
metric function and the exponential function [24]. According to his
research, the polynomial function is recommended for universal Kriging and
is therefore taken for this study. The regression matrix for a co-Kriging model
consists of a regression part and a differentiated part, which takes the following
form: F(x) = [F0 d(F) d2(F)]^T, where d(F) and d2(F) are respectively the
first-order and second-order derivatives of the regression matrix F. If a higher-order
database is concerned, the corresponding d3(F), ..., dn(F) should be added after
d2(F). Clearly from this representation, if the regression model is of order less
than that of the given database, the higher-order part of the regression matrix
will all be zeros. In this case, the “enhancing” effect of the high-order deriva-
tives will not be fully extracted. As the database is integrated with second-order
derivatives, a second-order polynomial function is chosen for this study.
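As a rough illustration of the structure F(x) = [F0 d(F) d2(F)]^T, the sketch below assembles such a derivative-augmented regression matrix for a 1-D second-order polynomial basis; the basis and sample points are hypothetical and chosen only to show the stacking.

# Sketch: assembling a derivative-augmented regression matrix F = [F0; dF; d2F]
# for a 1-D second-order polynomial basis h(x) = [1, x, x^2] (illustrative only).
import numpy as np

def basis_blocks(x):
    x = np.asarray(x, dtype=float)
    F0  = np.column_stack([np.ones_like(x), x, x**2])                                  # basis values
    dF  = np.column_stack([np.zeros_like(x), np.ones_like(x), 2 * x])                  # first derivatives
    d2F = np.column_stack([np.zeros_like(x), np.zeros_like(x), 2 * np.ones_like(x)])   # second derivatives
    return np.vstack([F0, dF, d2F])     # stacked as in F(x) = [F0, d(F), d2(F)]^T

print(basis_blocks([0.0, 0.5, 1.0]).shape)   # (9, 3): three blocks of three samples, three basis functions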

3.2 Correlation Model

The Kriging method assumes that the regression function is a mean path of a ran-
dom process; a careful choice of the law of probability for this process is essential.
Generally, the laws of probability are unknown for an n-dimensional design space.
In order to characterize the covariogram of the random field, it is necessary to
estimate it from the existing samples following a given correlation model.
Commonly used correlation models are the exponential model and the Gaussian
model, which assume an a priori known covariogram style. The interpolation
properties primarily depend on the local behavior of the random field. Near the
origin, the exponential model behaves linearly and the Gaussian model shows
a quadratic behavior. For the sake of derivability, the latter is chosen for the current
study; the correlation between 2 samples can be expressed as:
R(\theta, s_j^k, s_j^l) = \prod_{j=1}^{n} \exp(-\theta_j d_j^2) \qquad (2)

where d_j = s_j^k − s_j^l is the distance between the 2 samples s^k and s^l in the j-th
dimension. Here j ∈ [1, n] denotes the dimension index, and k, l ∈ [1, m] are the
indices of the reference points.

The correlation model can be established by evaluating a hyper-parameter,
namely θ [11], which can be obtained by minimizing the following function:
\varphi(\sigma, R) = \det(R)^{1/m} \, \sigma(\theta)^2 \qquad (3)

where m is the number of points, σ is the standard deviation of the stochastic
process of the samples, and R is the correlation matrix. A genetic algorithm has been
used to find the single-objective optimum of this function.
The correlation between any reference point s and all the other reference points
is given by:

\mathrm{cov}(s_i, s_j^k) = \sigma^2 R(s_i, s_j^k) \qquad (4)

where i, j ∈ [1, n] denote the dimension indices and σ² is the variance of the sample.
An example of the correlation between the first-order derivative of the i-th dimen-
sion for the k-th point and the first-order derivative of the j-th dimension for the
l-th point is given by:

\mathrm{cov}\left(\frac{\partial S}{\partial s_i^k}, \frac{\partial S}{\partial s_j^l}\right) = \sigma^2 \, \frac{\partial^2 R(s_i, s_j)}{\partial s_i^k \, \partial s_j^l} \qquad (5)
Once the regression matrix F and the correlation matrix R are calculated,
the co-Kriging model ŷ(x) can be deduced:

ŷ(x) = f (x)T β + r(x)T R−1 (Y − F β), (6)

where r(x) is the correlation vector between any design to be predicted and
the existing reference points, Y is the response matrix, and the regression coefficient
vector β can be calculated by a least squares estimation, which gives:

β ∗ = (F T R−1 F )−1 F T R−1 Y (7)

The variance σ² is given by:

\sigma^2 = \frac{1}{n} (Y - F\beta)^T R^{-1} (Y - F\beta), \qquad (8)
All the objective functions can be modeled by this approach. Model functions can
be exploited by an optimization algorithm to find the optima. This methodology
is applied to the optimal design of an automotive engine cooling fan.
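As a rough orientation, the sketch below assembles the main ingredients of Eqs. (2) and (6)-(8) for an ordinary (gradient-free) Kriging predictor on a toy 1-D dataset. The derivative-augmented (co-Kriging) blocks and the hyper-parameter search of Eq. (3) are omitted for brevity, so this is only a simplified illustration of the formulas, not the authors' implementation.

# Simplified Kriging predictor following Eqs. (2), (6)-(8): constant regression,
# Gaussian correlation, fixed hyper-parameter theta (no MLE/GA search here).
import numpy as np

def corr(a, b, theta):
    # Gaussian correlation matrix between sample sets a (m x n) and b (p x n), Eq. (2)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2 * theta).sum(axis=2)
    return np.exp(-d2)

theta = np.array([10.0])
S = np.linspace(0.0, 1.0, 6).reshape(-1, 1)        # design sites (toy 1-D data)
Y = np.sin(4.0 * S[:, 0])                          # observed responses (hypothetical)

F = np.ones((len(S), 1))                           # constant regression matrix
R = corr(S, S, theta) + 1e-10 * np.eye(len(S))     # small nugget for conditioning
Ri = np.linalg.inv(R)

beta = np.linalg.solve(F.T @ Ri @ F, F.T @ Ri @ Y)            # Eq. (7)
sigma2 = (Y - F @ beta) @ Ri @ (Y - F @ beta) / len(S)        # Eq. (8)
print("process variance (Eq. 8):", sigma2)

def predict(x):
    r = corr(np.atleast_2d(x), S, theta)                      # correlation vector r(x)
    return beta[0] + (r @ Ri @ (Y - F @ beta))[0]             # Eq. (6) with constant regression

print(predict([0.35]), np.sin(4 * 0.35))   # prediction vs. true value of the toy function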

4 Multi-condition Optimal Design of Engine Cooling Fan


Fan systems are used in automotive engine cooling modules to increase the air flow
rate through the stack of different heat exchangers. The operating characteris-
tics of a fan system can be determined by the nominal operating point and
the zero-pressure-rise point (transparent point). One major task of fan system
development is to improve the efficiency at the nominal operating point, and to
“push” the transparent point as far as possible to a higher flow rate, so that the
fan will not generate additional drag (negative Δp) when a vehicle is running at
high speed.
The pressure rise Δp from upstream to downstream of the fan wheel, the
torque T acting on the fan and the aerodynamic efficiency η are considered
as objectives to be optimized. The first study has been performed with only
one database [22]. With one database, the covered design space is limited. The
complexity of the characteristic curves and the transparent point prediction drive
us to couple several databases.
The database coupling will exploit the most influential parameters; it will also
show the capacity of exploration for a dimension with only one single reference value,
which is not possible for a classical Gaussian-process-based co-Kriging method.
For the sake of database coupling, where only a few databases are available, an
adaptation of the co-Kriging method has been studied and implemented [23].
The stagger angles at mid-span and at the tip of the fan blade, and the air flow rate,
were taken as parameters to be exploited. Having 2 geometrical configurations, functioning
at 2 operating points, results in 4 databases. The coupling of all the databases
allows the performance to be evaluated at different flow rates.

(a) Schema of database coupling (b) Pareto Front

Fig. 2. Schema of database coupling and optimization results presented on the Pareto
front

Figure 2a illustrates the schema of the coupling of the different databases, where
A, B, E and F are polynomial models obtained using the Taylor series expansion of Eq. (1),
represented by 4 circles. Model C couples the 2 databases corresponding to model
A and model B, represented by the lower rectangle. Model G couples the 2
databases corresponding to model E and model F, represented by the upper
rectangle. Finally, model D is built by coupling all 4 databases, represented
by the large rectangle. The reference values are 2300 m3/h and 2600 m3/h for
the flow rate and 0° and 3° for the stagger angle at mid-span; the stagger angle
at the tip takes its reference relative value 0°, and a third axis can be imagined
perpendicular to the two axes shown. Although it has only one reference value,
the dimension of the tip stagger angle γt is explored.
Model D, which couples 4 different databases, allows us to obtain more reli-
able results on a larger range of parameter variation. Hence this model is used
to exploit the optima according to different criteria.
Based on model D, a multi-objective optimization has been performed to
pursue an optimal design in terms of performance at 2 different operating con-
ditions: Qn = 2300 m3/h and Qi = 2800 m3/h. At the nominal operating point of
2300 m3/h, a higher efficiency is wanted; for 2800 m3/h, a higher pressure rise
Δp is preferred in order to have an extended range of flow rate.
Two objectives, the efficiency at Qn = 2300 m3/h, namely ηn, and the pressure rise
at Qi = 2800 m3/h, namely Δpi, have been considered. The algorithm NSGA-
II [2] has been employed with 5000 individuals and 100 generations due to the
inexpensive model-based evaluations. For each point, 2 performance evaluations
have been done thanks to the model D, one evaluation at 2300 m3 /h and the
other at 2800 m3 /h. The bi-objective Pareto front is illustrated (Fig. 2b).
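To give an idea of how such a model-based NSGA-II run might look in code, a short sketch using the pymoo library is given below. The two objective expressions are a purely hypothetical stand-in for model D, and the variable bounds are illustrative; none of this reproduces the authors' actual implementation.

# Sketch of a bi-objective NSGA-II run on a cheap surrogate (hypothetical stand-in
# for model D): maximize efficiency at Qn and pressure rise at Qi, written as
# minimization of their negatives.
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

class FanProblem(ElementwiseProblem):
    def __init__(self):
        # design variables: tip and mid-span stagger angles (degrees), illustrative bounds
        super().__init__(n_var=2, n_obj=2, xl=np.array([-3.0, 0.0]), xu=np.array([0.0, 3.0]))

    def _evaluate(self, x, out, *args, **kwargs):
        gamma_t, gamma_m = x
        eta_n = 54.0 + 1.2 * gamma_m - 0.3 * gamma_m**2        # made-up efficiency surrogate
        dp_i = 185.0 - 5.0 * gamma_t - 2.0 * gamma_m           # made-up pressure-rise surrogate
        out["F"] = [-eta_n, -dp_i]                             # minimize the negatives

# The paper used 5000 individuals; a smaller population keeps this sketch fast.
res = minimize(FanProblem(), NSGA2(pop_size=100), ("n_gen", 100), seed=1, verbose=False)
print(res.F.shape)   # Pareto-front objective values (negated)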
In Fig. 2b, the initial individuals, marked as red points, form a 2-dimensional
projection on the objective surface ηn − Δpi. The frontiers are clearly depicted,
and the Pareto front can be seen in the top-right part, marked with black
dots and formed of 502 surviving individuals.
In order to illustrate the possible exploitation of the coupled model, 2
optima have been selected on the Pareto front: one favors the transparent point
(OptimB) and the other favors the efficiency at the nominal condition (OptimC).
The optimization results are compared with the reference configuration “Ref” in
Table 1.

Table 1. Result of multi-objective optimization at 2 operating conditions

Geometry  γt (°)  γm (°)  Q (m3/h)  Δp (Pa)  C (Nm)  η (%)
Ref 0 0 2300 226.4 0.9073 54.36
Ref 0 0 2800 184.8 0.9019 54.23
OptimB −2.88 0.04 2300 242.0 0.9689 54.36
OptimB −2.88 0.04 2800 199.8 0.9716 54.39
OptimC −2.65 0.99 2300 232.1 0.9163 55.20
OptimC −2.65 0.99 2800 184.8 0.9074 53.87

In Table 1, the numbers in italics are the values of the objectives concerned in
this optimization. By keeping the same efficiency as the reference geometry,
OptimB manages to raise the pressure rise at 2800 m3/h by 15 Pa, or 8.1%. If
the pressure rise at 2800 m3/h is kept unchanged instead, the efficiency at 2300 m3/h
can be improved by 0.84 points, i.e., 1.5% higher than the reference configuration.
These improvements are analyzed with respect to the flow structure modifications.

CFD simulations have been performed to validate the optimizations; the
maximum relative error is 1.3%, found for the Δp of OptimC at 2300 m3/h, an
error within the numerical uncertainty. The approach is thus validated for this multi-
objective optimization.

Conclusion
By using a mesh-morphing technique, aerodynamic databases are generated at
different locations in the design space, each of them centered on one reference
geometrical configuration or physics condition that has been previously used for
optimal design. The databases are analyzed and some useful regression coefficients
are collected through an ordinary least square method. The co-Kriging method has
been implemented to couple the databases. A second-order polynomial regression
and a Gaussian correlation model have been employed for the model. An adapta-
tion of the classical co-Kriging method has been made so that it works for a
dimension with one single reference value.
The proposed approach is applied to the optimal design of an engine cooling
fan. 4 databases have been obtained, corresponding to 4 polynomial models. 3
coupled models were created based on these databases, among which model D,
being the most reliable one, is used for a model-based optimization. The results
showed obvious improvements on both the aerodynamic efficiency and the
pressure rise of the engine cooling fan. One multi-objective optimum succeeded in
enlarging the operating range of the fan, while the other managed to keep the same
range and improve the efficiency at the nominal condition.
The approach could possibly be coupled with sensitivity equation meth-
ods, where the Navier-Stokes equations are implicitly differentiated to obtain
derivatives in a much more economical way.

References
1. Constantine, P.G.: Active Subspaces: Emerging Ideas for Dimension Reduction in
Parameter Studies, vol. 2. SIAM, Philadelphia, PA (2015)
2. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast elitist non-dominated
sorting genetic algorithm for multi-objective optimization: NSGA-II. Lecture Notes
in Computer Science, vol. 1917. Springer, Berlin, Heidelberg (2000)
3. Forrester, A., Keane, A., Bressloff, N.W.: Design and analysis of “noisy” computer
experiments. AIAA J. 44(10), 2331–2339 (2006)
4. Wang, G., Shan, S.: Review of metamodeling techniques in support of engineering
design optimization. ASME J. Mech. Des. 129(4), 370–380 (2007)
5. Giraldo, R., Dabo-Niang, S.: Statistical modeling of spatial big data: an approach
from a functional data analysis perspective. Stat. Prob. Lett. (2018) (in press)
6. Han, Z., Zimmerman, R., Görtz, S.: Alternative cokriging model for variable-fidelity
surrogate modeling. AIAA J. 50(5), 1205–1210 (2012)
7. Jones, D.R.: A taxonomy of global optimization methods based on response
surfaces. J. Global Optim. 21(4), 345–383 (2001). https://doi.org/10.1023/A:
1012771025575
990 Z. Zhang et al.

8. Krige, D.: Statistical approach to some mine valuations and allied problems at the
witwatersrand. Master’s thesis, University of Witwatersrand (1951)
9. Laurenceau, J., Meaux, M., Montagnac, M., Sagaut, P.: Comparison of gradient-
based and gradient-enhanced response-surface-based optimizers. AIAA J. 48(5),
981–994 (2010)
10. Leifsson, L., Koziel, S., Tesfahunegn, Y.A.: Multiobjective aerodynamic optimiza-
tion by variable-fidelity models and response surface surrogates. AIAA J. 54(2),
531–541 (2016)
11. Lophaven, S.N.: Aspects of the matlab toolbox dace. Technical Report, University
of Denmark (2002)
12. March, A., Willcox, K.: Provably convergent multifidelity optimization algorithm
not requiring high-fidelity derivatives. AIAA J. 50(5), 1079–1089 (2012)
13. Matheron, G.: Principles of geostatistics. Econ. Geol. 58, 1246–1266 (1963)
14. Probst, D.M., Senecal, P.K.: Optimization and uncertainty analysis of a diesel
engine operating point using computational fluid dynamics. ASME 2016 Internal
Combustion Engine Division Fall Technical Conference, Greenville, South Carolina,
USA (2016)
15. Rendall, T.C.S., Allen, C.B.: Unified fluid-structure interpolation and mesh motion
using radial basis functions. Int. J. Numer. Methods Eng. 74, 1519–1559 (2014)
16. Rozenberg, Y., Benefice, G., Aubert, S.: Fluid structure interaction problems in
turbomachinery using rbf interpolation and greedy algorithm. In: ASME Turbo
Expo 2014: Turbine Technical Conference and Exposition, vol. 16, no. 1, p. 102
(2014)
17. Rumpfkeil, M.P.: Optimizations under uncertainty using gradients, hessians, and
surrogate models. AIAA J. 51(2), 444–451 (2013)
18. Schnoes, M., Nicke, E.: A database of optimal airfoils for axial compressor through-
flow design. ASME J. Turbomach. 139(5) (2017)
19. Villemonteix, J., Vazquez, E., Walter, E.: An informational approach to the global
optimization of expensive-to-evaluate functions. J. Global Optim. 44(4), 509–534
(2008)
20. Yamazaki, W., Mavriplis, D.J.: Derivative-enhanced variable fidelity surrogate
modeling for aerodynamic functions. AIAA J. 51(1), 126–137 (2013)
21. Zhang, Z., Demory, B.: Space infill study of kriging meta-model for multi-objective
optimization of an engine cooling fan. Turbine Technical Conference and Exposi-
tion. In: Proceedings of ASME Turbo Expo 2014, Dusseldorf, Germany (2014)
22. Zhang, Z., Buisson, M., Ferrand, P.: Meta-model based optimization of a large
diameter semi-radial conical hub engine cooling fan. Mech. Ind. 16(1), 102 (2015)
23. Zhang, Z., Han, Z., Ferrand, P.: High anisotropy space exploration with co-kriging
method. Global Optimization Workshops 2018 (LeGO). Leiden, Netherlands (2018)
24. Zhao, L., Choi, K.K., Lee, I.: Metamodeling method using dynamic kriging for
design optimization. AIAA J. 49(9), 2034–2046 (2011)
Kriging-Based Reliability-Based Design
Optimization Using Single Loop
Approach

Hongbo Zhang(B), Younes Aoues, Hao Bai, Didier Lemosse,
and Eduardo Souza de Cursi

Normandie Univ, INSA Rouen Normandie, LMN, Rouen, France
hongbo.zhang@insa-rouen.fr

Abstract. Reliability-Based Design Optimization (RBDO) is a powerful tool in
engineering structural design; it tries to find a balance between cost and reliability
for structural designs under uncertainty. Several RBDO formulations have been
developed to solve the RBDO problem, such as double loop methods, single loop
methods, and decoupled methods. Despite these new formulations of RBDO, they
are unable to deal with complex engineering problems, due to the computational
cost. The Kriging surrogate has been widely used to replace the time-consuming
mechanical constraints. In this paper, a single loop RBDO approach (SLA) is
coupled with the Kriging surrogate; the most probable points (MPPs) of each loop
are used as new sample points to update the Kriging model. The Kriging-SLA runs
iteratively until it reaches the convergence criterion. Compared with other sampling
methods, this method can be started with very few training points and converges
to the right minimum very efficiently. Two benchmark examples are used to
demonstrate this method.

Keywords: Reliability-based optimization · Single loop approach ·
Kriging surrogate · Adaptive sampling

1 Introduction

In reliability-based design optimization (RBDO), the objective is to find the
best set of parameters that minimizes the structural cost or cost function under
probabilistic constraints. A typical RBDO problem can be described as follows:
find the deterministic variables d and the mean values μX such that:

\min f(d, \mu_X)
\text{s.t. } \mathrm{Prob}[G_i(X, d)] \le F_{f_i}^{t}, \quad i = 1, 2, \dots, m \qquad (1)
d^L \le d \le d^U, \quad \mu^L \le \mu \le \mu^U


where f(d, μ) is the cost function or objective function, G_i(X, d) is the i-th limit
state function, Prob[G_i(X, d)] is its probability of failure, F_{f_i}^t is the i-th
target probability of failure, and m is the number of limit state functions.
To solve the RBDO problem, many algorithms have been proposed and can
be summarized as double loop methods, single loop methods, and decoupled
methods [1].
Double-loop methods aim to solve the RBDO problem in two loops, the outer
loop tries to solve the optimization problem by changing the design variables,
while the inner loop solves the reliability constraints. These methods include
the simple Monte Carlo simulation (MCS), which is straightforward but needs
large sample sets and becomes prohibitive when the probability of failure is
low [9]. Approximation methods have been proposed to approximate the prob-
ability of failure. Enevoldsen and Sørensen (1994) have proposed the Reliability
Index Approach (RIA) [6], and Tu and Choi (1999) have proposed the Performance
Measure Approach (PMA) [15], which has proven to be more robust and efficient
in evaluating inactive probabilistic constraints. These two first-order reliability
methods (FORM) are easy to implement but are time-consuming for complex
constraints, because each time the design variables are changed, the inner
loop must calculate the reliability constraints iteratively, which becomes very
computationally expensive for complex engineering problems.
To reduce the computational cost of the double loop approach, single
loop approach methods and decoupled approaches have been proposed. Mad-
sen and Hansen (1992) have proposed a method based on the Karush-Kuhn-
Tucker (KKT) optimality conditions [12], where the RBDO problems are trans-
formed into KKT optimality conditions. Liang et al. (1997) have built on the KKT
method and further developed a Single Loop Approach (SLA) [10]; in the SLA,
the nested RBDO problem is transformed into equivalent deterministic single-
loop processes. Du and Chen (2004) have proposed the Sequential Optimization
and Reliability Assessment (SORA) method [4], and Cheng et al. (2006) have pro-
posed a Sequential Approximate Programming (SAP) method [2]; these meth-
ods all try to separate the reliability analysis from the optimization loop and
transform the RBDO problem into deterministic optimization loops to improve
efficiency.
For complex engineering problems, metamodels are widely used to substitute
complex reliability constraints. Ju and Lee (2008) have used a Kriging metamodel
and the moment method to solve the RBDO problem [3]. Lee and Jung (2008) have
proposed a constraint boundary sampling (CBS) method that adds more training
points on the limit state functions and have used MCS to solve the reliability
problem [9]. Chen et al. (2014) have proposed a local adaptive sampling (LAS)
method that adds points around the current design points to update the Kriging
metamodel, and have used FORM to perform the reliability analysis [1]. Dubourg
and Sudret (2013) have used an importance sampling (IS) method to build the Krig-
ing model and have used MCS to perform the reliability analysis [5]. Zhuang
and Pan (2012) have proposed a sequential sampling for Kriging using the PMA
method, adding samples using the expected relative improvement (ERI) criterion,
which focuses more points on the current most probable point (MPP) [11]. Echard
et al. (2011) have proposed an active learning method by combining MCS and
Kriging, which uses the expected feasibility function (EFF) to find the best points
to update the surrogate [16].
These sampling methods separate the training of the Kriging metamodels from
the reliability analysis and use double-loop methods to solve RBDO problems.
To further improve the efficiency of Kriging-based RBDO, this paper combines
the Kriging metamodel with the Single Loop Approach (SLA). The Kriging
metamodel is updated using the Most Probable Points (MPPs) calculated at each
iteration of the SLA.
The paper is structured as follows. First, previous work on RBDO is discussed;
in part 2, the theory of the Kriging metamodel is briefly introduced;
in part 3, the Kriging-SLA method is introduced; in part 4, two well-known
benchmark problems are used to demonstrate the method. The last part is the
conclusion.

2 Theory of Kriging Metamodel


Kriging is a surrogate model based on regression over observed data sets [14].
Kriging gives a Gaussian estimate at an unmeasured location together with an
estimated mean squared error (MSE) at that point. The Kriging model can be
described as:
Y(x) = \sum_{i=1}^{k} \beta_i h_i(x) + Z(x)    (2)

where Y(x) is the estimate at input x, assumed to be a regression model: a
linear combination of basis functions h_i(x) with coefficients \beta_i plus a
stochastic process Z(x). The process Z(x) is assumed to have zero mean, and the
covariance between Z(x_i) and Z(x_j) at two points is defined as:

Cov(Z(xi ), Z(xj )) = σz2 Rz (θ, xi , xj) (3)

where \theta is the parameter to be determined, \sigma_z^2 is the variance of Z, and
R_z is the correlation function, which can take different forms, for example linear,
exponential or cubic correlation. The Gaussian correlation is very commonly
used and is defined as [13]:
R_z(\theta, x_i, x_j) = \exp\left[-\sum_{l=1}^{n} \theta_l \left(x_i^{(l)} - x_j^{(l)}\right)^2\right]    (4)

where the sum runs over the n components of the input points.

To determine the values of \theta, \beta_i and \sigma_z^2 from the observed data set
[x, Y(x)], maximum likelihood estimation (MLE) can be used. The likelihood function
of Eq. 2 is expressed as [7]:

L(\theta, \beta, \sigma_z^2 \mid x, Y(x)) = -\frac{n}{2}\ln(2\pi\sigma_z^2) - \frac{1}{2}\ln(|R|) - \frac{1}{2\sigma_z^2}(Y - F\beta)^T R^{-1}(Y - F\beta)    (5)

where R is the n \times n correlation matrix with elements R_z(\theta, x_i, x_j) and F
is the n \times k regression matrix with elements h_i(x_j). The maximum likelihood
estimates of \beta and \sigma_z^2 are obtained by setting the derivatives of Eq. 5 with
respect to \beta and \sigma_z^2 to zero, which yields:

\hat{\beta} = (F^T R^{-1} F)^{-1} F^T R^{-1} Y    (6)

\hat{\sigma}_z^2 = \frac{1}{n}(Y - F\hat{\beta})^T R^{-1}(Y - F\hat{\beta})    (7)
Based on these MLE estimates, the Kriging predictor (posterior mean) at x is given as:

\hat{Y}(x) = h(x)^T \hat{\beta} + \Psi(x)^T R^{-1}(Y - F\hat{\beta})    (8)

where \Psi(x) is the vector of correlations between the observed points and the
new prediction point.
The derivative \nabla\hat{Y}(x) of the prediction \hat{Y}(x) can be calculated directly from \hat{Y}(x):

\frac{\partial \hat{Y}(x)}{\partial x} = \frac{\partial h(x)^T}{\partial x}\hat{\beta} + \frac{\partial \Psi(x)^T}{\partial x} R^{-1}(Y - F\hat{\beta})    (9)

where

\frac{\partial \hat{Y}(x)}{\partial x} = \left(\frac{\partial \hat{Y}(x)}{\partial x^{(1)}}, \frac{\partial \hat{Y}(x)}{\partial x^{(2)}}, \ldots, \frac{\partial \hat{Y}(x)}{\partial x^{(n)}}\right)    (10)
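To make Eqs. (6)-(8) concrete, the following is a minimal NumPy sketch of an ordinary Kriging model, assuming a constant regression basis h(x) = 1 and the Gaussian correlation of Eq. (4) with a fixed \theta (in practice \theta is obtained by maximizing the likelihood of Eq. (5)). It is only an illustration, not the implementation used in this paper.

import numpy as np

def gauss_corr(XA, XB, theta):
    # Gaussian correlation of Eq. (4) between two sets of points
    d2 = ((XA[:, None, :] - XB[None, :, :]) ** 2 * theta).sum(axis=2)
    return np.exp(-d2)

def fit_kriging(X, y, theta):
    # Closed-form MLE estimates of Eqs. (6)-(7) for ordinary Kriging (h(x) = 1)
    n = X.shape[0]
    R = gauss_corr(X, X, theta) + 1e-10 * np.eye(n)  # small nugget for numerical stability
    F = np.ones((n, 1))                              # constant regression basis
    Rinv = np.linalg.inv(R)
    beta = np.linalg.solve(F.T @ Rinv @ F, F.T @ Rinv @ y)
    resid = y - F @ beta
    return {"X": X, "theta": theta, "beta": beta,
            "resid_w": Rinv @ resid,                 # R^{-1}(Y - F beta), reused in Eq. (8)
            "sigma2": float(resid @ Rinv @ resid) / n}

def predict(model, Xnew):
    # Kriging predictor of Eq. (8)
    psi = gauss_corr(Xnew, model["X"], model["theta"])
    return model["beta"][0] + psi @ model["resid_w"]

# Small check on the linear function x1 + x2 - 3 (constraint G1 of Sect. 4.1)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(8, 2))
y = X[:, 0] + X[:, 1] - 3.0
model = fit_kriging(X, y, theta=np.array([1.0, 1.0]))
print(predict(model, np.array([[2.0, 2.0]])))        # should be close to 1.0

The gradient of Eq. (9) can also be approximated by finite differences on this predictor, which is what the sketch in the next section does.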

3 Kriging-Based Single Loop Approach (Kriging-SLA)


The single loop approach (SLA) converts the probabilistic optimization problem
into an equivalent deterministic optimization by enforcing the Karush-Kuhn-Tucker
conditions of the performance measure approach (PMA) [10]. The PMA method can be
summarized as:
\min f(d, \mu_X)
s.t. G_i(X, d) \ge 0, \quad i = 1, 2, \ldots, n    (11)
     d^L \le d \le d^U, \quad \mu^L \le \mu \le \mu^U

The performance of Gi (X, d) is calculated with the inverse first-order relia-


bility problem as:

min Gi (U )
(12)
st.||U || = βit

where G_i(U) is the i-th RBDO constraint in the standard normal space U and \beta_i^t
is the target reliability index. At the optimal point (MPP), the KKT optimality
condition of Eq. 12 reads:

\nabla G(U) + \lambda \nabla H(U) = 0    (13)

where H(U) = \|U\| - \beta^t is the equality constraint of PMA and \lambda is the Lagrange
multiplier.

Considering \nabla H(U) = 2U, Eq. 13 yields:

U = -\left[\|\nabla G(U)\| / (2\lambda)\right]\left(\nabla G(U) / \|\nabla G(U)\|\right)    (14)

Considering that the length of the vector U in the normal space is equal to \beta_i^t,
Eq. 14 becomes:

U = -\beta^t \alpha    (15)
where \alpha is the normalized gradient vector of the constraint. Finally, the SLA
formulation can be summarized as:

\min f(d^{(k)}, \mu_X^{(k)})
s.t. G_i(d^{(k)}, X_i^{(k)}) \ge 0
     X_i^{(k)} = \mu_i^{(k)} - \alpha_i^{(k)} \sigma_X \beta_i^t
     \alpha_i^{(k)} = \frac{\sigma_X \nabla_X G_i(d^{(k)}, \mu_i^{(k-1)})}{\|\sigma_X \nabla_X G_i(d^{(k)}, \mu_i^{(k-1)})\|}, \quad i = 1, 2, \ldots, m    (16)
     d^L \le d \le d^U, \quad \mu^L \le \mu \le \mu^U

where u_i^{(k)} are the random design variables in the normalized space, d^{(k)} is the
vector of deterministic design variables, G_i(d^{(k)}, X^{(k)}) is the i-th constraint,
\beta_i^t is the target reliability index for the i-th constraint, and \alpha_i^{(k)} is the
normalized gradient vector of the i-th constraint. The SLA is run iteratively until
convergence: f(d^{(k)}, \mu_X^{(k)}) is minimized under the deterministic constraints
G_i(d^{(k)}, X^{(k)}) \ge 0. In each SLA loop, the MPP of each constraint is calculated,
and these MPPs are then used to update the Kriging surrogate. The design values are
moved iteratively until the convergence criteria are reached. The flowchart of
Kriging-SLA can be summarized as follows (a schematic code sketch is given after
the steps):
Step 1. A design of experiments of N samples (generated by Latin Hypercube
Sampling (LHS)) [x_1, x_2, \ldots, x_N] \in X and their limit state function evaluations
[G_i(x_1), G_i(x_2), \ldots, G_i(x_N)], i = 1, 2, \ldots, m, are used to train the first
Kriging surrogate; N is the number of training points.
Step 2. Start the SLA loop from k = 0; d^{(k)} is the vector of deterministic design
variables, lb and ub are the vectors of lower and upper bounds of \mu_x^{(k)}, \mu_x^{(k)}
is the vector of mean values of the design variables X, \sigma_X is the vector of
standard deviations, and \beta_i^t is the target reliability index of the i-th constraint.
Step 3. Set k = k + 1; the normalized gradient vector \alpha_i^{(k)} and the current
most probable point (MPP) X_i^{(k)} of each constraint are calculated using the
derivatives \nabla_\mu \hat{G}_i(d^{(k)}, X_i^{(k)}), which are obtained from the Kriging
surrogate.
Step 4. Calculate the true responses G_i(X_i^{(k)}) at the current MPPs and add the
new points and responses to the Kriging data set.
Step 5. Minimize f(d, \mu_X) under the SLA constraints G_i(d^{(k)}, X^{(k)}) \ge 0 and
compute the new d^{(k)} and \mu_X^{(k)}.
Step 6. Compare d^{(k)}, \mu_X^{(k)} with d^{(k-1)}, \mu_X^{(k-1)}; if
\|d^{(k)} - d^{(k-1)}\| \le \epsilon and \|\mu^{(k)} - \mu^{(k-1)}\| \le \epsilon, stop;
otherwise go to Step 3 and continue.
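A minimal Python sketch of Steps 1-6 is given below for problems whose design variables are only the means \mu of the random variables (as in the examples of Sect. 4). It reuses the fit_kriging and predict helpers sketched in Sect. 2, approximates the surrogate gradients by finite differences instead of Eq. (9), and relies on SciPy's SLSQP solver for Step 5; the function names, tolerances and defaults are ours, and the code is only schematic, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import qmc

def kriging_sla(obj, true_G, mu0, sigma, beta_t, bounds, n_init=4,
                theta=None, tol=1e-3, max_iter=50):
    # Schematic Kriging-SLA loop; true_G(x) returns the m limit state values at x
    dim, m = len(mu0), len(beta_t)
    theta = np.ones(dim) if theta is None else theta
    lb, ub = np.array(bounds, dtype=float).T

    # Step 1: initial design of experiments by LHS and first Kriging surrogates
    X = qmc.scale(qmc.LatinHypercube(d=dim, seed=0).random(n_init), lb, ub)
    Y = np.array([true_G(x) for x in X])
    models = [fit_kriging(X, Y[:, i], theta) for i in range(m)]

    def grad_hat(mdl, x, h=1e-4):
        # central finite-difference gradient of the Kriging prediction
        g = np.zeros(dim)
        for j in range(dim):
            e = np.zeros(dim); e[j] = h
            g[j] = (predict(mdl, (x + e)[None]) - predict(mdl, (x - e)[None]))[0] / (2 * h)
        return g

    mu = np.asarray(mu0, dtype=float)
    for k in range(max_iter):
        # Step 3: normalized gradients and approximate MPPs of Eq. (16)
        alphas, mpps = [], []
        for i, mdl in enumerate(models):
            g = sigma * grad_hat(mdl, mu)
            alpha = g / (np.linalg.norm(g) + 1e-12)
            alphas.append(alpha)
            mpps.append(mu - alpha * sigma * beta_t[i])
        # Step 4: true responses at the MPPs, then retrain the surrogates
        X = np.vstack([X] + [mp[None] for mp in mpps])
        Y = np.vstack([Y] + [np.atleast_2d(true_G(mp)) for mp in mpps])
        models = [fit_kriging(X, Y[:, i], theta) for i in range(m)]
        # Step 5: deterministic optimization under the shifted (SLA) constraints
        cons = [{"type": "ineq",
                 "fun": (lambda mdl, a, bt:
                         lambda m_: predict(mdl, (m_ - a * sigma * bt)[None])[0])(mdl, a, bt)}
                for mdl, a, bt in zip(models, alphas, beta_t)]
        res = minimize(obj, mu, bounds=bounds, constraints=cons, method="SLSQP")
        # Step 6: convergence check on the design variables
        if np.linalg.norm(res.x - mu) <= tol:
            return res.x, k + 1
        mu = res.x
    return mu, max_iter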
996 H. Zhang et al.

The number N of initial samples for building the surrogate can be very small.
Even though the first surrogate may fail to capture the main characteristics of
the constraints and the SLA may at first converge to an infeasible point, new
points, which are the best points to update the Kriging around the MPPs, are
added to the surrogate at every iteration, so the surrogate becomes progressively
more accurate around the optimum.
Two benchmark examples are given below to validate the proposed Kriging-SLA
method, and the results are compared with reference results from other authors.

4 Examples and Results


4.1 First Mathematical Example

This is a very well-known benchmark problem [8]. x is the realization of the random
vector X, composed of two independent normal variables with standard deviations
\sigma = 0.1. The objective function and constraints are given below. The target
reliability indices of both constraints are set to \beta_1^t = \beta_2^t = 2.

\min f(\mu_X) = (\mu_1 - 3.7)^2 + (\mu_2 - 4)^2
G_1(x) = x_1 + x_2 - 3
G_2(x) = -x_1 \sin(4x_1) - 1.1\, x_2 \sin(2x_2)
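For illustration, these functions can be coded directly and passed to a solver such as the loop sketched in Sect. 3; the function names, the starting point and the bounds in the commented call are ours and are not part of the original benchmark definition.

import numpy as np

def f(mu):
    # objective function of this example
    return (mu[0] - 3.7) ** 2 + (mu[1] - 4.0) ** 2

def G(x):
    # the two limit state functions
    return [x[0] + x[1] - 3.0,
            -x[0] * np.sin(4.0 * x[0]) - 1.1 * x[1] * np.sin(2.0 * x[1])]

# Illustrative call (assumed starting point and search bounds):
# mu_opt, iters = kriging_sla(f, G, mu0=[2.5, 2.5], sigma=np.array([0.1, 0.1]),
#                             beta_t=[2.0, 2.0], bounds=[(0.0, 5.0), (0.0, 5.0)])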

The optimization results are shown in Fig. 1.


In Fig. 1(a), only 4 points obtained with a grid sampling method are used to train
the first Kriging. The true constraints are drawn as dashed lines and the Kriging
predictions as solid lines. The first Kriging gives a comparatively good prediction
of the linear constraint G_1(x), but a very poor prediction of the nonlinear
constraint G_2(x) (which cannot be seen in Fig. 1(a)).
In Fig. 1(b), after the first iteration, 2 MPPs and the corresponding values of
G_1(x) and G_2(x) are added to the Kriging model (marked with solid dots). The mean
values of the random variables \mu_x are marked with + in the figure, and \mu_x^0 is
the starting point. The first point \mu_x^1 is not in the feasible domain, which means
that the SLA converges to a wrong point with the poorly trained Kriging.
The final result is shown in Fig. 1(c), where the reference optimum is also marked.
The program converges in 12 iterations, with 24 added points. The history of \mu_x
and of the objective function is shown in Fig. 1(d); the program gradually moves
\mu_x into the feasible domain.
For the comparison of the Kriging-SLA method, a reference RIA result and the results
of Chen et al. (2014) [1] are used. The relative error is calculated as
\|\hat{f} - f^*\| / \|f^*\|, where \hat{f} is the result obtained with the Kriging
surrogate and f^* is the Monte Carlo result from Chen's reference. These results are
shown in Table 1.
The comparison in Table 1 shows that the error of Kriging-SLA is at the same level
as the RBDO-RIA reference result, because SLA is based on the PMA approach. CBS can
achieve higher accuracy, but it needs more training points. LAS needs fewer training
points, but it should be noted that for the LAS, CBS, and SS methods, after the
Kriging surrogate is built, the reliability analysis must still be carried out
separately with MCS; if the failure rate is very low, large samples must be drawn
from the Kriging surrogate to obtain accurate results, which increases the total
computational cost.

Fig. 1. Lee and Jung's example: (a) first Kriging, (b) first update, (c) final result, (d) iteration history.

4.2 Second Mathematical Example

This is another well-known benchmark example [3] with two variables whose standard
deviations are \sigma = 0.3; the target reliability index of all three constraints is
set to 3. The reference result with the SLA method is \mu = [3.4391; 3.2864], with an
optimal objective function value of 6.7255.

The problem is given as:

\min f(\mu_X) = \mu_1 + \mu_2
G_1(x) = x_1^2 x_2 / 20 - 1
G_2(x) = (x_1 + x_2 - 5)^2 / 30 + (x_1 - x_2 - 12)^2 / 120 - 1
G_3(x) = 80 / (x_1^2 + 8x_2 + 5) - 1

Table 1. Comparison of Lee and Jung's example

Method        Design variables μ(opt)   Objective f(μ)   Total points   Relative error
Analytical    [2.8421; 3.2320]          1.3259           \              \
RBDO-RIA      [2.8162; 3.2769]          1.30402          \              1.65%
Kriging-SLA   [2.8166; 3.2759]          1.30482          28             1.59%
CBS (ref)     [2.8485; 3.2350]          1.3103           45             0.40%
SS (ref)      [2.8400; 3.2339]          1.3264           29             1.50%
LAS (ref)     [2.8408; 3.2334]          1.3259           22             1.60%

Fig. 2. Liang's example: (a) first Kriging, (b) first update, (c) final result, (d) iteration history.



The results are shown in Fig. 2.


The first Kriging is shown in Fig. 2(a). In this example, 4 points obtained with a
grid sampling method are used to train the Kriging metamodel, and the first Kriging
gives a poor prediction of the 3 constraints. After the first iteration, 3 MPPs
corresponding to the 3 constraints are added, as shown in Fig. 2(b). This time
\mu_x^1 is in the feasible domain, so the proposed method converges faster than in
the first example, with only 6 iterations; the history of \mu_x and of the objective
function is shown in Fig. 2(d).
In Table 2, the Kriging-SLA results are compared with the plain SLA method (without
Kriging); the relative error of Kriging-SLA is about 0.03%.

Table 2. Comparison of Liang's example

Method         Design variables μ(opt)   Objective f(μ)   Total points   Relative error
SLA reference  [3.4391; 3.2864]          6.7255           \              \
Kriging-SLA    [3.4390; 3.2887]          6.7277           22             0.03%

5 Conclusion
The proposed Kriging-SLA method solves RBDO problems robustly and accurately, and
it is well suited to engineering problems with complex reliability constraints.
The method needs few initial sample points because it does not seek to fit the
constraints globally; instead, it adds the best currently available points until
the surrogate locates the MPPs accurately. It is robust enough to converge to the
right optimum with very few initial points, even when the initial sampling fails
to capture the main characteristics of the constraints. The accuracy of Kriging-SLA
is in accordance with that of the SLA method without Kriging.

References
1. Chen, Z., Qiu, H., Gao, L., Li, X., Li, P.: A local adaptive sampling method
for reliability-based design optimization using kriging model. Struct. Multidiscip.
Optim. 49(3), 401–416 (2014)
2. Cheng, G., Xu, L., Jiang, L.: A sequential approximate programming strategy
for reliability-based structural optimization. Comput. Struct. 84(21), 1353–1367
(2006)
3. Cho, T.M., Lee, B.C.: Reliability-based design optimization using convex lineariza-
tion and sequential optimization and reliability assessment method. Struct. Saf.
33(1), 42–50 (2011)
4. Du, X., Chen, W.: Sequential optimization and reliability assessment method for
efficient probabilistic design. American Society of Mechanical Engineers (2002)
5. Dubourg, V., Sudret, B.: Meta-model-based importance sampling for reliability
sensitivity analysis. Struct. Saf. 49, 27–36 (2014)
6. Enevoldsen, I., Sørensen, J.D.: Reliability-based optimization in structural engi-
neering. Struct. Saf. 15(3), 169–196 (1994)
7. Forrester, A., Sobester, A., Keane, A.: Engineering Design via Surrogate Modelling:
A Practical Guide. Wiley (2008)
8. Lee, I., Choi, K., Du, L., Gorsich, D.: Dimension reduction method for reliability-
based robust design optimization. Comput. Struct. 86(13–14), 1550–1562 (2008)
9. Lee, T.H., Jung, J.J.: A sampling technique enhancing accuracy and efficiency of
metamodel-based RBDO: constraint boundary sampling. Comput. Struct. 86(13–
14), 1463–1476 (2008)
10. Liang, J., Mourelatos, Z.P., Tu, J.: A single-loop method for reliability-based design
optimization. American Society of Mechanical Engineers (2004)
11. Lv, Z., Lu, Z., Wang, P.: A new learning function for kriging and its applications to
solve reliability problems in engineering. Comput. Math. Appl. 70(5), 1182–1197
(2015)
12. Madsen, H., Hansen, P.F.: A comparison of some algorithms for reliability based
structural optimization and sensitivity analysis, pp. 443–451. Springer (1992)
13. Rasmussen, C., Williams, C.: Gaussian processes for machine learning. MIT Press
(2006)
14. Sacks, J., Welch, W.J., Mitchell, T.J., Wynn, H.P.: Design and analysis of computer
experiments. Statistical science, pp. 409–423 (1989)
15. Tu, J., Choi, K.K., Park, Y.H.: A new study on reliability-based design optimiza-
tion. J. Mech. Des. 121(4), 557–564 (1999)
16. Zhang, J., Xiao, M., Gao, L.: An active learning reliability method combining
kriging constructed with exploration and exploitation of failure region and subset
simulation. Reliab. Eng. Syst. Saf. (2019)
Sensitivity Analysis of Load Application
Methods for Shell Finite Element Models

Wilson Javier Veloz Parra(B) , Younes Aoues, and Didier Lemosse

Normandie Univ, INSA Rouen Normandie, LMN, Rouen, France


wilson.veloz parra@insa-rouen.fr

Abstract. Wind turbine blades are subjected to wind pressure and inertial
loads from their rotational velocity and acceleration, which depend
on the external environment and the turbine control (start-up, normal
energy production, shut-down procedures, etc.). Several numerical
tools have been developed to compute the loads applied to wind turbine
blades. These numerical tools are generally based on multiphysics simulation
(aeroelasticity, aerodynamics, turbulence, etc.) and a multibody beam
finite element model of the whole turbine. However, when we are interested
in optimizing the structural blades, we need to use shell finite element
models in the structural analysis. Thus, the loads estimated with
the beam element are transformed into a 3D pressure load distribution
for the shell element. Several load application methods have been developed in
the literature. However, in the context of the structural reliability analysis
and optimization of wind turbine blades, the suitable method should
be selected with respect to its sensitivity to uncertain input parameters.
This study presents a sensitivity analysis of the output of two load application
methods for shell finite element models, with respect to uncertain
input parameters such as loads and material properties. The Morris method
is used to carry out the sensitivity analysis. Both load application methods
are sensitive to changes in the material thicknesses, which have a
greater effect than the distributed loads applied by section.

Keywords: Load application · Sensitivity analysis · Morris method ·


Shell finite element model

1 Introduction
Wind turbine blades are subjected to wind pressure and inertial loads from
their rotational velocity and acceleration, which depend on the external environment,
the electromagnetic generator torque and the turbine control (start-up, normal
energy production, shut-down procedures, etc.). Several numerical tools have been
developed to facilitate the design of wind turbine blades; usually, load calculation
is carried out with a beam Finite Element Model (FEM), taking into account the
aero-elastic behaviour, the turbine control commands and also the hydrodynamic
behaviour for off-shore turbines. Some examples of these codes are the Fatigue,
Aerodynamics, Structures, and Turbulence code [12] from the National Renewable Energy
Laboratory and the Horizontal Axis Wind turbine simulation Code 2nd generation
[15]. These multi-physics, multi-body, aero-servo-hydro-elastic beam finite
element codes are able to run a great number of the design situations described
by certification bodies [9], taking into account all the different extreme loads
acting on the structure and generating the loading history used for fatigue analysis.
The method to transform these loads from the beam finite element model to
a shell finite element model is defined as the Load Application Method (LAM)
[5]. The issue is to select the appropriate method to use in the context of the
Reliability-Based Design Optimization (RBDO) of the blades, one that ensures the
convergence of the optimization and reliability procedures while balancing
computational time and physical considerations in the load distribution.
A sensitivity analysis using the Morris method is carried out to compare two LAMs,
examining the sensitivity of the output responses (displacement and stress) with
respect to uncertain input parameters. The sensitivity analysis aims to identify
the uncertain parameters that are most significant for the variability of the output
responses and to select the appropriate LAM for the RBDO approach. In other
words, the main goal of the sensitivity analysis in the RBDO approach is to
reduce the stochastic dimension of the reliability analysis: only the uncertain
parameters that have a great influence on the output responses are considered in
the surrogate model, and the remaining parameters are fixed at their respective
mean values.

2 Load Application Methods

To transfer the 1D load distribution from the beam FEM to a 3D load distribu-
tion to be applied to a shell FEM, Caous [5] has classified the methods reviewed
in the literature into 4 groups, as listed below:

– Group 1: Application of loads by sections and on one point of each section


(Fig. 1(a)).
– Group 2: Application of loads by sections but physical distribution on sections
(Fig. 1(b)).
– Group 3: Continuous application on the blade of an oriented surface load
(pressure oriented in a specific direction).
– Group 4: Dissociation of inertial and aerodynamic loads with application of
an acceleration field and pressure distribution across the whole blade.

In this article, only the first two groups are studied by applying a Morris
sensitivity analysis.

2.1 Group 1: Application of Loads by Sections (LAM-RBE)

In this first group (Fig. 1(a)), the resultant loads from the beam FEM computed
at selected nodes along the blade span are applied directly into the shell model
at selected sections (that have the same position as the beam FEM), either

Fig. 1. Approaches for load application in a shell FEM of the blade [5].

through a master node which controls the whole section displacement via relations
between the node degrees of freedom (using Rigid Body Elements: RBE) [7,10], or
directly onto one node located close to the aerodynamic node of the beam finite
element [8].

Fig. 2. LAM used to compare: (a) RBE and (b) load distribution in four points [5].

The authors applied the resultant loads from the beam FEM directly to a shell
model without distinguishing between aerodynamic and inertial loads (Fig. 2(a)).
External forces and moments are extracted from ten nodes of the beam FEM and
applied to the shell model as FxRBE, FyRBE, FzRBE, MxRBE, MyRBE and MzRBE at each
section located at the same position as the beam element nodes.
All the nodes of each section of the shell model are linked by RBE, which makes
the sections undeformable. This approach is mostly used to model full-scale tests
on blades [3], as it provides a simple and fast application of the loads from the
beam to the shell FEM.

2.2 Group 2: Application of Loads by Sections but Physical


Distribution (LAM-4NO)
In the second group (Fig. 1(b)), load resultants are no longer applied to sections
directly by using RBE or by a few nodes, but are physically distributed across
the nodes of some sections [5]. An approach of this kind was presented in [2],
applying inertial and aerodynamic loads to the sections separately. Another example
of this approach was provided by the TENSYL company [4] and is the method used for
comparison in the sensitivity analysis. In this approach, the force and moment
resultants are distributed among four nodes of each section, assuming simplified
phenomenological laws.
Loads are applied by section and distributed across the section in such a way that
the sections remain deformable. External forces and moments extracted from the beam
FEM are distributed over four points of each of the corresponding ten shell FEM
sections. As explained by Caous [5], the forces at the four points are applied as
shown in Fig. 2(b). In addition, some relations are imposed to ensure that the
forces in the drag direction are equal at all four points and that the lift force
is twice as high on the leading edge as on the trailing edge, depending on the
location of points C and D. This method has been validated by the certification
body DNV-GL for applying loads to a shell FEM of a wind turbine blade as part of a
blade assessment [5].
The two methods described above are subjected to a sensitivity analysis in order to
qualify their effect, first on the displacement at the tip of the structure and then
on the stress in one ply of the material, knowing that both methods introduce stress
concentrations originating from the nature of the load application.

3 Shell Finite Element Model


In order to simplify the shell FEM, the wind turbine blade was substituted
by a shell cylinder FEM with a constant diameter of 1 m, a length of 90 m
and a constant thickness of 0.02 m. The software used to analyze the shell
cylinder FEM was the open-source Code-Aster [6]. As a boundary condi-
tion, all degrees of freedom at the bottom of the cylinder were fixed. To
apply the loads from the beam FEM, ten selected sections are located at
Z = (0, 10, 20, 30, 40, 50, 60, 70, 80, 90) m and only for the second method, the
4 nodes are located at Θ = (0◦ , 90◦ , 180◦ , 270◦ ) degrees.
The composite materials used in the shell model were QQ1 and P2B, their
properties are listed in Table 1 and the distribution under the shell FEM are
2 plies of P2B and QQ1 with a thickness of 0.01 m each. Both materials were
extracted from SNL/MSU/DOE composite materials database [16].

4 Sensitivity Analysis: Morris Method


The sensitivity analysis aims to identify the most influential parameters of a
model and is often used when the model has a large number of input parameters.
Screening methods give a qualitative measure of sensitivity, that is, they can
order the input parameters by importance without quantifying their effect. The
method proposed by Morris [17] is the most popular and is the one used in this
study.
The Morris method provides qualitative sensitivity measures of the effect of an
uncertain parameter x_i on the model response y. These

Table 1. Mechanical properties of composite materials QQ1 and P2B [16].

Material properties                          QQ1    P2B
Longitudinal Young's modulus E1 (GPa)        33.1   101
Transversal Young's modulus E2 (GPa)         17.1   8.86
Poisson's ratio ν12                          0.27   0.22
Shear modulus G12 (GPa)                      6.29   6.37
Shear strength S (MPa)                       141    137
Longitudinal tensile strength XT (MPa)       843    1546
Longitudinal compressive strength XC (MPa)   687    1047
Transversal tensile strength YT (MPa)        149    80
Transversal compressive strength YC (MPa)    274    240
Density ρ (kg/m3)                            1919   1570

sensitivity measures are based on the computation of the elementary effect EEi
for each input parameter Xi , which is defined by the finite difference derivative
approximation:

EE_i = \frac{y(x_1, \ldots, x_{i-1}, x_i + \Delta_i, x_{i+1}, \ldots, x_n) - y(x_1, \ldots, x_n)}{\Delta_i}    (1)
 
where \Delta_i is the variation size, taken in the set \{1/(p-1), \ldots, 1 - 1/(p-1)\},
and p is the number of levels (for example, 5 levels). If the model has n uncertain
parameters, the associated elementary effects are calculated using an individually
randomized one-factor-at-a-time experiment composed of n + 1 experimental points.
The impact of changing one factor at a time is evaluated in turn. Considering m
different experimental design plans, a statistical analysis provides the mean \mu^*
of the absolute elementary effects. This measure estimates the global influence of
the parameter x_i: a high value of \mu^* indicates that the input parameter has an
important overall influence on the output. This sensitivity measure is defined by:

1 
j=m
μ∗ = |EEij | (2)
m j=i

In addition, the standard deviation \sigma^* of the elementary effects is estimated by:

\sigma^* = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(EE_i^j - \mu^*\right)^2}    (3)

A high value of the standard deviation \sigma^* indicates that the parameter x_i is
involved in interactions with other parameters or that its effect is nonlinear.
These sensitivity measures, the absolute expected value \mu^* and the standard
deviation \sigma^*, are normalized to [0, 1] using the equations below:

\mu_N = \frac{\mu^* - \min(\mu^*)}{\max(\mu^*) - \min(\mu^*)}, \qquad \sigma_N = \frac{\sigma^* - \min(\sigma^*)}{\max(\sigma^*) - \min(\sigma^*)}    (4)
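To make these measures concrete, the following is a minimal NumPy sketch of Eqs. (1)-(3) using randomized one-factor-at-a-time trajectories on the unit hypercube. The study itself relies on the OpenTURNS implementation; the function below and the toy model at the end are ours and only illustrative.

import numpy as np

def morris_measures(model, n_inputs, m_traj=10, p=4, seed=0):
    # Elementary effects (Eq. 1) and the measures mu* (Eq. 2) and sigma (Eq. 3)
    # for a scalar model defined on the unit hypercube [0, 1]^n_inputs
    rng = np.random.default_rng(seed)
    delta = p / (2.0 * (p - 1))                  # usual variation size (p even)
    grid = np.arange(p) / (p - 1)                # the p levels in [0, 1]
    base = grid[grid <= 1.0 - delta + 1e-12]     # base levels keeping x + delta on the grid
    ee = np.zeros((m_traj, n_inputs))
    for t in range(m_traj):
        x = rng.choice(base, size=n_inputs)      # random starting point of the trajectory
        y0 = model(x)
        for i in rng.permutation(n_inputs):      # change one factor at a time
            x_new = x.copy()
            x_new[i] += delta
            y1 = model(x_new)
            ee[t, i] = (y1 - y0) / delta         # elementary effect of factor i (Eq. 1)
            x, y0 = x_new, y1
    mu_star = np.abs(ee).mean(axis=0)            # Eq. (2)
    sigma = ee.std(axis=0)                       # Eq. (3)
    return mu_star, sigma

# Toy check: one strongly nonlinear input and one weak linear input
mu_star, sigma = morris_measures(lambda x: 5.0 * x[0] ** 2 + 0.1 * x[1], n_inputs=2)

The normalization of Eq. (4) is then a simple min-max rescaling of the two returned vectors.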

In this study, the Morris method is used to carry out the sensitivity analysis,
which allows determining the most influential parameters and also reducing the
number of input parameters that will be considered later in other sensitivity
analyses, such as the Sobol approach. The screening analysis is implemented using
the software OpenTURNS [1], which is incorporated in the open-source software
Salome-Meca and allows the interaction between Code-Aster and OpenTURNS. Results
obtained using this method are presented in the next section.

5 Sensitivity Analysis of Load Application Methods


The load application methods described above, LAM-RBE and LAM-4NO, are analyzed and
compared using the Morris screening method. Two case studies are considered for both
methods. In the first case study, the input parameters are the applied forces and
moments at each section of the shell FEM (F_X^i, F_Y^i, F_Z^i, M_X^i, M_Y^i and
M_Z^i) and the thickness and fibre orientation of both composite materials
(EP1, EP2, ORI1 and ORI2). The output parameters for both load application methods
are the displacement in the Y direction at the top of the cylinder and the stress
\sigma_{22} in the first ply of the composite material; for the LAM-RBE method the
stress is taken at a node of the 5th section at 90° (S590°), and for the LAM-4NO
method at the 7th section at the same angle (S790°). In the second case study, the
input parameters are only the applied loads.
The forces in the X and Y directions are generated using the following equations:

F_X(z) = \frac{1}{2}\rho A V(z)^2 c_d, \qquad F_Y(z) = \frac{1}{2}\rho A V(z)^2 c_l    (5)

where \rho is the density of air, A the reference area, V(z) the wind speed at height
z, c_d the drag coefficient and c_l the lift coefficient. The wind speed is calculated
using the average wind gradient at the boundary of the atmospheric surface layer
developed by Panofsky and Dutton [18]:

\bar{u}(z) = \bar{u}(z_{ref}) \left(\frac{z}{z_{ref}}\right)^q    (6)

selecting \bar{u}(z_{ref}) = 11 m/s, z_{ref} = 10 m and q = 0.27, the Hellmann exponent
for stable air above an open water surface [13]. The lift and drag coefficients were
selected as 1 and 0.47 respectively, in order to generate the same distribution of
forces in all sections but with different magnitudes in the two directions.
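As a short numerical illustration of Eqs. (5)-(6) for the ten load sections of the cylinder, the snippet below computes the section forces; the air density and the reference area per section are assumed values chosen only to make the example run, they are not given in the study.

import numpy as np

rho = 1.225            # air density in kg/m^3 (assumed standard value)
A = 1.0 * 10.0         # reference area per section in m^2 (assumed: 1 m diameter x 10 m spacing)
cd, cl = 0.47, 1.0     # drag and lift coefficients used in the study
u_ref, z_ref, q = 11.0, 10.0, 0.27

z = np.arange(0.0, 100.0, 10.0)          # the ten section heights Z = 0, 10, ..., 90 m
v = u_ref * (z / z_ref) ** q             # wind profile of Eq. (6)
Fx = 0.5 * rho * A * v ** 2 * cd         # drag-direction force of Eq. (5)
Fy = 0.5 * rho * A * v ** 2 * cl         # lift-direction force of Eq. (5)
print(np.round(Fy, 1))                   # increases from 0 at the base to a maximum at the top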
Both forces from Eq. 5 are applied to a 1D beam FEM of the cylinder created in
Code-Aster, from which the forces and moments at each section to be applied to the
shell FEM of the cylinder are extracted. In the Morris screening method, all the
input parameters are considered uniform random variables with an interval of ±10%
around their initial values.

5.1 Case 1(a): Materials Properties and Loads as Inputs and


Displacement as Output
The results of the first case for the displacement output are shown in Fig. 3(a).
For both load application methods, the thicknesses (EP1, EP2) of QQ1 and P2B,
respectively, are the most sensitive input parameters. For LAM-RBE, EP1 of QQ1 has
the strongest linear effect (\mu^* = 100.99), whereas for LAM-4NO, EP1 has the
strongest non-linear effect (\sigma^* = 33.79). In fact, the sensitivity of the
thickness EP2 is the same for both LAMs. These results show that the thicknesses of
QQ1 and P2B have the highest non-linear effect when the load application method
LAM-4NO is used, whereas they have a linear effect when LAM-RBE is used. In other
words, for the displacement output, the thicknesses have a different type of
influence depending on the selected LAM.

Fig. 3. Histograms of the normalized sensitivity measures (\mu_N, \sigma_N) of load application methods RBE and 4NO for case 1: (a) displacement, (b) stress \sigma_{22} (categories: EP1, EP2, other variables).

5.2 Case 1(b): Materials and Loads as Inputs and Stress as Output
Parameters

Figure 3(b) shows the results for the stress \sigma_{22}. For LAM-RBE, the thicknesses
of both materials have the highest sensitivity, and EP1 has the maximum values of
both measures (\mu^* = 1.4e7 and \sigma^* = 1.5e7), thus also the strongest
non-linear effect. For LAM-4NO, the thicknesses also have a non-linear effect, but
not as pronounced as for LAM-RBE.

5.3 Case 2(a): Loads Only as Input Parameters and Displacement


as Output Parameter
In order to analyse the interaction between the load actions in the load application
methods, this case considers only the load actions as input parameters. Indeed, as
shown above, when the load actions are analyzed together with the thicknesses, their
effect can be neglected.
Figure 4(a) shows the results for case 2. It can be noticed that, for both LAM-RBE
and LAM-4NO, the forces in the Y direction have a linear effect on the response of
the shell model; the effect is highest at the top of the shell FEM and decreases as
the location of the force approaches the bottom. It should be noted that F_Y^9 for
LAM-4NO has the maximum sensitivity (\mu^* = 0.0025). All the other loads have an
insignificant effect: their maximum value is \sigma^* = 1.00082e−13 at M_Z^9, which
is negligible compared with \mu^*.

Fig. 4. Histograms of the normalized sensitivity measures of load application methods RBE and 4NO for case 2: (a) displacement, (b) stress \sigma_{22} (categories: the section forces and moments Fx0-Mz9).

5.4 Case 2(b): Loads Only as Input Parameters and Stress as


Output Parameter
Figure 4(b) shows the results of the sensitivity measures when the stress is the
output of the model. For LAM-RBE, where the stress is measured at S590°, all the
forces in the Y direction above this section have the highest linear effect and
nearly equal values, \mu^* = (899.72, 899.39, 899.17, 899.08) for F_Y^6, F_Y^7,
F_Y^8 and F_Y^9 respectively; below this section only F_Y^4 has a linear effect, but
not as significant as the other forces. For LAM-4NO, where the stress is measured at
S790°, the sensitivity pattern is slightly different: not all the forces above the
section have the same linear effect, although for both LAMs the force at the top has
the most significant linear effect. On the other hand, in this method M_Z^7 has a
significant effect, with the maximum values \mu^* = 429.38 and \sigma^* = 0.00034.

6 Conclusion and Perspectives


A Morris screening analysis was used to study two load application methods: the
first applies the loads from the beam to the shell FEM by sections through one
master node using RBE, and the second applies the loads by sections but with a
physical distribution over the sections (4 nodes). A simplified representation of a wind
turbine blade was used to compare both methods, creating a cylinder shell FEM
having a constant distribution of two composite materials QQ1 and P2B.
Different cases were analyzed for two output parameters (the displacement and the
maximal stress \sigma_{22}) and for two groups of input parameters. In the first case,
the input parameters are the loads by sections from the beam FEM and the thickness
and fiber orientation of both composite materials. In the second case, the input
parameters are only the loads, in order to qualify the effect of each LAM; the
outputs are the same, namely the displacement and the maximal stress \sigma_{22}. The
sensitivity results show that the thickness of the material has a huge impact in
both responses of the LAMs. For the LAM-4NO, the thickness has a non-linear
effect and for the LAM-RBE a linear effect for displacement output and a non-
linear effect for the stress output. In fact, the thicknesses have a linear effect on
the displacement output in LAM-RBE because the method assumes the sections to be
rigid body elements: all nodes of a section have constrained degrees of freedom and
move in the same direction and by the same magnitude without changing the shape of
the section. Nevertheless, in LAM-4NO, each node is free to
move and deform the section. For stress output, both methods have the same
non-linear effect because both methods introduce stress concentration, LAM-
RBE by making the section a rigid body element and LAM-4NO by applying
the load directly to a node.
The sensitivity results of the second case study show that, for the displacement
output, the applied forces in the corresponding direction have the most significant
effect. The greatest effect is obtained when the force is applied at the top of the
shell FEM, and it decreases as the force approaches the bottom. On the other hand,
the applied forces have the highest sensitivity, but always with a linear effect.
Ongoing work aims to perform a sensitivity analysis on a LAM from group four
[5,11,14], because these methods take into account the physical consistency of the
loads: aerodynamic and inertial loads. We can then compare it with the two methods
already analyzed and select the most suitable one for Reliability-Based Design
Optimization, and carry out a sensitivity analysis using Sobol indices to quantify
the sensitivity effects and to select the most significant input parameters for
building a surrogate model of the responses.

References
1. Baudin, M., Dutfoy, A., Iooss, B., Popelin, A.L.: OpenTURNS: an industrial soft-
ware for uncertainty quantification in simulation. In: Handbook of Uncertainty
Quantification, pp. 2001–2038 (2017)

2. Bottasso, C.L., Campagnolo, F., Croce, A., Dilli, S., Gualdoni, F., Nielsen,
M.B.: Structural optimization of wind turbine rotor blades by multilevel
sectional/multibody/3D-FEM analysis. Multibody Syst. Dynam. 32(1), 87–116
(2014)
3. Branner, K., Berring, P., Berggreen, C., Knudsen, H.W.: Torsional performance of
wind turbine blades–part ii: Numerical validation. In: 16th International Confer-
ence on Composite Materials, Anonymous, pp. 8–13 (2007)
4. Caous, D., Valette, J.: Methodology for G1 blade assessment. Technical report,
TENSYL, La Rochelle (2014)
5. Caous, D., Lavauzelle, N., Valette, J., Wahl, J.C.: Load application method for
shell finite element model of wind turbine blade. Wind Eng. 42(5), 467–482 (2018)
6. EDF: Finite element code Aster: analysis of structures and thermomechanics for
studies and research, 1989–2017. Open source, www.code-aster.org
7. Forcier, L.C., Joncas, S.: Development of a structural optimization strategy for
the design of next generation large thermoplastic wind turbine blades. Struct.
Multidiscip. Optim. 45(6), 889–906 (2012)
8. Griffith, D.T., Ashwill, T.D.: The sandia 100-meter all-glass baseline wind tur-
bine blade: Snl100-00. Sandia National Laboratories, Albuquerque, Report No.
SAND2011-3779, p. 67 (2011)
9. Guideline, G., Lloyd, G.: Guideline for the certification of wind turbines. German-
ischer Lloyd Wind Energie Gmb H, Hamburg (2010)
10. Haselbach, P.U., Bitsche, R., Branner, K.: The effect of delaminations on local
buckling in wind turbine blades. Renew. Energy 85, 295–305 (2016)
11. Hu, W., Choi, K., Zhupanska, O., Buchholz, J.H.: Integrating variable wind load,
aerodynamic, and structural analyses towards accurate fatigue life prediction in
composite wind turbine blades. Struct. Multid. Optim. 53(3), 375–394 (2016)
12. Jonkman, J.M., Buhl Jr., M.L.: FAST user's guide, updated August 2005. Technical
report, National Renewable Energy Laboratory (NREL) (2005)
13. Kaltschmitt, M., Streicher, W., Wiese, A.: Renewable Energy: Technology, Eco-
nomics and Environment. Springer Science & Business Media, Heidelberg (2007)
14. Knill, T.J.: The application of aeroelastic analysis output load distributions to
finite element models of wind. Wind Eng. 29(2), 153–168 (2005)
15. Larsen, T.J., Hansen, A.M.: How 2 HAWC2, the user’s manual. Technical report,
Risø National Laboratory (2007)
16. Mandell, J., Samborsky, D.: SNL/MSU/DOE composite material fatigue database
mechanical properties of composite materials for wind turbine blades version 25.0.
Montana State University (2016)
17. Morris, M.D.: Factorial sampling plans for preliminary computational experiments.
Technometrics 33(2), 161–174 (1991)
18. Panofsky, H.A., Dutton, J.A.: Atmospheric Turbulence: Models and Methods for
Engineering Applications. Wiley, New York (1984)
Transportation, Logistics, Resource
Allocation and Production Management
A Continuous Competitive Facility
Location and Design Problem for Firm
Expansion

Boglárka G.-Tóth1(B), Laura Anton-Sanchez2, José Fernández3, Juana L. Redondo4, and Pilar M. Ortigosa4
1 Department of Computational Optimization, University of Szeged, Szeged, Hungary
boglarka@inf.szte.hu
2 Department of Statistics, Mathematics and Informatics, Miguel Hernández University, Elche (Alicante), Spain
l.anton@umh.es
3 Department of Statistics and Operations Research, University of Murcia, Murcia, Spain
josefdez@um.es
4 Department of Informatics, University of Almería, Almería, Spain
{jlredondo,ortigosa}@ual.es

Abstract. A firm wants to expand its presence in a given geographical


region. The available budget can be invested in opening a new facility
and/or modifying the qualities of the existing firm-owned facilities. The
firm can also close some of its existing facilities in order to invest the
money formerly devoted to them to its other facilities or to the new one
(in case it is finally open). A MINLP formulation is proposed to model
this new problem. Both an exact interval branch-and-bound method and
an ad-hoc heuristic are proposed to solve the model. Some computational
results are reported.

Keywords: Facility location · Competition · Quality · MINLP ·


Interval analysis · Heuristic

1 The Model
When locating a new facility in a competitive environment, both the location
and the quality of the facility need to be determined jointly and carefully in
order to maximize the profit obtained by the locating chain. This fact has been
highlighted in [2] among other papers. However, when a chain has to decide how
to invest in a given geographical region, it may also invest part of its budget in
modifying the quality of other existing chain-owned centers (in case they exist)
up or down, or even in closing some of those centers in order to allocate the
budget devoted to those facilities to other chain-owned facilities or to the new
one (in case the chain finally decides to open it). In this paper, we extend the
single facility location and design problem introduced in [2] to accommodate
these possibilities as well.

This research has been supported by grants from the Spanish Ministry of Economy
and Competitiveness (MTM2015-70260-P and TIN2015-66680-C2-1-R), the Hungarian
National Research, Development and Innovation Office - NKFIH (OTKA grant
PD115554), Fundación Séneca (The Agency of Science and Technology of the Region
of Murcia, 20817/PI/18) and Junta de Andalucía (P12-TIC301), in part financed by
the European Regional Development Fund (ERDF).
The scenario is as follows. A chain has to decide how to invest its budget B in
a given area of the plane in order to maximize its annual profit. It may open one
new facility and/or close and/or modify the quality of its existing chain-owned
facilities. Let us assume that there already exist m facilities offering the same
goods or product in the area and that the first k of those m facilities belong to
the expanding chain (k < m). We assume k > 0, otherwise, the model reduces
to that in [2] when the chain is a newcomer. It is assumed in this paper that the
demand is fixed and concentrated at n demand points, although a similar model
with variable demand could also be considered (see [9]). Hence, the locations
pi and buying power wi at the demand points are known. The location fj and
present quality α̃j of the j-th existing facility are also known, for j = 1, . . . , m.
In [2] it was assumed that the qualities of the existing chain-owned facilities,
as well as the quality α0 of the new facility to be located, were within the interval
[αmin , αmax ], where αmin > 0 (resp. αmax ) was the minimum (resp. maximum)
value that the quality of a facility run by the chain could take in practice. Here,
we will assume the same, but now we have to take into account the closing of
an existing chain-owned facility, or not opening any new facility. To do it, we
will use a binary variable yj , that is, 1 if the j-th facility is kept open, and 0 if
it is closed (or not open for j = 0). In the later case, the quality of the facility
does not play any role, as the attraction of the facility will be 0, as we will see.
Notice that initially, αj = α̃j ∈ [αmin , αmax ], j = 1, . . . , k. Since the j-th facility
is already established, its area can hardly be modified. Hence, most likely its
j
quality can be upgraded from α̃j up to certain level αmax ≤ αmax . Hence, we
j
will assume that αj ∈ [αmin , αmax ] for all j ∈ {1, . . . , k}, where αj denotes the
final value for the quality of the j-th facility. For j = 0, which refers to the new
facility, we have that α0 ∈ [αmin , αmax ].
Of course, some types of costs must be taken into account, too. The most
obvious one is related to the opening of the new facility at a given location f0 with
a quality α0 (in case it is open). This annualized cost will be denoted by G(f0 , α0 )
and it is only incurred if the new facility is actually open. In that case, the actual
cost depends on the location and the quality of the facility. Analogously, we also
have to pay the annualized costs Aj , j = 1, . . . , k, of the facilities already open,
in case they are kept open in the current year. Conversely, the closing of an
existing facility j (j = 1, . . . , k) also implies a cost, Cj , as this usually implies
dismantling the facility, moving materials and furniture to another place, etc.
Another cost is incurred when the quality of an existing facility is varied, as this
usually requires some investment. Again, this annualized cost, Vj (αj ) should be

only taken into account when a variation in the quality occurs, and in that case,
the amount of the investment depends on how much the new quality αj of the
facility differs from the present quality α̃j . Finally, we also have to consider the
annual cost Rj (αj ) of running the facility j when its quality is αj .
The costs of the chain,

T(ns) = \sum_{j=1}^{k}\left(y_j\left(A_j + R_j(\alpha_j) + V_j(\alpha_j)\right) + (1 - y_j)\,C_j\right)    (1)
        \; + \; y_0\left(G(f_0, \alpha_0) + R_0(\alpha_0)\right),    (2)
include the annualized cost of having open the existing facilities (Aj ) plus the
annual cost of operating them (Rj ) plus the annualized cost of varying their
qualities (Vj ) or the cost of closing them (Cj ), (given by (1)), and the annualized
cost of opening and the annual cost of operating the new facility, in case it is
open (see (2)).
For the ease of notation, the variables (f0 , α0 , . . . , αk , y0 , . . . , yk ) of the model
will be denoted by ns. Other notation needed for the mathematical formulation
are the Euclidean distance between demand point pi and facility fj , dij (i =
1, . . . , n, j = 1, . . . , m) and similarly, the distance between demand point pi and
the new facility, di (f0 ). Besides, gi (·) is a non-negative non-decreasing function
which transforms the distance into the measurement of the attraction.
The patronizing behavior of customers is probabilistic, that is, customers’
demand is split among the facilities proportionally to the attraction they feel
for them. The attraction (or utility) that a demand point feels for a facility
depends on both the location of the facility and its quality, and may vary from
one demand point to another, as indicated by the parameter γi . At present, the
attraction (or utility) that demand i feels for facility j is ũij = γi α̃j /gi (dij ).
When the quality changes to αj (or is α0 for the new facility), it is given by
u_{ij}(\alpha_j) = y_j \frac{\gamma_i \alpha_j}{g_i(d_{ij})}, \qquad u_{i0}(f_0, \alpha_0) = y_0 \frac{\gamma_i \alpha_0}{g_i(d_i(f_0))}
Notice that due to yj , the attraction is 0 whenever a facility is closed (or not
open).
Based on these assumptions, the market share captured by the chain is

M(ns) = \sum_{i=1}^{n} w_i \frac{u_{i0}(f_0, \alpha_0) + \sum_{j=1}^{k} u_{ij}(\alpha_j)}{u_{i0}(f_0, \alpha_0) + \sum_{j=1}^{k} u_{ij}(\alpha_j) + \sum_{j=k+1}^{m} \tilde{u}_{ij}}.    (3)
The problem (P) of profit maximization can be formulated as follows:

max  \Pi(ns) = F(M(ns)) - T(ns)    (4)
s.t. T(ns) \le B    (5)
     d_i(f_0) \ge d_i^{min}, \quad i = 1, \ldots, n    (6)
     f_0 \in S \subset \mathbb{R}^2    (7)
     \alpha_j \in [\alpha_{min}, \alpha_{max}^j], \quad j = 0, \ldots, k    (8)
     y_j \in \{0, 1\}, \quad j = 0, \ldots, k    (9)

where F (·) is a strictly increasing differentiable function which transforms the


market share into expected sales, and Π(ns) is the profit obtained by the chain.
The total cost defined in (1)–(2) is constrained by the budget B in (5). The
parameter d_i^{min} > 0 is a given threshold, which guarantees that the new facility
is not located on top of demand point pi , see (6) (due to demand aggregation
(see [3]) pi is a point which usually represents a set of customers who occupy
a given area). Constraints (6)–(7) define the region of the plane where the new
facility can be located.
In this paper we assume function F to be linear, F (M (ns)) = c·M (ns), where
c is the income per unit of goods sold. As already highlighted in [2], function
G(f0 , α0 ) should increase as f0 approaches one of the demand points, since it
is rather likely that the opening cost of the facility will be higher around those
locations (due to the value of land and premises, which will make the cost of
buying or renting the location higher). On the other hand, G should be a convex
function in the variable α0 , since the more quality we expect from the facility
the higher the costs will be, at an increasing rate. We assume G to be separable,
in the form G(f_0, \alpha_0) = G_1(f_0) + G_2(\alpha_0), where
G_1(f_0) = \sum_{i=1}^{n} \Phi_i(d_i(f_0)), with
\Phi_i(d_i(f_0)) = w_i / ((d_i(f_0))^{\phi_{i0}} + \phi_{i1}), \phi_{i0}, \phi_{i1} > 0,
and G_2(\alpha_0) = e^{\alpha_0/\beta_0 + \beta_1} - e^{\beta_1}, with \beta_0 > 0 and
\beta_1 given values. Other possible expressions for F and G can be found in [2,11].
The annualized cost of varying the quality of the j-th facility from the present
level α̃j to αj , Vj (αj ), should decrease in the interval [αmin , α̃j ) and increase in
(α̃j , αmax ], since the bigger the difference |αj − α̃j |, the bigger the investment
required to do the modifications. Usually, Vj (α̃j +αj ) > Vj (α̃j −αj ), as upgrading
the quality of a facility is more difficult (more expensive) than downgrading it.
Similarly to G, Vj should be a convex function in the interval (α̃j , αmax ], since
the more quality we expect from the facility the higher the costs will be, at
an increasing rate. And the same is valid in the interval [αmin , α̃j ). Notice also
that, regardless of how small the variation is, a fixed cost v_j > 0 has to be paid
if a variation is carried out, since the modification of the quality of the facility
usually implies a temporary closure of (a part of) the facility. If vj is too high,
in practice this will prevent any variation in the quality of the facility. Also,
the fixed cost will prevent too small variations. A possible expression for Vj (αj )
could be the following one:
⎧1

⎨ δj (G2 (2α̃j − αj ) − G2 (α̃j )) + vj if αj < α̃j

Vj (αj ) = 0

⎪ if αj = α̃j

G2 (αj ) − G2 (α̃j ) + vj if αj > α̃j

In the previous expression, the parameter \delta_j > 0 determines how much cheaper
decreasing the quality is compared to increasing it.
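A small sketch of these cost functions, under the specific forms chosen in this paper, may help; the numerical values of \beta_0, \beta_1, \delta_j and v_j below are placeholders, not the values used in the computational study.

import math

def G2(alpha, beta0=8.0, beta1=5.2):
    # quality-dependent part of the opening cost, G2(a) = exp(a/beta0 + beta1) - exp(beta1)
    return math.exp(alpha / beta0 + beta1) - math.exp(beta1)

def V(alpha, alpha_tilde, delta=4.0, v_fixed=1.0):
    # annualized cost of changing a facility's quality from alpha_tilde to alpha
    if alpha == alpha_tilde:
        return 0.0
    if alpha < alpha_tilde:   # downgrading is cheaper by the factor delta
        return (G2(2 * alpha_tilde - alpha) - G2(alpha_tilde)) / delta + v_fixed
    return G2(alpha) - G2(alpha_tilde) + v_fixed     # upgrading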
Concerning Rj (αj ), which gives the annual operating cost of facility j when
its quality is αj , it should be nondecreasing as αj increases. Its functional
form may vary from convex to concave, linear, piecewise linear or other forms
depending on the type of facility. In this paper we will assume a linear form,
Rj (αj ) = oj αj , with oj > 0 a given constant.
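To fix ideas, the following sketch shows how the market share of Eq. (3) can be evaluated for a candidate solution ns; the attraction decay g(d) = 1 + d^2 used as a default is only an assumption for illustration. The profit of Eq. (4) is then obtained by applying F (here, multiplying by c) and subtracting the costs of Eqs. (1)-(2).

import numpy as np

def market_share(f0, alpha0, y0, alpha, y, p, w, fj, alpha_tilde, gamma, k,
                 g=lambda d: 1.0 + d ** 2):
    # Market share of Eq. (3): f0, alpha0, y0 describe the candidate new facility;
    # alpha, y are the qualities and open/closed flags of the k chain-owned facilities;
    # fj, alpha_tilde are the locations and present qualities of all m existing facilities
    m = fj.shape[0]
    M = 0.0
    for i in range(len(w)):
        u_chain = y0 * gamma[i] * alpha0 / g(np.linalg.norm(p[i] - f0))
        u_comp = 0.0
        for j in range(m):
            dij = np.linalg.norm(p[i] - fj[j])
            if j < k:                  # chain-owned facility, possibly closed or re-qualified
                u_chain += y[j] * gamma[i] * alpha[j] / g(dij)
            else:                      # competitor facility keeps its present quality
                u_comp += gamma[i] * alpha_tilde[j] / g(dij)
        M += w[i] * u_chain / (u_chain + u_comp + 1e-12)
    return M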
Problem (P ) is a Mixed-Integer NonLinear Programming problem (MINLP).
Hence, solving it is a challenge from the optimization point of view, and global
optimization tools are required to cope with it.

2 Solving the Location Model


2.1 An Exact Interval B&B Method
Branch-and-bound methods based on interval analysis [5,6] have been applied
to solve several continuous facility location problems (each variable is assumed
to vary within a real compact interval) [1,11,12]. In this paper, we propose to
extend interval B&B methods to cope with MINLP problems. The basic idea is
to make use of the differentiability of the function when the integer variables are
assumed to be continuous, and then, when the moment of discarding subregions
arrives, to take the integrality of the variables back into account so as not to
remove inappropriate parts.
Following the standard notation suggested in [7], boldface will denote inter-
vals, lower case will be used for scalar quantities or vectors (vectors are then dis-
tinguished from components by use of subscripts), and upper case for matrices.
Brackets “[·]” will delimit intervals, while parentheses “(·)” vectors and matrices.
Underlines will denote lower bounds of intervals and overlines give upper bounds
of intervals. For example, we may have the interval vector z = (z 1 , . . . , z n )T ,
where z i = [z i , z i ]. The set of intervals will be denoted by IR, and the set of
n-dimensional interval vectors, also called boxes, by IRn .
Definition 1. A function f : IRn → IR is said to be an inclusion function for
f : Rn → R provided {f (z) : z ∈ z} ⊆ f (z) for all boxes z within the domain of
f.
Next we describe some of the discarding tests that we have modified to handle
MINLP problems. In what follows, z will denote the vector of variables of the
problem (in our model, z is the 2k + 4-dimensional vector ns).

Feasibility Test. Let us denote the constraints in (5)-(7) by g_j(z) \le 0,
j = 1, \ldots, r, and let g_j(z) = [\underline{g}_j(z), \overline{g}_j(z)] be an
inclusion of g_j over the box z. We say that z certainly continuously satisfies the
constraint g_j(z) \le 0 if \overline{g}_j(z) \le 0, and that it certainly does not
continuously satisfy it if \underline{g}_j(z) > 0. The box z is said to be certainly
continuously feasible if it certainly continuously satisfies all the constraints
g_j(z) \le 0, j = 1, \ldots, r, certainly continuously infeasible if it certainly
does not continuously satisfy at least one of those constraints, and continuously
undetermined otherwise. A box is said to be certainly continuously strictly feasible
if \overline{g}_j(z) < 0, j = 1, \ldots, r. Observe that if z is certainly
continuously feasible, then any point z \in z satisfies all the constraints
g_j(z) \le 0, j = 1, \ldots, r. In particular, from any integer assignment an
(integer) feasible point for (P) can be obtained. The feasibility test discards the
boxes which are certainly continuously infeasible.

Monotonicity Test. Let z be a certainly continuously strictly feasible box, and let
\nabla\Pi(z) = (\nabla_1\Pi(z), \ldots, \nabla_{2k+4}\Pi(z))^T be an inclusion of the
gradient of the objective function \Pi over the box z. Then,

1. If 0 \notin \nabla_i\Pi(z) for a continuous variable z_i, then the box can be discarded.
2. If 0 \notin \nabla_i\Pi(z) for an integer variable z_i, then the box can be narrowed to the facet
   (a) (z_1, \ldots, \underline{z}_i, \ldots, z_{2k+4}) if \overline{\nabla_i\Pi}(z) < 0,
   (b) (z_1, \ldots, \overline{z}_i, \ldots, z_{2k+4}) if \underline{\nabla_i\Pi}(z) > 0.

We assume that the endpoints of integer variables are integer numbers. Notice that
when the variable z_i is integer the box cannot be removed, since the gradient may
change its sign in the box (z_1, \ldots, [\underline{z}_i - 1, \underline{z}_i], \ldots, z_n)
or (z_1, \ldots, [\overline{z}_i, \overline{z}_i + 1], \ldots, z_n), respectively.
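A small sketch may clarify how the test treats continuous and integer variables differently; it represents boxes as lists of (lower, upper) pairs and assumes that the gradient enclosures have already been computed with an interval library (PROFIL/BIAS in our implementation), so the numbers in the example are hypothetical.

def monotonicity_test(box, grad_box, is_integer):
    # box[i] = (lo, hi): bounds of variable i; grad_box[i]: interval enclosure of
    # the i-th partial derivative of the objective over a certainly strictly feasible box.
    # Returns None if the box can be discarded, otherwise the (possibly narrowed) box.
    new_box = list(box)
    for i, (glo, ghi) in enumerate(grad_box):
        if glo <= 0.0 <= ghi:       # zero inside the enclosure: no conclusion for this variable
            continue
        if not is_integer[i]:       # continuous variable, strictly monotone objective:
            return None             # the maximizer lies on a facet shared with another box
        lo, hi = new_box[i]
        new_box[i] = (lo, lo) if ghi < 0.0 else (hi, hi)   # keep the lower or upper facet
    return new_box

# Example with two continuous variables and one binary variable
box = [(0.0, 10.0), (0.0, 10.0), (0, 1)]
grad = [(-1.0, 2.0), (-3.0, 1.5), (0.2, 0.8)]              # hypothetical gradient enclosures
print(monotonicity_test(box, grad, is_integer=[False, False, True]))
# -> [(0.0, 10.0), (0.0, 10.0), (1, 1)]   (the binary variable is fixed to 1)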

Other discarding tests have also been designed following the same idea.

2.2 A Heuristic Evolutionary Algorithm

When designing a heuristic to solve the problem, some questions need to be


analyzed. For instance, should the new facility be opened? How to distribute
the available budget among the opened facilities, i.e., which ones should be
upgraded/downgraded? Should any of the existing facilities be closed in order to
invest the money formerly devoted to them to the other chain-owned open facil-
ities? In that case, which facility or facilities should be closed? When answering
these questions, it is important to know which the contribution of each facility
to the objective function is. As a surrogate, for each facility we compute its prof-
itability, to be understood as the ratio between the income it generates and the
cost incurred by the facility. To be more precise,
 n
 ui0 (f0 , α0 )
F wi k m
i=1 ui0 (f0 , α0 ) + j=1 uij (αj ) + j=k+1 ũij
profitab0 =
G(f0 , α0 ) + R0 (α0 )
 n
 uij (αj )
F wi k m
i=1 ui0 (f0 , α0 ) + j=1 uij (αj ) + j=k+1 ũij
profitabj = , j > 0.
Aj + Rj (αj ) + Vj (αj )

A pseudocode of the algorithm is given next. In the beginning, the available


budget, B̆, is the initial annual chain’s budget, B, minus the budget required to
keep all the existing chain-owned facilities opened with their present qualities
α̃j , B̃.
A Location Model for Firm Expansion 1019

Algorithm 1. Heuristic scheme


1 Close chain-owned facilities, even if B̆ > 0, in case Π(ns) improves.
/* If there is not enough budget to keep the existing
facilities opened, then some of them must be downgraded or
closed down. */
2 while B̆ < 0 and not all the chain-owned facilities are closed do
3 get more budget
4 repeat
/* We try to get more budget by downgrading or closing the
least profitable facility. */
5 if B̆ = 0 then
6 get more budget
7 Implement the best of the following two options:
/* Option I */
8 improve the quality of the chain-owned facilities.
9 open a new facility (it requires at least a budget B min ).
10 reopen previously closed facilities.
/* Option II */
11 if B̆ < B min and the new facility is not open then
12 repeat
13 get more budget
14 until B̆ ≥ B min or all the chain-owned facilities are closed
15 open a new facility (it requires at least a budget B min ).
16 improve the quality of the chain-owned facilities.
17 reopen previously closed facilities.
18 until Π(ns) does not improve or B̆ > 0

If in a given iteration more budget is required, then the quality of the least profitable facility should be reduced down to a given value, or the facility could even be closed (whichever is better). This procedure, called get more budget in Algorithm 1, provisionally reduces the quality of the least profitable facility whose quality can still be reduced, j1, down to α_{j1} = max{α^min, α_{j1}^{lp}}, where α_{j1}^{lp} is the solution of the equation profitab_{j1} = profitab_{j2} and j2 is the facility with the second least profitable ratio. Then, it computes an estimate of the profit that can be obtained with this reduction, using a small number of iterations of a Weiszfeld-like method. Analogously, the procedure computes an estimate of the profit that can be obtained by closing the least profitable facility, and it chooses the better option.
Notice that every time that a facility is closed or its quality is reduced, the
profitability ranking should be recomputed, as the market share captured by
the open facilities may change. Also, every time a facility is closed, a forbidden
area surrounding the facility should be included in the model, so as to prevent the
new facility from being located in the area where a facility has just been closed. Procedure open locates a new facility with the available budget, which must be at least B^min, using a modification of the multi-start Weiszfeld-like algorithm in [10].
The available budget at a given iteration is distributed using a greedy strat-
egy. The most profitable facility is allowed to vary its quality as much as needed
(provided that this does not surpass the budget and that the profit obtained
by the chain, Π(ns), improves). Once finished, if there is still some budget left,
the process is repeated with the second most profitable facility, and so on (pro-
cedure improve). In addition, if there are previously closed facilities, they can
be reopened if there is enough budget for it and the chain’s profit improves
(procedure reopen).
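The greedy distribution just described could be sketched as follows; this is a hypothetical illustration, with the facility data structure, the quality step and the profit evaluator Π(ns) all being assumptions rather than the authors' code:

def distribute_budget(facilities, budget, chain_profit, step=0.1):
    """Greedy sketch of the 'improve' procedure described above.

    facilities: list of dicts with keys 'id', 'quality', 'profitability'
                and 'unit_cost' (cost of raising the quality by `step`).
    chain_profit: callable mapping {facility id: quality} to Pi(ns).
    """
    qualities = {f['id']: f['quality'] for f in facilities}
    best = chain_profit(qualities)
    # Most profitable facility first, then the second one, and so on.
    for f in sorted(facilities, key=lambda f: f['profitability'], reverse=True):
        while budget >= f['unit_cost']:
            trial = dict(qualities)
            trial[f['id']] += step
            value = chain_profit(trial)
            if value <= best:                 # stop once Pi(ns) no longer improves
                break
            qualities, best = trial, value
            budget -= f['unit_cost']
    return qualities, budget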

3 Computational Studies
All the computational results in this paper have been obtained under Linux on
an AMD Athlon(tm) 64 X2 with 2.2 GHz CPU and 2 GB memory. The algo-
rithms have been implemented in C++. For the interval B&B method, we used
the interval arithmetic in the PROFIL/BIAS library [8], and the automatic dif-
ferentiation of the C++ Toolbox library [4].
We have generated a set of random problems in order to evaluate the perfor-
mance of the algorithms. They all have n = 100 demand points, and the number
m of existing facilities and the number k of those facilities belonging to the chain
considered were m = 3, 5 and k = 1, 2.
For each setting, 10 problems were generated by randomly choosing the
parameters of the problems uniformly within given intervals (or computed from
other parameters). The parameters are pi , fj ∈ S = ([0, 10], [0, 10]), ωi ∈ [0, 10],
γi ∈ [0.75, 1.25], α̃j ∈ [0.5, 5], φi0 = 2, φi1 ∈ [0.5, 1.5], β0 ∈ [7, 9], β1 ∈
[5, 5.5], δj ∈ [3, 5], c ∈ [12, 14], Ai ∈ [8, 11], Ci = Ai /2, oj = 20Aj .
These settings were obtained by varying up and down the value of the param-
eters of the quasi-real problem studied in [13], where a case of location of super-
markets in southeast Spain is studied. Nevertheless, when applying the model
to a particular problem those parameters have to be fine-tuned.
The search space for every problem was f0 ∈ S and αj ∈ [0.5, 5], j = 0, . . . , k.
For each problem, we computed the difference, in percentage, between the optimal objective value obtained by the B&B and the best solution obtained by the heuristic in 10 runs, as well as the number of runs in which the heuristic found that best solution. Table 1 shows the average values obtained for each (m, k) setting, with the standard deviation in brackets. As shown, we solved each problem with five different budgets. Hence, in all, 200 instances were generated.
The B&B method could solve all the instances with a relative accuracy of 0.0001. The CPU time needed by the method can be very large, although this is not always the case. The standard deviation shows that the required time varies greatly, from a few seconds to many hours. It also shows that the difficulty of the problems depends neither on the number of existing facilities nor on the chain length.

Table 1. Results for the problems with 100 demand points

Existing facilities  Chain length  Budget  Difference in Obj (%)  Times found  CPU seconds B&B  CPU seconds heuristic
3 1 0.8B̃ 0.21 (0.6) 2.20 (3.2) 7070 (13011) 18.1 (9.7)
B̃ 0.02 (0.0) 3.50 (3.8) 9883 (12117) 20.1 (8.4)
1.2B̃ 0.07 (0.2) 2.60 (3.3) 4406 (8300) 21.1 (10.5)
1.5B̃ 0.27 (0.6) 2.50 (4.1) 315 (432) 25.5 (10.3)
2B̃ 0.13 (0.4) 3.40 (4.7) 288 (343) 22.7 (12.2)
2 0.8B̃ 0.47 (0.9) 1.90 (3.2) 1207 (1572) 21.3 (15.8)
B̃ 0.53 (0.7) 1.00 (3.2) 3001 (6080) 22.9 (16.6)
1.2B̃ 1.49 (4.2) 1.20 (3.2) 4971 (13458) 65.4 (113.1)
1.5B̃ 0.14 (0.3) 3.30 (4.7) 6324 (13802) 51.0 (68.6)
2B̃ 0.09 (0.2) 4.00 (5.2) 6319 (13749) 45.3 (68.4)
5 1 0.8B̃ 0.51 (1.1) 1.00 (3.2) 580 (498) 17.3 (5.8)
B̃ 0.06 (0.1) 1.70 (3.2) 625 (456) 17.9 (7.0)
1.2B̃ 1.23 (3.5) 0.70 (1.1) 354 (257) 33.8 (51.0)
1.5B̃ 4.18 (11.0) 3.10 (4.3) 165 (107) 28.8 (31.5)
2B̃ 0.17 (0.5) 5.30 (3.8) 140 (180) 49.7 (69.3)
2 0.8B̃ 1.47 (4.6) 0.30 (0.9) 8888 (10474) 20.1 (11.8)
B̃ 0.10 (0.3) 0.60 (0.8) 11554 (13877) 21.3 (10.6)
1.2B̃ 1.06 (1.5) 0.90 (2.8) 7695 (12898) 24.5 (7.5)
1.5B̃ 0.01 (0.0) 1.00 (3.2) 8090 (14457) 21.9 (8.5)
2B̃ 0.02 (0.0) 1.00 (3.2) 8037 (14363) 29.3 (33.4)
All 0.51 1.72 3746 24.1

Interestingly, the cases with the setting (m = 5, k = 1) seem very easy compared to the others. Although we checked the results in more detail, we could not find any pattern explaining this behavior.
As we can see, the heuristic method finds solutions very close to the optimum. On average, the difference from the global optimum is 0.5%, and in the worst case it is still only 4.18%. Clearly, the elapsed time is much shorter, as on average only 24 seconds are required. Comparing the results for the different settings, no remarkable effects can be observed.

4 Conclusions and Future Research


Location Science is an important research area. Analyzing how to proceed when
a firm wants to expand its presence in a particular area is essential to achieve
success. In the model that we propose, given a budget, the company could open
a new facility, modify the qualities of its existing facilities or even close some
of them. The new formulation results in a MINLP optimization problem that


is solved through an exact interval branch-and-bound method and an ad-hoc
heuristic algorithm. Results have shown that both methods are able to solve
this mixed integer nonlinear programming problem within a reasonable time
and with good accuracy.
Future goals include accelerating the interval branch-and-bound method by implementing more discarding tests, and incorporating into the heuristic method new procedures for escaping from local optima.

References
1. Fernández, J., Pelegrı́n, B.: Using interval analysis for solving planar single-facility
location problems: new discarding tests. J. Global Optim. 19(1), 61–81 (2001)
2. Fernández, J., Pelegrı́n, B., Plastria, F., Tóth, B.: Solving a Huff-like competitive
location and design model for profit maximization in the plane. Eur. J. Oper. Res.
179(3), 1274–1287 (2007)
3. Francis, R., Lowe, T., Tamir, A.: Demand point aggregation for location models.
In: Drezner, Z., Hamacher, H. (eds.) Facility Location: Application and Theory,
pp. 207–232. Springer, Heidelberg (2002)
4. Hammer, R., Hocks, M., Kulisch, U., Ratz, D.: C++ Toolbox For Verified Com-
puting I: Basic Numerical Problems: Theory, Algorithms and Programs. Springer-
Verlag, Heidelberg (1995)
5. Hansen, E., Walster, G.W.: Global Optimization Using Interval Analysis - Second
Edition, Revised and Expanded. Marcel Dekker, New York (2004)
6. Kearfott, R.: Rigorous Global Search: Continuous Problems. Kluwer, Dordrecht
(1996)
7. Kearfott, R., Nakao, M., Neumaier, A., Rump, S.M., Shary, S.P., van Hentenryck,
P.: Standardized notation in interval analysis. Comput. Technol. 15(1), 7–13 (2010)
8. Knüppel, O.: PROFIL/BIAS - a fast interval library. Computing 53(1), 277–287
(1993)
9. Redondo, J., Fernández, J., Arrondo, A., Garcı́a, I., Ortigosa, P.: Fixed or variable
demand? Does it matter when locating a facility? Omega 40(1), 9–20 (2012)
10. Redondo, J., Fernández, J., Garcı́a, I., Ortigosa, P.: A robust and efficient global
optimization algorithm for planar competitive location problems. Ann. Oper. Res.
167(1), 87–106 (2009)
11. Tóth, B., Fernández, J.: Interval Methods For Single and Bi-Objective Optimiza-
tion Problems - Applied to Competitive Facility Location Problems. Lambert Aca-
demic Publishing, Saarbrücken (2010)
12. Tóth, B., Fernández, J., Csendes, T.: Empirical convergence speed of inclusion
functions for facility location problems. J. Comput. Appl. Math. 199, 384–389
(2007)
13. Tóth, B., Plastria, F., Fernández, J., Pelegrı́n, B.: On the impact of spatial pattern,
aggregation, and model parameters in planar Huff-like competitive location and
design problems. OR Spectr. 31(1), 601–627 (2009)
A Genetic Algorithm for Solving the
Truck-Drone-ATV Routing Problem

Mahdi Moeini and Hagen Salewski(B)

Chair of Business Information Systems and Operations Research (BISOR),


Technische Universität Kaiserslautern, 67663 Kaiserslautern, Germany
{mahdi.moeini,salewski}@wiwi.uni-kl.de

Abstract. In this paper, we introduce and investigate a new style of


delivery in last-mile logistics, in which we merge the existing concept of
conventional truck-based delivery with emerging technologies, i.e., drones
and autonomous robots (autonomous transport vehicles (ATVs)). More
precisely, in the Truck-Drone-ATV Routing Problem (TDA-RP), a truck,
carrying several drones and ATVs as well as the parcels, departs from
a depot, visits a given list of grid points, each of them at most once,
and returns to the depot by the end of the mission. In addition, at each
visited grid point, a set of drones and ATVs are tasked to deliver the
parcels to the customers via circumjacent operations. The objective consists in serving all customers in the shortest possible time. However, due
to the computational complexity of the problem, we cannot solve it by
exact methods. Hence, we suggest a Genetic Algorithm for solving the
problem and, through our computational experiments on randomly gen-
erated instances, we show the benefits of using a mixed fleet of drones
and ATVs assisting a truck.

Keywords: Traveling salesman problem · Last-mile logistics · Drone ·


Autonomous vehicle · Heuristics · Metaheuristics · Genetic algorithm

1 Introduction
A drone is an unmanned aircraft that flies mostly autonomously and relies on routing algorithms to find its way. Drones have started to play an increasing role in logistics [3,10]. Typically, drones can carry about 2 to 6 kg and reach speeds of up to 70 km/h (multirotor drones) or 130 km/h (fixed-wing drones). Because of their high maneuverability and relative ease of use, multirotor drones are useful for parcel deliveries to customers in urban areas. However, to overcome their limited range, a dense network of depots or micro-depots (e.g., DHL's SkyPort [5]) from which to start and land the drones might be needed. Hence, an alternative approach consists in using trucks that carry drones, which assist the driver in delivering parcels. More precisely, the advantages lie in the cheap, high-capacity long-distance transportation provided by trucks and in the possibility to charge the limited batteries of the drones, which have faster access to hard-to-reach
areas. Additionally, for delivering a single light parcel, drones are energy-efficient
and able to ignore the possibly congested road network. The combination of
trucks and drones received particular attention in the research community and
produced a number of articles ranging over different settings: a single truck
carrying one drone (e.g., [1,7,9]) up to a fleet of trucks with each truck carrying
multiple drones [11,13–15].
Parcel delivery by drones is a prominent and emerging industry; however, such deliveries face some issues, e.g., sensitivity to wind (which endangers drones), especially in urban areas. Furthermore, high population density requires strict safety measures, in particular if a drone fails in its mission. Consequently, in most western countries, regulation authorities do not allow the operation of fully autonomous drones; human operators are therefore needed to supervise the drones and require a communication link to them. This increases the true operational costs and imposes additional restrictions on the number of drones that can be operated in parallel and on the places where drones can be used.
Another approach combines trucks with smaller ground-based autonomous transport vehicles (ATVs). These cannot move as fast (up to 30 km/h on roads or 6 km/h on sidewalks), as far (about 3 to 10 km), or as freely as drones. However, since ATVs are allowed to travel in pedestrian areas and on sidewalks, they can use a different network than trucks, which might allow shorter distances. Compared to drones, ATVs are more energy-efficient, can carry much heavier parcels (up to 40 kg), and require less space to be launched from a truck. If ATVs are required by law to have an operator, a single operator might handle more ATVs than drones, thus reducing the operational costs per vehicle. Prototypes of trucks capable of dispatching up to six ATVs exist. Here again, the goal is to improve the performance of last-mile delivery systems. This idea was adapted in [2], where a routing model for a truck that dispatches ATVs at drop-off points is formulated. The truck might replenish its ATVs at decentralized micro-depots, and after serving a customer, the ATVs continue to such depots.
Since both approaches, assisting trucks with drones and combining trucks with ATVs, might be beneficial, including them in a single approach might be even more advantageous. Depending on the parcel's weight and the exact location of the customer, either a drone or an ATV could be used for delivering the parcel. Further, the truck is used to extend the limited range of the ATVs and drones. The system is more versatile than the use of a truck with only drones or only ATVs. In this study, we introduce a new concept in last-mile delivery in which a truck carries a mixed fleet of drones and ATVs, such that the truck dispatches and collects the drones and ATVs that perform the actual deliveries from designated points. In other words, the truck is not used for direct deliveries. From a practical point of view, this style of delivery system is useful for serving customers, in particular in cases where a truck might not be allowed in certain areas (e.g., due to ecological constraints, closed or too narrow roads, protected areas, etc.). Such a combined system should be cheaper than the widespread installation of micro-depots that launch drones or ATVs. Additionally, a
truck-drone-ATV-system could be used in areas where it is not possible to install


micro-depots or until micro-depots have been installed. Further, our approach
extends the concept introduced in [2] by explicitly scheduling the pick-up of the
same ATVs that have been used for the deliveries. Due to the computational
complexity of the resulting Truck-Drone-ATV Routing Problem (TDA-RP), we
introduce a Genetic Algorithm (GA) to solve the TDA-RP. Through computa-
tional experiments on randomly generated instances, we show the advantages of
using a mixed fleet of drones and ATVs that assist a truck for delivering parcels.
The remainder of the paper unfolds as follows: In Sect. 2, we provide a formal
description of the TDA-RP. Section 3 is devoted to the presentation of our GA
for solving the TDA-RP. Computational experiments, their numerical results,
and our observations are presented in Sect. 4. The last section contains our con-
clusions and suggested future research directions.

2 Problem Description

The TDA-RP is identified by a depot, a set of customers i ∈ I (each with a


single demand), a single truck that transports different parcels, one or several
drones d ∈ D, and one or several autonomous transport vehicles (ATVs) a ∈ A.
We assume that there are two types of parcels: light (small) parcels and heavy
(large) parcels. Drones can move faster than ATVs but can only deliver light
(small) parcels; however, the ATVs are able to deliver both types of parcels.
The truck can carry a maximum of dmax drones and amax ATVs. Moreover,
due to some infrastructure restrictions, e.g., very narrow roads or ecological
constraints, the truck cannot move on all roads, i.e., it cannot deliver the parcels
to the customers. However, the truck can move between a set of so-called grid
points, located with some distance to each other.
The depot and the grid points might be represented by nodes of a complete
graph such that all of them are connected to each other. We consider the cus-
tomer locations as additional nodes that are added to the graph of grid points
and the depot. We suppose that the customer locations are not connected to
each other but they are adjacent to all of the grid points. All edges are symmetric and weighted, where the weights indicate the Euclidean distance between connected nodes. In addition, if a connection is not possible, then the weight of the corresponding edge is set to +∞.
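As an illustration of this distance structure (a sketch with illustrative names; pairs that are not stored stand for the +∞ edges), the graph could be built as:

import math

def build_edge_weights(depot, grid_points, customers):
    """Sketch of the weighted graph described above.

    depot: (x, y); grid_points, customers: lists of (x, y) coordinates.
    The depot and the grid points form a complete subgraph; customer
    nodes are adjacent to every grid point but not to each other.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    truck_nodes = {('depot', 0): depot}
    truck_nodes.update({('grid', g): p for g, p in enumerate(grid_points)})

    w = {}
    # Complete (symmetric) subgraph on the depot and the grid points.
    for a, pa in truck_nodes.items():
        for b, pb in truck_nodes.items():
            if a != b:
                w[a, b] = dist(pa, pb)
    # Customers are adjacent to all grid points, but not to each other.
    for c, pc in enumerate(customers):
        for g, pg in enumerate(grid_points):
            d = dist(pc, pg)
            w[('cust', c), ('grid', g)] = d
            w[('grid', g), ('cust', c)] = d
    return w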
In the TDA-RP, given a set of grid points, a truck starts its mission from
the depot, visits a subset of grid points, and returns to the depot by the end of
the mission. From the depot or at any grid point, the drones and/or ATVs are
dispatched to deliver parcels to the customers through circumjacent operations.
More precisely, after delivering a parcel, the drone/ATV returns to the truck from which it started its mission. If the start and the landing of
the drone/ATV are at the same grid point, we call the operation a (direct) sortie.
However, if the truck has moved and the landing spot (of the drone/ATV) is
a grid point different from the dispatching one, we call the operation a jump
sortie. Furthermore, the drones and the ATVs have limited endurance (battery
capacity) and can serve only one customer per operation. But, their battery is
recharged immediately as soon as they return to the truck. The objective of
the TDA-RP consists in finding a feasible routing plan such that, all customers
are served by either drones or ATVs, and the total mission time is minimized.
Moreover, by the end of the mission, all drones and ATVs must be on the truck
and the truck must be at the depot.
Finally, we make the following additional assumptions about the behavior of drones and ATVs in a risk-free environment [9,11,13–15]:

– A drone or an ATV has a limited battery life of E time units. After returning
to the truck, the battery life of the drone or ATV is recharged immediately
with no delay in service.
– The trucks, drones, and ATVs follow the same distance metric.
– We suppose that the service time to launch and retrieve a drone or an ATV,
as well as the service time required to serve a customer are negligible.
– We assume that the drones/ATVs are in constant flight/movement and cannot conserve battery while in flight/movement; consequently, if a truck arrives earlier at a grid point, then the truck has to wait for its corresponding drones/ATVs, and vice versa.
– We assume that when a drone or an ATV is dispatched, then its delivery will
be successful.
– Due to technological restrictions like limited volume or the missing possibility
to securely divide the cargo hold, we assume that ATVs might only carry one
parcel, regardless of its size.

3 Solution Method

We introduce a two-step heuristic approach for solving the TDA-RP. In order


to describe the solution method, we use the example depicted in Fig. 1. In this
figure, there are one depot, two grid points, and six customers, where customers
1 to 3 receive large parcels, and customers 4 to 6 require small ones.
First, using the well-known Lin-Kernighan heuristic [8], we generate a TSP route1 for the truck (thick solid line in Fig. 1). Then, based on the TSP solution, we determine the starting and ending grid points for each delivery sortie using a genetic algorithm (GA) (see [6] and references therein). If the starting and landing grid points are identical, the parcels are delivered via a direct sortie (e.g., for customers 1 and 3 to 5). If they are different, a jump sortie is used (for customers 2 and 6). Included in the GA, we use a third heuristic component

1 A TSP route might then be polished to include only a subset of all grid points. However, it is ensured that the limited range of the ATVs as well as the drones is respected and that all customers can be reached from at least one grid point.
Fig. 1. An illustrative example of the TDA-RP.

to schedule the customer deliveries from each grid point. Since only ATVs are able to deliver large parcels, this component also includes the decision of whether a small parcel's delivery should be done by a drone (e.g., for customers 4 and 6) or by an ATV (customer 5). We design a GA that fits the specific structure of the TDA-RP. For this purpose, we use a direct problem representation and a fitness value defined by the mission time. In the following, we provide a detailed description of our GA.
We initialize the genetic algorithm by creating solutions for the first genera-
tion until we reach a maximum population size of P . In this phase of the GA,
we only use direct sorties. For this purpose, we set the starting and landing grid
point for each customer in a solution: With a probability of pinit , the heuristic
uses the grid point on the truck’s TSP-route that is closest to the customer’s
location. With a probability of 1 − pinit , the direct sorties start from a different
grid point on the truck’s TSP-route. To choose grid points that are closer to the
customers’ location with a higher probability, we use an ordered set of all grid
points from which the customer could be served. We order this set by the grid
point’s distance to the considered customer and then draw the grid point using
a Poisson distribution with parameter λinit .
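Assuming the set of grid points reachable from each customer is known, the construction of one first-generation individual might be sketched as follows (function and parameter names are illustrative, not the authors' code):

import numpy as np

def initial_direct_sorties(customers, reachable, dist, p_init=0.8, lam_init=0.8, rng=None):
    """Sketch of building one first-generation individual with direct sorties only.

    customers: iterable of customer ids; reachable[c]: grid points (on the
    truck's TSP route) from which customer c can be served; dist(c, g):
    distance between customer c and grid point g.
    """
    rng = rng or np.random.default_rng()
    solution = {}
    for c in customers:
        candidates = sorted(reachable[c], key=lambda g: dist(c, g))
        if rng.random() < p_init or len(candidates) == 1:
            g = candidates[0]                       # closest reachable grid point
        else:
            # Poisson-distributed rank: nearby grid points are chosen more often.
            k = min(1 + rng.poisson(lam_init), len(candidates) - 1)
            g = candidates[k]
        solution[c] = (g, g)                        # direct sortie: start == landing point
    return solution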
As a recombination operator, we use a proportional (roulette wheel) selection
and a two-point crossover to choose the parents and to generate the children,
respectively. After the recombination, we apply the mutation operator to the
entire population of all children and parents. During mutation, we introduce or
change a jump sortie of a randomly chosen customer’s delivery operation with
a predefined probability of pm . For this purpose, we change the starting grid
point (respectively, landing grid point) to an earlier (respectively, later) position
in the sequence of grid points in the truck’s TSP-route. We set a maximum
distance dmax and use a Poisson distribution with parameter λm to decide how
far down (respectively, up) the starting (respectively, landing) should move in the
sequence of the TSP-route. Whether the jump sortie is changed with respect to
the starting or landing grid point, depends on a preset probability pmd , where m
and d stand for mutation and distance, respectively. In particular, if the randomly
chosen move in the sequence is too large, i.e., would go beyond the depot in
any direction, then we limit the move to the depot, i.e., it is set as the delivery
operation's starting or landing point. It might happen that invalid solutions are introduced by jump sorties that use a combination of starting and landing grid points which violates the battery limit of an ATV or drone. At any stage, whenever an invalid solution is generated, we remove it from the resulting population.
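A sketch of this mutation operator, reusing the solution encoding of the initialization sketch above (all names illustrative), could be:

import numpy as np

def mutate(solution, tsp_grid_points, p_m=0.8, p_md=0.8, d_max=2, lam_m=0.8, rng=None):
    """Introduce or widen jump sorties for randomly chosen deliveries.

    solution: dict customer -> (start grid point, landing grid point);
    tsp_grid_points: grid points in the order of the truck's TSP route."""
    rng = rng or np.random.default_rng()
    pos = {g: i for i, g in enumerate(tsp_grid_points)}       # position in the TSP route
    mutated = dict(solution)
    for c, (start, land) in solution.items():
        if rng.random() >= p_m:
            continue
        shift = min(1 + rng.poisson(lam_m), d_max)
        if rng.random() < p_md:
            # Move the landing point later in the route (clamped at the last position).
            new = min(pos[land] + shift, len(tsp_grid_points) - 1)
            mutated[c] = (start, tsp_grid_points[new])
        else:
            # Move the starting point earlier in the route (clamped at the first position).
            new = max(pos[start] - shift, 0)
            mutated[c] = (tsp_grid_points[new], land)
    return mutated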
In order to calculate the fitness of any solution, we first need to schedule the
ATVs’ and drones’ sorties at each grid point. Since the schedule needs to be
determined for each individual in the entire population, the calculation needs to
be quick. Hence, we base it on the approximation algorithm presented in [4], where a so-called multi-fit descent heuristic approximates the nonpreemptive scheduling of independent tasks on a set of identical machines.
In the TDA-RP, each machine corresponds to a drone or ATV, and each task
would be a delivery from the considered grid point. Due to the fact that the
scheduling of drones and ATVs is not independent, and furthermore, jump sorties
are possible, we need to modify the approach presented in [4] as follows: First, we
approximate the solution only for the direct sorties, and independently for ATVs and drones. In order to use as many ATVs/drones as possible for parallel direct sorties, we schedule any jump sorties after all direct sorties have finished. In addition, as drones
are faster than ATVs, we prefer to use drones for deliveries of small parcels. In
fact, we only consider ATVs for deliveries of small parcels, if all large parcels’
deliveries have been scheduled, and if the use of an ATV to deliver a small parcel
might reduce the overall time that a truck spends at the grid point. Based on
the operations sequence we get from the approximation, we calculate the drones’
and ATVs’ landing times and the total duration a truck needs to spend at the
considered grid point. The operations at the next grid point on the truck’s route
start once the truck arrives at the next grid point, i.e., the time required to
finish all operations at the current grid point plus the time that the truck needs
to drive from the current grid point to the next one. Through the scheduling
procedure, it might happen that some grid points are not used as starting or
landing point. In these cases, the corresponding grid points are omitted by the
truck with the objective of reducing the mission time. The maximum arrival time
of the truck, ATVs, or drones at the depot is the fitness value of the solution
and corresponds to the objective function value of the TDA-RP.
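The per-grid-point time estimate can be illustrated with a simpler stand-in for the multi-fit descent heuristic of [4]; the following sketch uses a longest-processing-time rule and illustrative names:

import heapq

def time_at_grid_point(sortie_durations, num_vehicles):
    """Estimate how long the truck waits at a grid point: assign direct
    sorties to identical vehicles with a longest-processing-time rule and
    return the resulting makespan (a simplified stand-in for the multi-fit
    descent heuristic used in the paper)."""
    if not sortie_durations:
        return 0.0
    loads = [0.0] * max(1, num_vehicles)
    heapq.heapify(loads)                                  # min-heap of vehicle finish times
    for d in sorted(sortie_durations, reverse=True):      # longest sorties first
        heapq.heappush(loads, heapq.heappop(loads) + d)   # give the sortie to the least-loaded vehicle
    return max(loads)

For example, four direct sorties of durations 0.4, 0.3, 0.3 and 0.2 time units shared by two vehicles yield an estimated stay of 0.6 time units.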
We repeat the recombination and the mutation until we can find no further
improvement for a preset number of consecutive iterations. We restart the GA a given number of times and report, as the result of the algorithm, the best solution found over all independent runs for the same route of grid points (which is kept fixed for now).

4 Computational Experiments

In order to evaluate the performance of the algorithm in solving the TDA-RP,


and to explore the properties of this problem, we carried out a set of computa-
tional experiments and report the results in this section.
4.1 Test Setting

We generated test instances that differ in the number of customers (25, 50, or 75)
and grid points (10 or 20). For generating an instance, we randomly place grid
points in a 20 by 20 km area with a depot at the center. We scatter customers at
random spots within the mentioned area, and ensure that each customer could be
reached from at least one grid point while respecting the endurance restrictions
of the ATVs and drones. Furthermore, the demanded parcel size is determined
randomly, i.e., each customer has a 50% probability of receiving either a large
or a small parcel. For each combination of the number of customers and grid
points, we generate 5 different instances. Considering different combinations of
the number of drones (0, 1, or 2) and ATVs (2 or 4), we have a total number
of 180 test instances [12] on which we apply the realistic technical specifications
presented in Table 1, stating that drones are 7.6 times faster than ATVs.

Table 1. Technical specifications used on the generated problem instances.

Truck ATVs Drones


Average speed 25 km/h 5 km/h 38 km/h
Energy capacity — 92 Wh 400 Wh
Endurance — 120 min 15 min

First, we carried out preliminary experiments in order to find parameter


settings for the genetic algorithm. Having this objective, we chose the parameter
combination that on average yields the best solutions while keeping the time to
solve the problem instances reasonably short. More precisely, we use a population
size of P = 100 and initialize each of the 10 independent runs of the GA for each
provided TSP tour by choosing the closest grid point with pinit = 0.8. In the
other cases, we use a Poisson distribution with parameter λinit = 0.8 to choose
from the other grid points. Furthermore, we use a mutation rate of pm = 0.8 for
the mutation operator, a maximum distance for jump sorties of dmax = 2 grid
points, and a Poisson distribution with parameter λm = 0.8 to choose between
grid points. Finally, the algorithm favors a change of the landing grid point for
jump sorties with pmd = 0.8. We terminate each run of the algorithm if the
solution has not improved for the last 10 consecutive iterations.
We implemented our algorithm in Python 3.6 and did all experiments on an
Intel Core i5-4210M Processor (2.6 GHz) with 8 GB of RAM under Windows 10
operating system.
4.2 Numerical Results

The results of our computational experiments are presented in Table 2, where the first four columns show the instance data. Depending on the number of customers (#Customers), the number of grid points (#Grid points), and the number of available ATVs (#ATVs) and drones (#Drones), the table reports the average results over the 5 instances of each combination: the average mission time (in time units, typically hours), the average computation time (in seconds), which accumulates the run times of the 10 independent GA runs, the percentage of small parcels delivered by ATVs, and the ratio of jump sorties over all sorties, in percentage.

Fig. 2. A sample result of the GA in the absence of drones.
Fig. 3. A sample result of the GA in the presence of a single drone.

Comments on the results. According to the results shown in Table 2, we can make the following observations:

– Taking into account the size of the test instances, which fits typical real-world applications, it is interesting to note that the introduced GA is able to provide routing plans for all instances in a short computation time. Two sample results for a problem instance with 25 customers and 20 grid points are depicted in Figs. 2 and 3.
– An additional number of customers requires a higher mission time, which is confirmed by the numerical results. Similarly, an additional number of vehicles increases the computation time of the GA.
– Using a mixed fleet of drones and ATVs considerably reduces the mission time and the share of small parcels delivered by ATVs. Due to the higher speed of drones, these effects are not surprising. Finally, a larger number of vehicles (ATVs and drones) increases the share of jump sorties.
Table 2. Average mission as well as solution time, ratio of ATV deliveries of small
parcels, and ratio of jump sorties, provided by the GA, for the test instances.

#Customers  #Grid points  #ATVs  #Drones  Avg. mission time [time units]  Avg. CPU time [sec.]  ATV deliveries [%]  Jump sorties [%]
25 10 2 0 15,61 22,92 100,00 24,00
2 1 10,03 26,65 10,18 26,40
2 2 9,51 25,79 1,67 36,80
4 0 10,90 60,51 100,00 48,80
4 1 7,94 58,11 17,56 47,20
4 2 7,56 50,04 7,62 51,20
25 20 2 0 15,36 27,93 100,00 30,40
2 1 9,42 32,66 3,10 32,80
2 2 9,04 44,90 1,67 46,40
4 0 11,33 72,81 100,00 60,00
4 1 8,49 50,45 10,95 44,80
4 2 7,75 69,46 4,29 55,20
50 10 2 0 27,96 21,48 100,00 14,00
2 1 17,50 29,38 21,89 23,20
2 2 16,81 27,04 3,97 24,80
4 0 17,57 50,79 100,00 34,00
4 1 13,07 48,65 34,63 31,20
4 2 11,84 56,59 11,23 36,40
50 20 2 0 28,04 29,06 100,00 24,00
2 1 17,16 37,53 10,82 24,00
2 2 16,32 40,80 1,49 32,00
4 0 19,06 85,18 100,00 41,60
4 1 14,05 83,21 35,30 44,80
4 2 12,29 129,49 7,67 48,40
75 10 2 0 40,39 53,93 100,00 9,60
2 1 21,64 31,03 21,85 16,53
2 2 20,72 29,13 9,70 18,40
4 0 23,68 33,37 100,00 20,27
4 1 14,97 41,74 32,25 25,87
4 2 14,04 51,25 12,01 29,33
75 20 2 0 38,99 39,29 100,00 15,47
2 1 24,85 34,48 28,59 23,73
2 2 23,39 42,19 5,92 26,67
4 0 25,59 65,28 100,00 31,20
4 1 18,66 99,52 35,60 34,93
4 2 17,13 111,95 8,48 39,73

5 Conclusion

In this paper, we introduced a new concept in last-mile logistics, in which a truck is assisted by a mixed fleet of drones and ATVs. From different practical points of view, this concept is meaningful and worth being elaborated. Furthermore, we designed a Genetic Algorithm and solved the introduced problem for small- and medium-sized instances. Our numerical results highlight the advantages of using a combination of drones and ATVs for delivering parcels.
Exploring the mathematical properties of the problem, proposing mathematical programming formulations for it, and designing efficient exact as well as alternative heuristic solution methods are promising future research directions.

Acknowledgment. The authors would like to acknowledge the Technische Univer-


sität Kaiserslautern (Germany) for the financial support through the research program
“Forschungsförderung des TU Nachwuchsringes”.

References
1. Bouman, P., Agatz, N., Schmidt, M.: Dynamic programming approaches for the
traveling salesman problem with drone. Networks 72(4), 528–542 (2018)
2. Boysen, N., Schwerdfeger, S., Weidinger, F.: Scheduling last-mile deliveries with
truck-based autonomous robots. Eur. J. Oper. Res. 271(3), 1085–1099 (2018)
3. Carlsson, J.G., Song, S.: Coordinated logistics with a truck and a drone. Manag.
Sci. 64(9), 4052–4069 (2018)
4. Coffman, E., Garey, M., Johnson, D.: An application of bin-packing to multipro-
cessor scheduling. SIAM J. Comput. 7(1), 1–17 (1978)
5. Deutsche Post DHL Group: DHL Parcelcopter. Press Kit. https://www.dpdhl.
com/en/media-relations/specials/dhl-parcelcopter.html (2019)
6. Reeves, C.R.: Genetic algorithms. In: Gendreau, M., Potvin, J.-Y. (eds.) Handbook
of Metaheuristics, pp. 109–139. Springer, New York (2010). https://doi.org/10.
1007/978-1-4419-1665-5 5
7. Ha, Q.M., Deville, Y., Pham, Q.D., Hà, M.H.: On the min-cost traveling salesman
problem with drone. Transp. Res. Part C 86, 597–621 (2018)
8. Lin, S., Kernighan, B.W.: An effective heuristic algorithm for the traveling-
salesman problem. Oper. Res. 21(2), 498–516 (1973)
9. Murray, C.C., Chu, A.G.: The flying sidekick traveling salesman problem: opti-
mization of drone-assisted parcel delivery. Transp. Res. Part C 54, 86–109 (2015)
10. Otto, A., Agatz, N., Campbell, J., Golden, B., Pesch, E.: Optimization approaches
for civil applications of unmanned aerial vehicles (UAVs) or aerial drones: a survey.
Networks 72(4), 411–458 (2018)
11. Poikonen, S., Wang, X., Golden, B.: The vehicle routing problem with drones:
extended models and connections. Networks 70(1), 34–43 (2017)
12. Salewski, H., Moeini, M.: Instances for the truck-drone-ATV routing problem.
https://doi.org/10.5281/zenodo.2600809
13. Schermer, D., Moeini, M., Wendt, O.: Algorithms for solving the vehicle routing
problem with drones. Lect. Notes Artif. Intell. 10751, 352–361 (2018)
14. Schermer, D., Moeini, M., Wendt, O.: A Variable Neighborhood Search Algorithm
for Solving the Vehicle Routing Problem with Drones, pp. 1–33. Technical Report,
Technische Universität Kaiserslautern (2018)
15. Wang, X., Poikonen, S., Golden, B.: The vehicle routing problem with drones:
several worst-case results. Optim. Lett. 11(4), 679–697 (2016)
A Planning Problem with Resource
Constraints in Health Simulation Center

Simon Caillard1,2(B) , Laure Brisoux Devendeville1 , and Corinne Lucet1


1 Laboratoire MIS (EA 4290), Université de Picardie Jules Verne,
33 Rue Saint-Leu, 80039 Amiens Cedex 1, France
{laure.devendeville,corinne.lucet,simon.caillard}@u-picardie.fr
2 Health Simulation Center SimUSanté, Amiens University Hospital,
Amiens, France
simon.caillard@chu-amiens.fr

Abstract. The Health Simulation Center SimUSanté performs training sessions for several different healthcare actors. This paper presents in detail the time and resource planning problem encountered by SimUSanté and offers a greedy algorithm, SimU G, to solve it. New instances derived from the Curriculum-Based Courses Timetabling Problem (CB-CTT) are generated in order to test SimU G on representative instances. The computational results, which are described below, show that SimU G reaches optimality for a few instances and provides solutions with a gap below 6% for the others.

Keywords: Scheduling · Healthcare training · Timetabling ·


Operational research · Optimization

1 Introduction

In recent years, research into the development of scheduling solutions for the healthcare sector has become increasingly important. This research addresses problems related to patient service quality, optimisation of resource management and, as in our case, training of health professionals. Indeed, the training center SimUSanté, located in Amiens, France, is one of the biggest multidisciplinary active pedagogy centers in Europe. This center is used by all kinds of health actors: professionals, students, patients and carers. Its aim is to provide a space where all of these actors can meet and train together by simulating medical acts in various fields of healthcare (such as surgical operations, blood sampling, cardiopulmonary resuscitation, etc.), but also by attending regular courses. Thus, different variables such as the number of activities, resources and employees, their skills, and the number of operating rules can become a problem when constructing a coherent and effective schedule.

This project is supported by region Hauts-de-France and Health Simulation Center


SimUSanté.

A wide variety of articles related to educational scheduling problems have


been published, ranging from Examination Timetabling Problems to Univer-
sity Courses Timetabling problems [9]. All these problems are NP-Complete [6]
and the problem SimUSanté is faced with belonging to this family of prob-
lems. However, in this case, it is more specifically related to Curriculum-Based
Courses Timetabling Problem (CB-CTT) [7]. CB-CTT consists of finding the
best weekly assignment for university lectures, available rooms and time periods
for a set of classes. A feasible assignment satisfies a set of hard constraints and
the objective function takes into account penalties related to a set of violated soft
constraints. However, the case of SimUSanté differs from this type of problem: in SimUSanté, there are no periodic schedules and no soft constraints, but we need to consider lectures with room and time constraints. Moreover, we also need to consider different types of resources (rooms have specific characteristics such as a surgical block, a pharmacy, etc.) and the different types of skills needed to perform lectures.
Another way of approaching the problem would be to consider the CB-CTT as a
variant of the Resource-Constrained Project Scheduling Problem (RCPSP) [4].
Lectures would correspond to activities, employees (teachers, laboratory prepar-
ers, etc.) and rooms with their characteristics would be considered as resources.
RCPSP formalization and its related problems allow us to take into account the specificities of our case. For example, the Multi-Skills Project Scheduling Problem [2] introduces the notion of skills for employees, and the Resources Constrained Multi Projects Scheduling Problem [3] the notion of multiple projects to plan.
The purpose of our collaboration with the SimUSanté center is to study
possible solutions to management and planning problems they encounter in all
their activities. First, we used a mathematical model related to the RCPSP
model presented in [8] and we offered a greedy algorithm SimU G. Since no
real benchmark exists to test our algorithm SimU G, we have generated ade-
quate instances [5] inspired by those used in the Curriculum-Based Courses
Timetabling problem [1]. We compared SimU G with the results worked out by
the mathematical model implemented in CPLEX [8].
The paper is organized as follows. In Sect. 2, we describe and formalize the scheduling problem encountered by SimUSanté. In Sect. 3, we present our greedy algorithm SimU G and give the different selection criteria of the construction process. Section 4 introduces the generated instances and provides computational results. Section 5 concludes the paper with some final remarks and perspectives.

2 SimUSanté: A Planning Problem with Resource


Constraints
In such a large multidisciplinary active pedagogy center as SimUSanté, there are many activities, grouped into sessions, with precedence constraints, specific skill requirements and specific room equipment. Moreover, because SimUSanté is a simulation center, the room facilities are both varied and flexible. Planning all activities while respecting the various constraints is an issue of primary importance for the proper functioning and success of such
a center. We detail below the different elements of this planning problem, taken
into consideration in our study.

Horizon: The horizon H used is one week, decomposed into working days. Let D be the set of these working days and, ∀d ∈ D, let T_d denote the set of time slots of day d, with T = ∪_{d∈D} T_d. Each time slot represents 1 h. Let break_d be a subset of slots identified as potential break times for day d. At least one of these time slots should remain idle to ensure the existence of a daily lunch break for any session.
Resources: We have a finite set of resources R = R^r ∪ R^e, with R^r the set of rooms and R^e the set of employees. To R^e is associated a set of types Λ^e = {λ_1, ..., λ_{|Λ^e|}}, which corresponds, for example, to the skills of employees. To R^r is also associated a set of types Λ^r = {λ_{|Λ^e|+1}, ..., λ_{|Λ^e|+|Λ^r|}}, which corresponds, for example, to specific room equipment. We denote Λ = Λ^r ∪ Λ^e. Each resource can have more than one associated type. For example, a room may be equipped with artificial arms for the simulation of blood sampling, but also with artificial vertebral columns for the simulation of lumbar punctures. We denote by qtav_{λ_i}^{t} the quantity of resource type λ_i available at time slot t. All activities scheduled at time slot t cannot use more than the available resources. We also take into account the availabilities of employees.
Activities: Let A be the set of activities. Each activity a ∈ A is characterized by a duration duration_a, an earliest starting date ES_a and a latest starting date LS_a. qtreq_{λ_i}^{a} is the quantity of resource of type λ_i, ∀i = 1, ..., |Λ|, required by activity a, and Λ_a = {λ_i ∈ Λ : qtreq_{λ_i}^{a} ≠ 0} is the set of resource types required by a. A precedence relation is defined between the activities, and we denote by pred_a the set of activities that must be planned before activity a.
Training session: Let S be the set of training sessions to be scheduled over horizon H. Each training session s ∈ S is composed of a set of activities A_s, and Λ_s = ∪_{a∈A_s} Λ_a gives the resource types required by training session s. The operating rules for the activities of a given session s are that no two activities of A_s may be planned at the same time, and that activities are not preemptive.

Constructing a solution consists in assigning a start date t_a (a time slot) and a set of resources R_a to each activity a, such that all the resource constraints (number and type), the precedence constraints and the operating rules are respected. A solution is represented by a set Sol of triplets (a, t_a, R_a), with a ∈ A, ES_a ≤ t_a ≤ LS_a and R_a ⊆ R^r ∪ R^e. R_a is a set of available resources assigned to a, which exactly matches the resources required to execute a. Let Sol+ ⊆ A, Sol+ = {a : (a, t_a, R_a) ∈ Sol}, be the set of activities that have been effectively planned. Sol− = A \ Sol+ represents the set of unscheduled activities. The start date t_a respects all the time constraints relating to a. This includes the following constraints:

– precedences: ∀a ∈ Sol+, ∀a' ∈ pred_a, t_{a'} + duration_{a'} ≤ t_a
– session activities: ∀s ∈ S, ∀a ∈ A_s ∩ Sol+, ∀a' ∈ (A_s ∩ Sol+) \ {a},
  {t_a, . . . , t_a + duration_a} ∩ {t_{a'}, . . . , t_{a'} + duration_{a'}} = ∅

– lunch breaks: ∀a ∈ Sol+ , if ta ∈ Td ,

{ta + 1, . . . , ta + durationa − 1} ∩ breakd = ∅.

For a given session s ∈ S, if at least one activity a ∈ A_s has been scheduled (A_s ∩ Sol+ ≠ ∅), then the start date tstart_s and the end date tend_s are computed by Eq. (1), and the corresponding makespan makespan_s is defined by Eq. (2). If no activity has been scheduled, makespan_s = 0.

tstart_s = min{t_a : a ∈ A_s ∩ Sol+},   tend_s = max{t_a + duration_a : a ∈ A_s ∩ Sol+}   (1)

makespan_s = tend_s − tstart_s   (2)


The evaluation associated with each solution Sol, denoted M akespan(Sol), is
the sum of the makespans of all sessions, plus the amount of unplanned activities
for that session, multiplied by penalty α (see Eq. 3). The objective is to find a
valid solution with a minimum M akespan.

Makespan(Sol) = Σ_{s∈S} makespan_s + |Sol−| × α   (3)
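A direct transcription of this evaluation, with hypothetical dictionary-based data structures, could read:

def evaluate(sol, sessions, durations, all_activities, alpha):
    """Compute Makespan(Sol) as in Eq. (3).

    sol: dict activity -> start time slot (scheduled activities only);
    sessions: dict session -> list of its activities;
    durations: dict activity -> duration; alpha: penalty per unplanned activity.
    """
    total = 0
    for acts in sessions.values():
        scheduled = [a for a in acts if a in sol]
        if scheduled:                                   # makespan_s = tend_s - tstart_s
            tstart = min(sol[a] for a in scheduled)
            tend = max(sol[a] + durations[a] for a in scheduled)
            total += tend - tstart
    unscheduled = sum(1 for a in all_activities if a not in sol)
    return total + unscheduled * alpha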

3 SimUG: A Greedy Algorithm

The main idea of the greedy algorithm SimU G (described in Algorithm 1) is to plan the sessions in S one by one, scheduling their activities one by one, while minimizing the Makespan.
The greedy algorithm selects, through function sessionChoice(), the training session s∗ to schedule and removes it from S. Then, function findBetterStart() returns the best time slot t∗ at which to start s∗, that is, the time slot from which the predicted makespan will be as compact as possible. Once t∗ is chosen, ∀t ∈ T, t∗ ≤ t, the list of eligible activities of s∗ that can start at t is computed by function eligibleActivities(). An activity is eligible if all its demands can be satisfied over its duration.
Among the set of eligible activities, function activityChoice() chooses which activity a∗, with its pre-assigned resources R_{a∗}, will be scheduled. Its start date t_{a∗} is set to t and the resources R_{a∗} are updated by function updateAvailability(). Breaks are taken into account in function eligibleActivities() and through the incrementation of t. This part is not detailed, to facilitate the understanding of the algorithm. Note that the returned solution Sol could be incomplete, since some activities could remain unscheduled.
Algorithm 1 SimU G
Require: S (set of unscheduled training sessions), T (set of time slots)
Ensure: Sol (a feasible solution), Sol− (set of unscheduled activities)
 1: Sol ← ∅
 2: Sol− ← ∅
 3: while S ≠ ∅ do
 4:     s∗ ← sessionChoice(S)
 5:     S ← S \ {s∗}
 6:     t∗ ← findBetterStart(s∗, T)
 7:     t ← t∗
 8:     UAs ← A_{s∗}
 9:     while (t ≤ |T|) ∧ (UAs ≠ ∅) do
10:         EA ← eligibleActivities(A_{s∗}, t)
11:         if EA ≠ ∅ then
12:             (a∗, R_{a∗}) ← activityChoice(EA, t)
13:             Sol ← Sol ∪ {(a∗, t, R_{a∗})}
14:             updateAvailability(a∗, t, R_{a∗})
15:             UAs ← UAs \ {a∗}
16:             t ← t + duration_{a∗}
17:         else
18:             t ← t + 1
19:         end if
20:     end while
21:     Sol− ← Sol− ∪ UAs
22: end while
23: return (Sol, Sol−)

3.1 Function sessionChoice()

The aim of sessionChoice() is to choose, among the unscheduled sessions, the next one to plan. It selects the training session with the longest total duration, because the longer a training session is, the higher the probability of having idle time slots. The result of sessionChoice() is s∗, computed by Eq. (4). If several training sessions have the same duration, the first one is chosen.

s∗ = argmax_{s∈S} Σ_{a∈A_s} duration_a   (4)

3.2 Function f indBetterStart()

Function findBetterStart() computes the best start date for session s∗. In order
to find this date, we need to compute for s∗ and for any time slot t, the earliest
end date endt of s∗ . To compute it, we relax all resource constraints.
For a given start date t and for s∗ , Eq. 5 computes endt , where break() is
a function that gives the number of time slots for lunch breaks required by the
operating rules.
 
end_t = t + Σ_{a∈A_{s∗}} duration_a + break(t, Σ_{a∈A_{s∗}} duration_a)   (5)

Function findBetterStart() scores each time slot by computing the resource deficiencies that could appear if s∗ starts at t. Because no activity order can be taken into account at this step of the process, we overestimate the resource requirements on each time slot over [t, end_t]. Overestimating the resource requirements consists in considering that session s∗ requires, during its whole progress, the maximum resource quantity over A_{s∗}, for all resource types λ_i ∈ Λ_{s∗}.

For each λ_i ∈ Λ_{s∗}, let qts∗_{λ_i} = max_{a∈A_{s∗}} (qtreq_{λ_i}^{a}) be the maximal quantity of resource type λ_i required by s∗ over all its activities. The deficiency for a resource type λ_i ∈ Λ at a time slot is the difference between the maximum demand qts∗_{λ_i} and the availability qtav_{λ_i}^{t'}. The resource deficiency associated with t, denoted D_t and detailed in Eq. (6), is the sum of all deficiencies for each resource type involved in Λ_{s∗} over [t, end_t].
D_t = Σ_{t'=t}^{end_t} Σ_{λ_i∈Λ_{s∗}} max(qts∗_{λ_i} − qtav_{λ_i}^{t'}, 0)   (6)

If two time slots have the same score, we choose the time slot t that maximizes the sum of available resources over [t, end_t]. The more resources remain available, the more opportunities there are to plan the remaining activities. The available-resources score of time slot t is given by Eq. (7).
avail_t = Σ_{t'=t}^{end_t} Σ_{λ_i∈Λ_{s∗}} qtav_{λ_i}^{t'}   (7)

The best starting time slot t∗ (see Eq. (8)) for training session s∗ is then the
time slot with the smallest resource deficiencies Dt , and as second criterion, the
biggest resource availability availt .

t∗ = [ argmin_{t∈T} {D_t} ; argmax_{t∈T} {avail_t} ]   (8)
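Under the assumption of dictionary-style resource tables (all names hypothetical), the scoring of candidate start slots by Eqs. (6)–(8) could be sketched as:

def find_better_start(activities, slots, qtav, qtreq, end_of, res_types):
    """Sketch of findBetterStart(): pick the start slot with the smallest
    deficiency D_t (Eq. (6)), breaking ties by the largest avail_t (Eq. (7)).

    activities: activities of session s*; qtav[t][lam] / qtreq[a][lam]:
    available / required quantities; end_of(t): relaxed end date end_t;
    res_types: resource types used by the session.
    """
    def score(t):
        end_t = end_of(t)
        deficiency = availability = 0
        for lam in res_types:
            demand = max(qtreq[a].get(lam, 0) for a in activities)   # qts*_lam
            for tp in range(t, end_t + 1):
                deficiency += max(demand - qtav[tp][lam], 0)
                availability += qtav[tp][lam]
        return (deficiency, -availability)   # lexicographic: D_t first, then -avail_t
    return min(slots, key=score)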

3.3 Function: eligibleActivities()

For a given set A_{s∗} of unscheduled activities of session s∗ and a time slot t, eligibleActivities() computes the set EA of couples (a, R_a), where a is an activity that could start at time slot t, with a pre-assigned set of resources R_a. Activity a can start at t if there are enough resources continuously available over the period [t, t + duration_a] and ES_a ≤ t ≤ LS_a. Moreover, ∀a' ∈ pred_a, a' must already be scheduled and t_{a'} + duration_{a'} < t. We note that an activity can require several different types of resources and that a resource can be of several types.
3.4 Function: activityChoice()

From time slot t and the list EA of couples (a, R_a), activityChoice() selects, according to three criteria, the best activity a∗ to schedule at t.

Remaining resources: The first criterion, remain_a^t, computed by Eq. (9), is a score that characterizes the most critical resource type for activity a and time slot t.
remain_a^t = min_{λ_i∈Λ_a} ( Σ_{t'=t}^{end_t} qtav_{λ_i}^{t'} − qtreq_{λ_i}^{a} · duration_a )   (9)

Because a is eligible at t, all its required resources are available on [t, t + duration_a], i.e., ∀λ_i ∈ Λ_a, ∀t' ∈ [t, t + duration_a], qtav_{λ_i}^{t'} ≥ qtreq_{λ_i}^{a}. The criterion remain_a^t estimates the urgency of planning a at t. Let us note that if remain_a^t = 0, there is no other possibility than t to plan a within [t, end_t] without increasing makespan_{s∗}.
Activity duration: The second criterion is the duration of the activities.
Employee timetable compactness: This last criterion is used to limit idle time slots in employee timetables. An idle time slot for employee e is a time slot between two working periods in the same day, outside lunch breaks. The criterion compact_a^t estimates the impact of scheduling a at t, in terms of idle time slots, for all employees e ∈ R_a^e = R^e ∩ R_a. Let Idle^e be the current number of idle time slots of employee e ∈ R_a^e, and Idle_a^e the predicted number of idle time slots for e if a is scheduled at t. The criterion compact_a^t is then defined by Eq. (10). The greater compact_a^t is, the more compact the employees' timetables are.

compact_a^t = ( Σ_{e∈R_a^e} Idle^e ) / ( 1 + Σ_{e∈R_a^e} Idle_a^e )   (10)

The best activity a∗ to schedule at t is selected according to these criteria, as stated in Eq. (11).

a∗ = [ argmin_{a|(a,R_a)∈EA} {remain_a^t} ; argmax_{a|(a,R_a)∈EA} {duration_a} ; argmax_{a|(a,R_a)∈EA} {compact_a^t} ]   (11)
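The three criteria combine lexicographically; a compact sketch (with hypothetical precomputed dictionaries) is:

def activity_choice(eligible, remain, duration, compact):
    """Sketch of activityChoice() following Eq. (11): smallest remain_a^t,
    then longest duration, then largest compact_a^t.  `eligible` is the list
    of (activity, resources) couples; the other arguments are dictionaries
    holding the precomputed criteria."""
    return min(eligible, key=lambda ar: (remain[ar[0]], -duration[ar[0]], -compact[ar[0]]))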

3.5 Function: updateAvailability

When activity a∗ is scheduled on t, all resources in Ra∗ are set unavailable over
the period [t, t + durationa∗ ]. Let us note that precedence constraints are also
updated for all activities linked to a∗ .
4 Instances and Computational Results


4.1 Description of Instances
In order to test the SimU G algorithm, we needed instances close to the SimU-
Santé problem. However, the current operation of the training center does not
provide real instances. For this reason, we have generated new instances, from
the classic instances of the CB-CTT problem [1].
CB-CTT Instances. These instances are structured as follows:

– C is a set of classes, and each class c ∈ C follows a set of events Ω_c, with ∪_{c∈C} Ω_c = Ω.
– Each event ω ∈ Ω has a duration duration_ω and a pre-assigned teacher.
– Each event ω belongs to a course K.
– No precedence constraints link events.
– R^r is a finite set of rooms. There is only one type of room.
– R^e is a finite set of teachers. Each teacher is pre-assigned to one event.
From a given CB-CTT instance CB, we firstly generate an initial instance
D0 T0 C0 and then generate from it, different instances with different resource
characteristics (multi type rooms, multi skill and precedence constraints).
Generated Instances. The initial D0 T0 C0 instance has all the characteristics of the SimUSanté problem described in Sect. 2 and is constructed as follows:
– The horizon H is composed of six days, with nine time slots per day.
– S is the set of sessions, with S = C. Each class c corresponds to a session s.
– ∀s ∈ S, A_s = Ω_c and ∪_{s∈S} A_s = A. Each event ω corresponds to an activity a.
– ∀ω ∈ Ω, we define a type of employee λ_ω, such that if a corresponds to ω, then duration_a = duration_ω and qtreq_{λ_ω}^{a} = 1.
– ∀(a, a') ∈ A² such that the corresponding events (ω, ω') ∈ Ω² belong to the same course K, we generate a precedence relation between a and a'.
– The set of rooms is the set R^r of the CB-CTT instance. There is only one type of room, λ_α.
– The set of employees is the set R^e of teachers of the CB-CTT instance. If event ω was pre-assigned to employee e, then e is associated with λ_ω. All employees are available over H.
From the D0 T0 C0 instance, a set of SimUSanté instances is generated by varying the following criteria: availability of employees (D1), types of rooms (T1), and types of employees (C1).
criterion D1: replaces full availability by availability patterns randomly chosen for 20% of randomly selected employees.
criterion T1: adds a second type of room, λ_β. Then, 20% of the rooms are randomly selected and their type is set to λ_β. Finally, 20% of the activities are randomly selected and their required type is set to λ_β.
criterion C1: adds types to 20% of randomly selected employees.
These criteria are combined to provide different instances. As an illustration, D0 T0 C0 + D1 provides a new instance D1 T0 C0, and D1 T0 C0 + T1 gives D1 T1 C0, etc. Because the criteria involve random choices, it is possible to generate several different instances from a single variant criterion. Thus, from D0 T0 C0 + D1, three instances D1 T0 C0-Gen1, D1 T0 C0-Gen2 and D1 T0 C0-Gen3 are generated, and similarly for the other generations. These new instances are available in [5].
Table 1. Results for Brazil1 and Italy1

Instance Brazil1                                 Instance Italy1
Instance name    CPLEX  SimUG  Gap               Instance name    CPLEX  SimUG  Gap
D0 T0 C0 84 84 0.00% D0 T0 C0 113 114 0.88%
D1 T0 C0 - Gen1 84 89 5.95% D1 T0 C0 - Gen1 113 116 2.65%
D1 T0 C0 - Gen2 84 89 5.95% D1 T0 C0 - Gen2 113 116 2.65%
D1 T0 C0 - Gen3 84 89 5.95% D1 T0 C0 - Gen3 113 116 2.65%
D0 T1 C0 - Gen1 84 87 3.57% D0 T1 C0 - Gen1 113 117 3.54%
D0 T1 C0 - Gen2 84 87 3.57% D0 T1 C0 - Gen2 113 117 3.54%
D0 T1 C0 - Gen3 84 87 3.57% D0 T1 C0 - Gen3 113 117 3.54%
D0 T0 C1 - Gen1 84 84 0.00% D0 T0 C1 - Gen1 113 114 0.88%
D0 T0 C1 - Gen2 84 84 0.00% D0 T0 C1 - Gen2 113 114 0.88%
D0 T0 C1 - Gen3 84 84 0.00% D0 T0 C1 - Gen3 113 114 0.88%
D1 T1 C0 - Gen1 87 91 4.60% D1 T1 C0 - Gen1 116 120 3.45%
D1 T1 C0 - Gen2 87 91 4.60% D1 T1 C0 - Gen2 116 120 3.45%
D1 T1 C0 - Gen3 87 91 4.60% D1 T1 C0 - Gen3 116 120 3.45%
D1 T0 C1 - Gen1 84 87 3.57% D1 T0 C1 - Gen1 113 116 2.65%
D1 T0 C1 - Gen2 84 87 3.57% D1 T0 C1 - Gen2 113 116 2.65%
D1 T0 C1 - Gen3 84 87 3.57% D1 T0 C1 - Gen3 113 116 2.65%
D0 T1 C1 - Gen1 84 87 3.57% D0 T1 C1 - Gen1 113 116 2.65%
D0 T1 C1 - Gen2 84 87 3.57% D0 T1 C1 - Gen2 113 116 2.65%
D0 T1 C1 - Gen3 84 87 3.57% D0 T1 C1 - Gen3 113 116 2.65%
D1 T1 C1 - Gen1 84 87 3.57% D1 T1 C1 - Gen1 113 117 3.54%
D1 T1 C1 - Gen2 84 87 3.57% D1 T1 C1 - Gen2 113 117 3.54%
D1 T1 C1 - Gen3 84 87 3.57% D1 T1 C1 - Gen3 113 117 3.54%

4.2 Computational Results


To test SimUG we use two CB-CTT instances, Brazil1 and Italy1, mod-
ified as explained in Sect. 4.1. Their characteristics are:
Brazil1: 3 training sessions, 8 employees, 3 rooms, 27 activities and a total
duration of 75 time slots;
Italy1: 3 training sessions, 13 employees, 3 rooms, 39 activities and a total
duration of 92 time slots.
We have implemented a mathematical model under CPLEX. We use CPLEX's
internal measurement, called tick, which measures the effective computation time regard-
less of the processor workload. Nevertheless, finding an optimal solution for the
tested instances takes more than two hours.
SimUG was implemented in Java and run on an Intel i7-7500U processor. The time
needed to find a solution is always less than one second.

Table 1 presents the comparison between CPLEX and SimUG (with penalty
α set to |T |). For each instance, column CPLEX gives the optimal makespan Mcplex ,
column SimUG gives the makespan MSimUG computed by our algorithm, and the
last column Gap is computed as (MSimUG − Mcplex )/Mcplex .
SimUG achieved the optimal solution for the D0 T0 C0 and D0 T0 C1 Brazil1 family
instances. Moreover, the gap is always less than 6% in the other cases. We can
observe that the optimal value of the generated instances is always the same, except for
the D1 T1 C0 family. Indeed, for these instances, the availabilities of employees and rooms
are reduced without being compensated by an increased number of types associated
with employees (see the D1 T1 C1 family).

5 Conclusion and Perspectives

In this paper we have presented a first study of a planning problem with resource
constraints for the health training center SimUSanté. We proposed a greedy
algorithm, SimUG, based on a set of choice criteria aimed at reducing the over-
all makespan of training sessions while respecting all resource and time con-
straints. We experimented with SimUG on new instances, generated from those of
CB-CTT and integrating the SimUSanté problem characteristics. The results
obtained were compared to the optimal solutions provided by the CPLEX solver.
Optimality was reached for only a few instances, but for the others the gap remained
below 6%. SimUG produces a suitable initial solution that we plan to use in a
genetic algorithm, which is the focus of our current research.

References
1. High School Timetabling Project. https://www.utwente.nl/en/eemcs/dmmp/hstt/
2. Bellenguez-Morineau, O., Neron, E.: A branch and bound method for solving multi-
skill project scheduling. RAIRO Op. 41, 155–170 (2007)
3. Browning, T.R., Yassine, A.A.: Resource-constrained multi-project scheduling: pri-
ority rules. Int. J. Prod. Econ. 126, 212–228 (2010)
4. Brucker, P., Knust, S.: Resource-constrained project scheduling and timetabling. In:
PATAT 2000, LNCS 2079. Springer, Heidelberg (2001)
5. Caillard, S., Brisoux-Devendeville, L., Lucet, C.: Health Simulation Center SimUSanté's
Problem Benchmarks. https://mis.u-picardie.fr/en/Benchmarks-GOC/
6. Cooper, T.B., Kingston, J.H.: The Complexity of Timetable Construction Problems.
Springer, Heidelberg (1995)
7. Di Gaspero, L., McCollum, B., Schaerf, A.: Curriculum-based CTT - Technical
Report. The Second International Timetabling Competition (ITC-2007)
8. Pritsker, A.A.B., Watters, L.J., Wolfe, P.M.: Multiproject scheduling with limited
resources: a zero-one programming approach. Manag. Sci. 16(1), 93–108 (1969).
http://www.jstor.org/stable/2628369
9. Schaerf, A.: A survey of automated timetabling. Artif. Intell. Rev. 87–127 (1999)
Edges Elimination for Traveling Salesman
Problem Based on Frequency K5 s

Yong Wang(B)

North China Electric Power University, Beijing 102206, China


yongwang@ncepu.edu.cn

Abstract. We eliminate useless edges for the traveling salesman prob-
lem (TSP) based on frequency K5 s. A frequency K5 is computed from the
ten optimal five-vertex paths with given endpoints in a corresponding
K5 in Kn . A binomial distribution model is built based on frequency
K5 s. As the frequency of each edge is computed with N frequency K5 s,
the binomial distribution demonstrates that the frequency of an opti-
mal Hamiltonian cycle edge is bigger than 4N on average. Thus, one
can eliminate the edges with frequency below 4N to reduce the number
of edges to consider for resolving the TSP. A heuristic algorithm is given to
eliminate the useless edges. After many useless edges are cut, the com-
putation time of algorithms for the TSP is considerably reduced.

Keywords: Traveling salesman problem · Frequency K5 · Binomial


distribution · Heuristic algorithm

1 Introduction
Given Kn on n vertices {1, . . . , n}, there is a distance function d(x, y) =
d(y, x) > 0 for any x, y ∈ {1, . . . , n} with x ≠ y. A salesman wants to find
a permutation σ = (σ1 , . . . , σn ) of 1, . . . , n such that σ1 = 1 and
d(σ) := d(σn , 1) + Σ_{i=1}^{n−1} d(σi , σi+1 ) is as small as possible. This is the symmetric travel-
ing salesman problem (TSP). Due to its theoretical value and wide applications
in engineering, the TSP has been extensively studied to find efficient algorithms for
searching either an optimal Hamiltonian cycle (OHC) or an approximate solu-
tion, i.e. a Hamiltonian cycle given by a permutation τ such that d(τ ) ≤ c·d(σ),
where σ is the OHC and c is some constant. There are a number of special classes
of graphs where one can find the OHC in a reasonable computation time, see [1].
Karp [2] has shown that the TSP is NP-complete. This means that there are no
exact polynomial-time algorithms for the TSP unless P = NP. The computation
time of exact algorithms is O(a^n) for some a > 1 for general TSP. For example,
We acknowledge W. Cook and H. Mittelmann, who created Concorde, and G. Reinelt
et al., who provided the TSP data in TSPLIB. The authors acknowledge the support of
the Fundamental Research Funds for the Central Universities (No.
2018MS039 and No. 2018ZD09).

Held and Karp [3], and independently Bellman [4], gave a dynamic programming
approach that requires O(n² 2^n) time. Integer programming techniques, such as
branch and bound [5] or cutting-plane methods [6], are able to solve TSP examples
with thousands of points. In 2006, a VLSI application with 85,900 points was
solved with an improved cutting-plane method on a computer system with 128
nodes [6]. The experiments showed that the computation time of the exact algorithms
is hard to reduce for large TSP instances.
On the other hand, the computation time of approximation algorithms and
heuristics has been significantly decreased. For example, the MST-based algo-
rithm and Christofides’ algorithm [7] can find a 2-approximation and a 1.5-
approximation in time O(n²) and O(n³), respectively, for metric TSP. For
graphic TSP, Mömke and Svensson [8] gave a 1.461-approximation algorithm
with respect to the Held-Karp lower bound. In most cases, the Lin-Kernighan heuris-
tic (LKH) can generate "high quality" solutions within 2% of the optimum
in nearly O(n^2.2) time [9]. However, these approximation algorithms and heuris-
tics cannot guarantee to find an OHC.
In recent years, researchers have developed polynomial-time algorithms to
resolve the TSP on sparse graphs. In sparse graphs, the number of Hamiltonian
cycles (HC) is greatly reduced. For example, Sharir and Welzl [10] proved that in
a sparse graph of average degree d, the number of HCs is less than e*(d/2)^n, where
e* is the base of the natural logarithm. In addition, Björklund [11] proved that TSP
on bounded-degree graphs can be solved in time O((2 − ε)^n), where ε depends
on the maximum vertex degree. For TSP on cubic connected graphs, Correa,
Larré and Soto [12] proved that the approximation threshold is strictly below 4/3.
For TSP on bounded-genus graphs, Borradaile, Demaine and Tazari [13] gave a
polynomial-time approximation scheme. In the case of asymmetric TSP, Gharan
and Saberi [14] designed constant-factor approximation algorithms. For TSP
on planar graphs, the constant factor is 22.51(1 + 1/n). Thus, whether one is trying
to find exact or approximate solutions to the TSP, one has a variety
of more efficient algorithms available if one can reduce a given TSP to finding
an OHC in a sparse graph.
Based on the 2-opt move, Jonker and Volgenant [15] identified many nonoptimal
edges lying outside the OHC. After these edges were trimmed, the computation time of
branch-and-bound for certain TSP instances was reduced by half. Hougardy and
Schroeder [16] eliminated useless edges with a combinatorial algorithm based
on the 3-opt move. Their algorithm eliminates more useless edges for
TSP instances in TSPLIB, and the computation time of the Concorde package was
reduced by a factor of more than 11 for certain big TSP instances. Besides accel-
erating the exact algorithms for the TSP, the candidate edges in a sparse graph are
helpful for the local search solver LKH to detect high-quality solutions quite
efficiently [17]. Different from the above research, we eliminate useless edges
for the TSP according to frequencies of edges computed with frequency quadrilat-
erals [18,19] and optimal four-vertex paths [20]. As the frequencies of edges are
computed with either frequency quadrilaterals or optimal four-vertex paths, the
frequencies of OHC edges are generally much bigger than those of most of the

other edges. When the minimum frequency of the OHC edges is taken as a fre-
quency threshold to cut the other edges, the experiments showed that a sparse
graph with O(n log2(n)) edges is obtained for most TSP instances.
In this paper, frequency K5 s are presented and a binomial distribution
model based on frequency K5 s is built. According to the binomial distribution,
one can eliminate the half of the edges with small frequencies, and OHC edges are preserved
with a big probability. If each edge is contained in a sufficient number of K5 s, the
edge elimination can be repeated until few K5 s remain. In this way, the
Kn of the TSP is converted into a sparse graph. The sparse graphs generally have
O(|V |) or O(|V | ln(|V |)) edges. In addition, if the resulting graph has bounded
degree (genus), or is planar or k-edge connected, then we have even more efficient
algorithms available to find exact or approximate solutions to the TSP.
The outline of this paper is as follows. In Sect. 2, frequency K5 s are
introduced and a probability model is built for identifying OHC edges. In Sect. 3,
a binomial distribution model based on frequency K5 s is introduced. In Sect. 4,
a heuristic algorithm is designed to trim many useless edges. In Sect. 5, we
run experiments to cut edges for four types of TSP instances. Conclusions are
drawn in the last section.

2 The Frequency K5 s and a Probability Model


2.1 The Frequency K5 s
Given five vertices {A, B, C, D, E} in Kn , they compose a K5 shown in Fig. 1(a).
We assume ABCDE contains ten optimal five-vertex paths (OP 5 ) with given
endpoints. An OP 5 with given endpoints, such as A and B, is computed as
follows. Fixing endpoints A and B, there are six five-vertex paths containing
the five vertices. Among the six paths, the shortest one is taken as the optimal
five-vertex path for A and B. Since there are ten pairs of endpoints, K5 contains
ten OP 5 s. The frequency K5 is computed with the ten OP 5 s. The frequency of
each edge e is the number of OP 5 s containing e. The total frequency of all edges
enumerated from the ten OP 5 s is 40. The maximum, average and minimum
frequency of e are 9, 4 and 0, respectively. In addition, the total frequency of the
four edges incident to a vertex v ∈ {A, B, C, D, E} is equal to 16.
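A frequency K5 can be computed directly from this definition. The sketch below is a minimal illustration rather than the author's code: it enumerates the six candidate paths for each of the ten endpoint pairs, keeps the shortest one as the OP 5 , and counts how many OP 5 s contain each edge. The total of the returned frequencies is always 40, and for each vertex the four incident edges sum to 16, as noted above.

```python
from itertools import combinations, permutations

def frequency_K5(vertices, d):
    """Frequency K5 of five vertices: for each edge, the number of OP5s containing it.

    `vertices` is a sequence of five labels and d(u, v) is a symmetric distance.
    """
    freq = {frozenset(e): 0 for e in combinations(vertices, 2)}
    for u, v in combinations(vertices, 2):              # ten endpoint pairs
        inner = [w for w in vertices if w not in (u, v)]
        paths = [(u, *p, v) for p in permutations(inner)]  # six candidate paths
        op5 = min(paths, key=lambda p: sum(d(a, b) for a, b in zip(p, p[1:])))
        for a, b in zip(op5, op5[1:]):                   # the four edges of the OP5
            freq[frozenset((a, b))] += 1
    return freq
```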

Fig. 1. A K5 (a) and two frequency K5 s (b) and (c)

In Fig. 1(a), we assume (A, B, C, D, E, A) is the OHC. It contains five


OP 5 s (A, B, C, D, E), (B, C, D, E, A), (C, D, E, A, B), (D, E, A, B, C) and

(E, A, B, C, D). Based on the five OP 5 s, the ten distance inequalities determined
by the edges’ distances are derived and shown in Table 1. Besides the five OP 5 s in
the OHC, there are five other OP 5 s. The distance inequalities used to compute the other
OP 5 s cannot violate the inequalities for computing the OP 5 s in the OHC. For exam-
ple, one possible set of the other five OP 5 s is (A, B, E, D, C), (B, C, A, E, D),
(C, D, B, A, E), (D, E, C, B, A) and (E, A, D, C, B). The inequalities to compute
these five OP 5 s are not hard to derive; we omit them to save space. The fre-
quency K5 computed with the ten OP 5 s is shown in Fig. 1(b). The numbers beside
the edges are their frequencies enumerated from the ten OP 5 s. The frequency
of each OHC edge is 7 and that of the other edges is 1. The frequency of OHC
edges is above 4 and the frequency of the other edges is below 4.

Table 1. The distance inequalities determined by OP 5 s in OHC in ABCDE

OP 5 s Distance inequalities
(A, B, C, D, E) d(A, B) + d(C, D) < d(A, C) + d(B, D)
(B, C, D, E, A) d(A, B) + d(D, E) < d(A, D) + d(B, E)
(C, D, E, A, B) d(A, E) + d(B, C) < d(A, C) + d(B, E)
(D, E, A, B, C) d(A, E) + d(C, D) < d(A, D) + d(C, E)
(E, A, B, C, D) d(B, C) + d(D, E) < d(B, D) + d(C, E)
d(A, B) + d(A, E) + d(C, D) < d(A, C) + d(A, D) + d(B, E)
d(A, B) + d(B, C) + d(D, E) < d(A, C) + d(B, D) + d(B, E)
d(A, B) + d(C, D) + d(D, E) < d(A, D) + d(B, D) + d(C, E)
d(A, E) + d(B, C) + d(C, D) < d(A, C) + d(B, D) + d(C, E)
d(A, E) + d(B, C) + d(D, E) < d(A, D) + d(B, E) + d(C, E)

Note that six five-vertex paths are compared for computing an OP 5 .
There are 6^5 possible cases for the other five OP 5 s. Due to the constraints of the
five OP 5 s in the OHC (see the inequalities in Table 1), the number of possible combinations of the other
OP 5 s is much smaller than 6^5 . Given a geometric pentagon ABCDE, the edges’
distances comply with d(A, B) = d(B, C) = d(C, D) = d(D, E) = d(E, A) and
d(A, C) = d(A, D) = d(B, D) = d(B, E) = d(C, E). Since the geometric pen-
tagon contains equal-weight edges, there are two OP 5 s for each of the pairs of end-
points {A, C}, {A, D}, {B, D}, {B, E} and {C, E}. For example, for endpoints
A and C, the two paths (A, B, E, D, C) and (A, E, D, B, C) have equal length.
We derive the thirty-two possible frequency K5 s. In total there are thirty-one distinct fre-
quency K5 s because two of them are identical. In each frequency K5 , the
five edges with frequencies above 4 belong to the OHC. The frequencies of the other
edges are much smaller than 4. As a K5 only contains ten OP 5 s, the frequencies
of OHC edges will be much bigger than those of the other edges. Furthermore,
the frequencies of the five OHC edges are nearly equal in different frequency
K5 s, and the frequencies of the other edges do not differ much.

We assume each K5 contains one OHC and ten OP 5 s. An edge e is contained
in (n−2 choose 3) K5 s in Kn . If e belongs to the OHC of a K5 , its frequency will be above 4.
Otherwise, the frequency will be much smaller than 4. As we choose N frequency
K5 s containing e to compute its total frequency, the frequency of e will be nearly
the same for different sets of frequency K5 s. Thus, we choose one of the thirty-one
frequency K5 s as a standard model to compute the frequency of edges in Kn .
Among the thirty-one frequency K5 s, the set of frequencies 9, 8, 7, 6, 5, 2, 1, 1, 1, 0
occurs the maximal number of times, namely 10. One frequency K5 containing this frequency
set is shown in Fig. 1(c). We take the frequency K5 containing this frequency set
as the standard model to compute the frequency of each edge in Kn .

2.2 A Probability Model

Given a frequency K5 , the frequencies 9, 8, 7, 6, 5, 2, 1, 1, 1, 0 are distributed over the
ten edges. The big frequencies 9, 8, 7, 6, 5 are given to OHC edges. Due to the
restrictions imposed by the OP 5 s, the frequencies are assigned to the ten edges selectively.
For example in Fig. 1(c), if (A, B) has frequency 9, the frequency of (B, C) must
be 5 or 6. If (B, C) has frequency 6, the frequencies of the other OHC edges
are determined. ABCDE contains 12 cycles and each cycle may be the OHC.
Moreover, each OHC edge may have one of the five OHC frequencies. Thus,
there will be 120 standard frequency K5 s for ABCDE. Each edge e in ABCDE
is contained in six five-vertex cycles. Each of the frequencies 9, 8, 7, 6, 5, 2, 0
for e will occur twelve times in the 120 frequency K5 s. Each of the three 1s for
e also occurs twelve times. Let pi (e) denote the probability that e has frequency
i ∈ {0, 1, 2, 5, 6, 7, 8, 9} with respect to the 120 frequency K5 s. The probabilities are
p0 (e) = p2 (e) = p5 (e) = p6 (e) = p7 (e) = p8 (e) = p9 (e) = 1/10 and p1 (e) = 3/10. The
expected frequency of e is 4. The maximum and minimum frequency of e are 9
and 0, respectively. As we choose N frequency K5 s containing e to compute its
total frequency F (e), the expected total frequency is 4N . The maximum and minimum
total frequencies are 9N and 0, respectively.
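The expected value stated above follows directly from these probabilities:

E[f(e)] = (1/10)·(0 + 2 + 5 + 6 + 7 + 8 + 9) + (3/10)·1 = 40/10 = 4,

so over N frequency K5 s the expected total frequency of a general edge is 4N.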
For OHC edges eo in Kn , the probabilities are different. Figure 2 shows the
OHC in Kn . We construct a K5 on vertices A, B, C, D, E for eo = (A, B) ∈ OHC
in Kn . In ABCDE, (C, D) ∈ OHC is a vertex-disjoint edge of (A, B). There
are n − 3 such edges (C, D) for (A, B) in the OHC. E is another vertex different
from A, B, C, D in Kn . There are n − 4 such vertices E in Kn . In ABCD, the
distance inequality d(A, B) + d(C, D) < d(A, C) + d(B, D) holds. According to
the inequalities in Table 1, (A, B) and (C, D) will be OHC edges in ABCDE.
Thus, (A, B) will have one of the frequencies 5, 6, 7, 8, 9 in the frequency ABCDE.
Based on the OHC in Kn , there are (n−3 choose 2) such K5 s where eo = (A, B) is one OHC
edge. If we assume eo has equal probability of having each of the frequencies
5, 6, 7, 8, 9 in each of these (n−3 choose 2) frequency K5 s, the average frequency of eo is 7
according to these frequency K5 s. Moreover, if eo behaves like a
general edge e in the remaining frequency K5 s, the frequency probabilities for eo are
computed as in formula (1), with p1 (eo ) = 3/10 − 9/(10(n − 2)). As one chooses N frequency
K5 s containing eo to compute F (eo ), F (eo ) = (4 + 9/(n − 2))N on average.

Fig. 2. A K5 on A, B, C, D, E constructed with OHC edges: since d(A, B) + d(C, D) < d(A, C) + d(B, D), the edges (A, B) and (C, D) belong to the OHC in ABCDE.

This is obviously bigger than 4N for a general edge when N is big. Thus, one can
eliminate the edges with frequencies less than 4N and the OHC will be kept intact.

p5 (eo ) = p6 (eo ) = p7 (eo ) = p8 (eo ) = p9 (eo ) = 1/10 + 3/(10(n − 2)),
p0 (eo ) = p2 (eo ) = 1/10 − 3/(10(n − 2)).        (1)
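Combining formula (1) with p1 (eo ) = 3/10 − 9/(10(n − 2)) recovers the average frequency quoted above:

E[f(eo )] = (1/10 + 3/(10(n − 2)))·(5 + 6 + 7 + 8 + 9) + (1/10 − 3/(10(n − 2)))·(0 + 2) + (3/10 − 9/(10(n − 2)))·1 = 4 + 9/(n − 2),

so F (eo ) = (4 + 9/(n − 2))N when N frequency K5 s are sampled.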

3 The Binomial Distribution Model


Given a TSP instance, the probability pi (e) (i ∈ {0, 1, 2, 5, 6, 7, 8, 9}) is deter-
mined for each edge e. As we choose N frequency K5 s for e to compute F (e),
the edges with big probabilities pi (e) (i ≥ 5) will have a big F (e), whereas the
edges with big probabilities pi (e) (i ≤ 2) will have a small F (e). Denote by X the random
variable counting the frequency K5 s, among the N chosen ones, in which e has frequency above 4.
X conforms to a binomial distribution X ∼ B(N, p>4 (e)),
where p>4 (e) = p5 (e) + p6 (e) + p7 (e) + p8 (e) + p9 (e). In this case, the probability
P (X = m) is given by formula (2):

P (X = m) = (N choose m) (p>4 )^m (1 − p>4 )^(N−m).        (2)

For OHC edges eo , when m equals m0 = N p>4 = (1/2 + 3/(2(n − 2)))N , P (X =
m0 ) reaches its maximum value. As we select N frequency K5 s containing eo ,
there are, on average, more than m0 frequency K5 s where the frequency of eo is above 4. Note
that (1/2 + 3/(2(n − 2)))N is a conservative lower bound for eo since we did
not take into account the other possible K5 s where the frequency of eo is above
4. For a non-OHC edge e, one can always find some frequency K5 s where e
has frequency 0, 1 or 2. For example, for (A, C) in Fig. 2, there are n − 4 frequency
ABCDEs in which it has frequency 0, 1 or 2 due to the distance inequality
d(A, B) + d(C, D) < d(A, C) + d(B, D). Therefore, the value of m0 is generally
an upper bound for most of the other edges.

In a K5 , half of the edges belong to the OHC. It means half of the edges will have fre-
quency above 4. For a general edge e in Kn , p>4 = 1/2. As we choose N frequency
K5 s containing e, there will on average be N/2 frequency K5 s where the frequency of e is
above 4 and N/2 where it is below 4. If we use 4 as a frequency threshold to trim e, e
will be eliminated in N/2 of the K5 s. The probability that e is eliminated is thus 1/2 according
to frequency threshold 4. Considering the (n choose 2) edges in Kn , half of the edges will be cut
according to this threshold. Since OHC edges generally have big frequencies based on formula
(1), they will be preserved as we delete the half of the edges with small frequencies.
After one round of edge elimination, a graph with (1/2)(n choose 2) edges is preserved. As long as
N is big enough for each edge in the preserved graph, we can compute each of
their frequencies with N frequency K5 s and eliminate another half of the edges with
small frequencies. This edge elimination can be iterated until N becomes small for the
edges in some preserved graph. As N ≈ 0, the binomial distribution (2) does not
work well. At this point, we have computed a sparse graph for the TSP, see the exper-
iments based on frequency quadrilaterals [20]. At the k-th iteration, the number of
preserved edges is (1/2)^k (n choose 2). Thus, the maximum number of iterations is kmax = log_{1/2}(2/(n − 1)).
For an OHC edge, p>4 > 1/2 based on formula (1). It means the probability
that eo is cut is less than 1/2 according to frequency threshold 4. As we trim the half of the
edges with small frequencies, eo will be preserved with a big probability.
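As a purely numerical illustration of the binomial model (the values n = 1000 and N = 30 are assumptions chosen for the example, and the rule below is a simplified reading of the elimination, not the exact heuristic of Sect. 4), one can evaluate formula (2) to compare a general edge with an OHC edge:

```python
from math import comb, log2

def p_above4(n, ohc_edge=False):
    """p_{>4}(e): probability that the edge has frequency above 4 in one K5."""
    return 0.5 + (1.5 / (n - 2) if ohc_edge else 0.0)

def prob_half_or_more(N, p):
    """P(X >= ceil(N/2)) for X ~ B(N, p), computed from formula (2)."""
    start = (N + 1) // 2
    return sum(comb(N, m) * p**m * (1 - p)**(N - m) for m in range(start, N + 1))

n, N = 1000, 30
print(prob_half_or_more(N, p_above4(n)))                  # general edge: about 0.57
print(prob_half_or_more(N, p_above4(n, ohc_edge=True)))   # OHC edge: slightly larger
print(log2((n - 1) / 2))                                  # k_max: roughly 8.96 halving rounds
```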

4 A Heuristic Algorithm

As the average frequency f¯(e) = F(e)/N of edge e is computed with N frequency
K5 s, it is suggested to cut the half of the edges with the smallest frequencies; OHC
edges are then preserved with a big probability. When N is big enough, we can repeat the
edge elimination according to the f¯(e) values until a sparse graph is computed for
the TSP. We have done experiments for certain TSP instances. At each iteration,
f¯(e) of each edge is computed with N frequency K5 s and the half of the edges with the
smallest frequencies are cut. The results show the following shortcomings: (1) if
N is small, for example N < 30, some OHC edges are eliminated for some
TSP instances in the first several iterations; (2) if N is big, we preserve the OHC
edges but it consumes much time. To compute a sparse graph in a short time,
we design a heuristic algorithm based on the binomial distribution (2).
The algorithm deletes the useless edges according to a frequency threshold f .
A big f value is useful to speed up edge elimination. The precondition is that
the f¯(e) of OHC edges is bigger than f . The selection of K5 s is the key to computing
big frequencies for OHC edges. Given a K5 , the edges on the left-hand side of the distance
inequalities in Table 1 usually have a big frequency (> 4) in the frequency K5 .
In a K5 , an edge e ∈ OHC has two vertex-disjoint edges in the OHC. For example
in Fig. 1(c), (A, B) has the two vertex-disjoint edges (C, D) and (D, E) in the OHC.
Moreover, the distance inequalities d(A, B) + d(C, D) < d(A, D) + d(B, C) <
d(A, C)+d(B, D) and d(A, B)+d(D, E) < d(A, E)+d(B, D) < d(A, D)+d(B, E)
hold. In this case, the frequency of (A, B) is bigger than 4 in the frequency ABCDE.
Each of the other OHC edges in the K5 conforms to the other corresponding

distance inequalities. This observation gives us clues to choose K5 s for OHC
edges. We assume A < B < C < D < E based on the natural ordering of
vertices in Kn . Given edge (A, B), we construct K5 s for (A, B) with the following
method. Firstly, choose a vertex D at random such that (A, D) and (B, D) are
edges of the current graph. Secondly, randomly choose edges (C, D) and (D, E) such that
d(C, D) < d(A, B) and d(D, E) < d(A, B). In this way, we construct a K5 where
the distances d(A, B)+d(C, D) and d(A, B)+d(D, E) are each smaller than 2d(A, B).
One can use the parametric inequalities d(A, B)+d(C, D) < c·d(A, B)
and d(A, B) + d(D, E) < c·d(A, B) to construct K5 s for (A, B), where c is some
constant. In these K5 s, the distances d(A, B) + d(C, D) and d(A, B) + d(D, E)
are restricted. If (A, B) is short, d(A, B) + d(C, D) and d(A, B) + d(D, E) will
be small, and (A, B) will have a frequency bigger than 4 in most of the frequency
ABCDEs if (A, B) has a big probability p>4 . Otherwise, d(A, B) + d(C, D) and
d(A, B) + d(D, E) will be big and the frequency of (A, B) will be small in most
frequency ABCDEs, as p>4 is small. When (A, B) has a frequency smaller than 4 in most
of the N frequency K5 s, f¯(e) will be small and the edge will be eliminated according
to f .
After f¯(e) is computed for an edge e, it is compared with f . If f¯(e) < f , e
is eliminated; otherwise, it is preserved. As we know, OHC edges
have f¯(e) > 4. Thus, f should be bigger than 4. Owing to the way K5 s are selected for
OHC edges, f can be assigned a bigger value, such as f > 5. After the edges
with small f¯(e) are deleted, the maintained edges have bigger f¯(e) in the
preserved graphs. To eliminate more edges, f is increased during the computation
process. In the iterations, the average vertex degree d¯i of the ith preserved graph
is computed. If d¯i − d¯i+1 < b, where b is a small value, the current f is increased by
a small value c, i.e., f := f + c. In addition, f has an upper bound to avoid
trimming OHC edges. The last question is the termination condition. We
record the number of preserved edges Mi at the ith iteration. The difference between Mi
and Mi+1 at two adjacent iterations is computed as Mi − Mi+1 . If Mi − Mi+1 < a
for a given small number a, the algorithm terminates. A sketch of this elimination loop is given below.
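The loop can be summarized as in the following sketch. It reuses the frequency_K5 function from the sketch in Sect. 2; the graph representation (a set of frozenset edges), the externally supplied sample_K5 routine (which should follow the K5 construction described above: a vertex D adjacent to both endpoints and edges (C, D), (D, E) shorter than the edge itself) and the parameter values are illustrative assumptions, not the author's implementation.

```python
def average_degree(edges):
    """Average vertex degree of the graph induced by the edge set."""
    vertices = {v for e in edges for v in e}
    return 2 * len(edges) / max(1, len(vertices))

def average_frequency(edge, edges, d, N, sample_K5):
    """f_bar(e): average frequency of `edge` over N sampled frequency K5s."""
    total = 0
    for _ in range(N):
        five = sample_K5(edge, edges)                 # five vertices containing `edge`
        total += frequency_K5(five, d)[edge]
    return total / N

def eliminate_edges(edges, d, sample_K5, f=4.6, N=30, a=5, b=2.0, c=0.1, f_max=7.0):
    """Iteratively trim edges whose average frequency falls below the threshold f."""
    prev_m, prev_deg = len(edges), average_degree(edges)
    while True:
        edges = {e for e in edges
                 if average_frequency(e, edges, d, N, sample_K5) >= f}
        m, deg = len(edges), average_degree(edges)
        if prev_m - m < a:                            # termination: M_i - M_{i+1} < a
            return edges
        if prev_deg - deg < b and f < f_max:          # degrees stagnate: raise f and N
            f, N = f + c, N + 5
        prev_m, prev_deg = m, deg
```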

5 Examples and Analysis

We conduct experiments to eliminate edges for several classes of TSP instances.
The TSP instances are downloaded from TSPLIB. The online Concorde package
has computed an OHC for each of them. This OHC is used to report
the number of lost OHC edges in the preserved graphs. Note that certain
TSP instances may contain several or many OHCs, so we are not sure whether the
preserved graphs contain another OHC after a few known OHC edges are cut.
Based on the experimental results, we will show that OHC edges really have a big
probability p>4 in Kn and in the preserved graphs. Moreover, many useless edges
are cut according to their small frequencies.
The initial f is set to 4.6. If d¯i − d¯i+1 < 2 at the ith and (i+1)th iterations,
f is increased by 0.1. The edges with f¯(e) < f are eliminated. To reduce the
computation time, the initial value of N is 30. Each time f is increased by 0.1, N is also
Table 2. The computational results for certain TSP instances (n is the TSP scale)

TSP n M1 /l1 davg dmax M2 /l2 davg dmax M3 /l3 davg dmax M4 /l4 davg dmax
att48 48 142/0 5 10
gr229 229 759/0 6 13 589/2 5 9 531/5 4 8 506/9 4 8
rd400 400 3813/0 19 49 2415/1 12 31 1691/4 8 21 1083/7 5 12
gr431 431 3703/0 17 41 1982/1 9 27 1209/4 5 11 1121/7 5 10
pcb442 442 1712/0 7 19 1349/1 6 14 1242/4 5 12 1147/8 5 10
att532 532 8326/0 31 70 4760/2 17 42 2492/5 9 22 1657/8 6 15
si535 535 8338/0 31 84 6562/3 24 69 5293/6 19 56 3801/9 14 46
pa561 561 3807/0 13 34 2584/3 9 25 1887/7 6 18
gr666 666 5568/0 16 58 3670/2 11 39 2451/5 7 25 2028/8 6 21
rat783 783 8308/0 21 38 4077/1 10 39 3203/5 8 16 2404/8 6 11
si1032 1032 43038/0 83 212 20689/1 40 77 10882/3 21 48 8469/9 16 36
d1291 1291 16868/0 26 69 10932/1 16 45 5076/5 7 23 3093/10 4 11
d1655 1655 16417/0 19 84 14633/1 17 73 11080/3 13 57 9379/5 11 51
u1817 1817 44728/0 49 116 13682/1 15 36 7770/5 8 21 5863/10 6 16
rl1889 1889 78461/1 83 231 54355/3 57 189 29536/6 31 123 11340/10 12 56
d2103 2103 50184/1 47 136 33681/2 32 102 15873/7 15 55 10158/9 9 36
u2152 2152 20009/0 18 42 15412/1 14 33 13284/3 12 28 9631/8 8 24
u2319 2319 8557/0 7 9 7802/2 6 8 4862/5 4 8 4739/7 4 7
pr2392 2392 39356/0 32 87 26771/2 22 65 12089/5 10 28 9110/9 7 20
pcb3038 3038 30217/0 19 54 20602/2 13 31 13349/4 8 19 10628/9 6 14
fnl4461 4461 87600/0 39 83 44532/2 19 45 30852/4 13 31 18028/10 8 21

increased by 5. For OHC edges, big values of f¯(e) are computed as N rises
since they have big probabilities p>4 . When Mi − Mi+1 < 5, the heuristic algorithm
terminates. At each iteration, the frequency threshold f , the number
M of preserved edges, the number l of eliminated OHC edges, and the minimum,
average and maximum vertex degrees dmin , davg and dmax are recorded. We
show four groups of results according to li (i = 1, 2, 3, 4), i.e., l1 = 0, 1 ≤ l2 ≤ 3,
4 ≤ l3 ≤ 7 and 8 ≤ l4 ≤ 10. The values of li and the corresponding minimum
Mi , davg and dmax are given in Table 2.
When l1 = 0, M1 = O(n log2 n) and davg = O(log2 n). The heuristic algorithm
cut many useless edges for all of the TSP instances. In addition, dmax ≤ 3davg
for nearly all of the instances with l1 = 0. It means OHC edges have a big p>4 in Kn
and in the preserved graphs, so they are not cut. In the following iterations, more and
more edges are eliminated. M2 , M3 and M4 decrease quickly although only a few
known OHC edges are cut. For example, for 1 ≤ l2 ≤ 3, M2 is much smaller than
M1 . In the sparse graphs, most OHC edges still have bigger p>4 than most
of the other, eliminated edges. This indicates that the heuristic algorithm works well to
delete useless edges for either dense or sparse graphs of the TSP. When 4 ≤ l3 ≤ 7,
M3 < n log2 n. In this case, the algorithm has computed a very sparse graph for
the TSP at the expense of losing a few OHC edges.

6 Conclusions
Frequency K5 s have good properties for eliminating useless edges
for the TSP. When we choose N frequency K5 s for an edge to compute its frequency,
the binomial distribution demonstrates that OHC edges generally have bigger fre-
quencies than most of the other edges. A heuristic algorithm is provided to cut
useless edges according to their frequencies. The probability model and the binomial
distribution are verified by the experimental results.

References
1. Johnson, D.S., McGeoch, L.-A.: The Traveling Salesman Problem and its Varia-
tions, Combinatorial Optimization. 1st edn. Springer Press, London (2004)
2. Karp, R.: On the computational complexity of combinatorial problems. Networks
5(1), 45–68 (1975)
3. Held, M., Karp, R.: A dynamic programming approach to sequencing problems. J.
Soc. Ind. Appl. Math 10(1), 196–210 (1962)
4. Bellman, R.: Dynamic programming treatment of the traveling salesman problem.
J. ACM 9(1), 61–63 (1962)
5. Klerk, E.-D., Dobre, C.: A comparison of lower bounds for the symmetric circulant
traveling salesman problem. Discret. Appl. Math 159(16), 1815–1826 (2011)
6. Applegate, D., Bixby, R., Chvátal, V., Cook, W., Espinoza, D.-G., Goycoolea, M.,
Helsgaun, K.: Certification of an optimal TSP tour through 85900 cities. Oper.
Res. Lett. 37(1), 11–15 (2009)
7. Thomas, H.-C., Charles, E.-L., Ronald, L.-R., Clifford, S.: Introduction to Algo-
rithm, 2nd edn. China Machine Press, Beijing (2006)

8. Mömke, T., Svensson, O.: Approximating graphic TSP by matchings. In: FOCS
2011, pp. 560–569. IEEE, NY (2011)
9. Helsgaun, K.: An effective implementation of the Lin-Kernighan traveling salesman
heuristic. Eur. J. Oper. Res. 126(1), 106–130 (2000)
10. Sharir, M., Welzl, E.: On the number of crossing-free matchings, cycles, and par-
titions. SIAM J. Comput. 36(3), 695–720 (2006)
11. Björklund, A., Husfeldt, T., Kaski, P., Koivisto, M.: The traveling salesman prob-
lem in bounded degree graphs. ACM T. Algorithms 8(2), 1–18 (2012)
12. Correa, J.-R., Larré, O., Soto, J.-A.: TSP tours in cubic graphs: beyond 4/3. SIAM
J. Discret. Math. 29(2), 915–939 (2015)
13. Borradaile, G., Demaine, E.-D., Tazari, S.: Polynomial-time approximation
schemes for subset-connectivity problems in bounded-genus graphs. Algorithmica
68(2), 287–311 (2014)
14. Gharan, S.-O., Saberi, A.: The asymmetric traveling salesman problem on graphs
with bounded genus. In: SODA 2011, pp. 23–25. ACM (2011)
15. Jonker, R., Volgenant, T.: Nonoptimal edges for the symmetric traveling salesman
problem. Oper. Res. 32(4), 837–846 (1984)
16. Hougardy, S., Schroeder, R.-T.: Edges elimination in TSP instances. In: Kratsch,
D., Todinca, I. (eds.) WG 2014. LNCS, vol. 8747, pp. 275–286. Springer, Heidelberg
(2014)
17. Taillard, É.-D., Helsgaun, K.: POPMUSIC for the traveling salesman problem.
Eur. J. Oper. Res. 272(2), 420–429 (2019)
18. Wang, Y., Remmel, J.-B.: A binomial distribution model for the traveling salesman
problem based on frequency quadrilaterals. J. Graph Algorithms Appl. 20(2), 411–
434 (2016)
19. Wang, Y., Remmel, J.-B.: An iterative algorithm to eliminate edges for traveling
salesman problem based on a new binomial distribution. Appl. Intell. 48(11), 4470–
4484 (2018)
20. Wang, Y.: An approximate method to compute a sparse graph for traveling sales-
man problem. Expert Syst. Appl. 42(12), 5150–5162 (2015)
Industrial Symbioses: Bi-objective Model
and Solution Method

Sophie Hennequin, Vinh Thanh Ho(&), Hoai An Le Thi,


Hajar Nouinou, and Daniel Roy

LGIPM, University of Lorraine, Metz, France


sophie.hennequin@enim.univ-lorraine.fr,
{vinh-thanh.ho,hoai-an.le-thi,
daniel.roy}@univ-lorraine.fr,
hajar.nouinou6@etu.univ-lorraine.fr

Abstract. The concept of industrial symbiosis is interesting because it allows
significant waste reuse. Indeed, when an enterprise can no longer reduce or reuse its
waste, it may be beneficial to sell that waste to other factories for
which it will be raw material. However, to achieve this, it is important to be
able, firstly, to group complementary enterprises in the same area and, secondly, to
ensure an economic gain for each involved enterprise and a global ecological gain
for the considered area/region. We thus face a bi-objective problem in which the
objectives could be conflicting; to address this issue, we propose a
mathematical model of this problem and a way to solve it. Finally, we
apply this model and its resolution to a real case study located in China.

Keywords: Industrial symbiosis · Mathematical modeling ·
Linear scalarization · ε-constraint · Waste management

1 Introduction

For some years now, it has become obvious that mankind's needs cannot be considered
the only priority. Indeed, Nature's and the Earth ecosystem's needs are
just as important, because of resource availability and the objective of preserving large natural
cycles [1]. We are now conscious that the Earth system has limited capacities and,
moreover, some of its natural resources involve very long cycles to be renewed,
becoming, in fact, non-renewable at a human timescale. This implies that the stability of the system,
which allows the emergence of life, is in fact fragile and can easily shift toward other states
less suitable to the development of life. It is then necessary to change paradigm and to propose
new production and consumption models. This is one of the major issues of sustainable
development.
Among the different ideas and applications linked to sustainable development,
industrial ecology is very promising to establish a link between Nature and mankind's
needs by considering industrial systems as ecosystems with objectives compatible with
those of natural ecosystems [2]. We can define industrial ecology as “all practices
useful to reduce industrial pollution”; its objective is to give industrial systems
long-term viability and to transform them into ecofriendly systems, i.e. systems without negative

impacts on the environment. The concept originates from the metaphor between natural
and industrial ecosystems: reusing materials and waste, with the hope of reducing the need for
raw materials extracted from the Earth's resources and of having a major positive environ-
mental impact [3].
Among the existing concepts developed in the frame of industrial ecology,
industrial symbiosis seems very interesting, because it is based on an exchange rela-
tionship beneficial to all participants. Indeed, an enterprise may not find any interest in its
own waste, making it impossible to set up circular economy loops, but another
enterprise may be very interested in this same waste (cheaper, closer and/or more
accessible than traditional supply) [4]. More than that, this second enterprise may
produce some waste which could be useful to the first one. Even if this concept is
rarely applicable considering only two enterprises, some industrial parks were designed
based on it, involving many enterprises to achieve a win-win (called here
symbiotic) relationship between them, which is very beneficial to the environment, because
wastes are considered (and used) as raw materials. In fact, this concept transforms
negative environmental externalities into positive ones, like pollution reduction and
a reduced need for raw material.
Nevertheless, beyond the ecological benefit alone, the enterprises involved in the IS
must also find a maximum economic gain. That is why we propose a model which
takes into account simultaneously these two objectives, which can be conflicting:
the improvement of one objective may lead to the deterioration of the other [5]. Thus, a
single solution which optimizes all objectives simultaneously may not
exist. To solve our bi-objective maximization problem with logical constraints (for
example, minimum levels of replenishment can be defined according to the needs of
each plant), we propose two different solutions based on scalarization. Scalarization
is a technique for finding efficient solutions, employed in nearly all exact
methods and many heuristic techniques, that transforms the multi-objective problem
into a single-objective problem with additional variables and/or parameters. The
single-objective problem is then solved repeatedly in order to find some subset of efficient
solutions of the initial multi-objective problem [6]. The first proposed scalarization
method corresponds to a linear scalarization of our mathematical problem and the
second one to the classical ε-constraint method [7]. The well-known and popular ε-
constraint method consists in retaining one objective and transforming the other
objective(s) into constraints [8, 9]. The linear scalarization consists in a convex com-
bination (i.e. a linear weighted sum) of all the objectives [10]. It is well known that
if all the weight parameters are strictly positive, an optimal
solution of the combination is efficient (but efficient solutions in the interior of the
convex hull of the set of non-dominated points in criterion space cannot be found).
Based on these two solutions, we then develop a numerical study based on a real
industrial park located in China.
The rest of this article is structured as follows. After examining more deeply what an industrial
symbiosis is and which assumptions we consider, we present in the third part the bi-
objective maximization problem model, before proposing methods to solve it and
analyzing the obtained results. Finally, we conclude and explore some potential
future works.

2 Industrial Symbioses and Assumptions

A good definition of industrial symbiosis (IS) is given in [11]: “a collective approach to
competitive advantage involving physical exchange of materials, energy, water, and/or
by-products. The keys to industrial symbiosis are collaboration and the synergistic
possibilities offered by geographic proximity”, which implicitly opens up the possible
symbiotic exchanges. Indeed, even if we will not treat this aspect of synergy in this
paper, by “synergistic possibilities” we can also understand, beyond products, the
sharing of services or even infrastructures, especially if we consider an industrial park in
which enterprises are close to each other.
The concept of symbiosis goes deeper than reverse logistics or closed loops. Indeed,
even if those concepts are interesting and give good results in rationalizing the use of
resources (water, energy, raw materials), they do not go far enough and are not
sufficient to solve pollution or climate change problems. In the same way, waste
reduction processes (which are the best thing that can be done) are not applicable to all
situations or systems, especially in the case of a zero-waste target. But if we integrate all
those concepts in the same industrial park, we can greatly enhance the effects. Indeed,
waste remaining after a reduction process may be used as raw material by one or more
enterprises of the park, even if the one that produces it does not need it (no possible direct
closed loop). Of course, in that case, the enterprises of the eco-park must have been
chosen carefully to be compatible.
In this paper we consider an example inspired by the case of the Chinese Qijiang eco-
park located at Chongqing. This region is rich in coal, used to produce electricity for the
other enterprises of the park, whose main production is aluminum [12]. Figure 1
shows some of the main exchanges in our example. Note that, for readability reasons,
energy is not shown because the power plant supplies all the other plants. In the same way,
the main external raw material (i.e. coal for the power plant) and the external final products
(like building materials) are also not shown.
To model this system, we consider the following assumptions, where
assumptions 1 to 5 are considered strong (i.e. with a large impact on the model)
and assumptions 6 and 7 are weak, with no real impact but important to frame the study.
1. Enterprises’ plants are geographically close (same industrial park).
2. Enterprises involved in the symbiosis are not competitors. They produce different goods
and/or services on different markets.
3. Concerning the outputs of an enterprise i, we take into account only the waste. Final
products are not included in the model, because enterprises are not competitors.
4. All the costs shown here are unit costs and are constant over time.
5. All the enterprises run normally, i.e. they do not stop producing.
6. We do not consider the specific case where a waste (output) of an enterprise i may
also be a possible input of this same enterprise i (circular economy loop). Indeed, even if
this situation is ecologically interesting, it goes against the idea of symbiosis, where
several companies must cooperate.
7. The inputs coming from outside the symbiosis are supposed to be ready to use,
without any preliminary treatment, i.e. no treatment cost is considered for them in the
model.

Fig. 1. Possible exchanges between the considered companies. The enterprises are: 1. Construction materials plant 1, 2. Construction materials plant 2, 3. Fertilizer plant, 4. Power plant, 5. Electrolytic aluminum plant, 6. Secondary aluminum plant, 7. Aluminum plant 1, 8. Aluminum plant 2. The exchanged products include heat, alumina, liquid aluminum, desulfurized gypsum, aluminum alloy, aluminum waste, nitamine and slag.

Now that we have defined the problem, with its internal and external exchanges and the limits
we consider, we can develop the corresponding mathematical model, which is more
precisely a bi-objective maximization problem.

3 Bi-Objective Maximization Problem

The defined variables and parameters are given in Table 1.

Table 1. Notations
Variable Description
i ∈ {1, 2, 3, …, N} The N different studied enterprises
k ∈ {1, 2, 3, …, K} The K different involved product types
R^k(i) Requirement of enterprise i in k type product (constant)
T^k(i) Threshold quantity for a k type product acceptable for enterprise i
Wa^k(i) Waste of type k produced by enterprise i (constant)
G(i) Economic profit for the enterprise i
S_P(i) Total selling price of enterprise i outputs (commercialized finished products excepted)
C_T(i) Total cost for inputs and outputs for enterprise i
C_{T,in}(i) & C_{T,out}(i) Total cost for inputs & outputs for enterprise i
S^k_{ext}(i) Selling price of a type k output for enterprise i outside the IS
S^k_{int,i}(j) Selling price of a type k output for enterprise i to enterprise j inside the IS
(continued)

Table 1. (continued)
Variable Description
a^k_{ext,in}(i) & a^k_{ext,out}(i) Amount of inputs & outputs of type k exchanged by enterprise i outside the IS
a^k_{int,j}(i) Amount of outputs of type k transferred from enterprise j to enterprise i
C^k_{ext,in}(i) Cost of k type product imported by enterprise i from outside the IS
C^k_{int,j}(i) Cost of k type product transferred by enterprise i from enterprise j
C^k_{trans,in/out}(i) Cost of input (resp. output) transportation of k type product exchanged by enterprise i with outside
C^k_{env,in/out}(i) Environmental cost of input (resp. output) of k type product exchanged by enterprise i with outside
C^k_{soc,in/out}(i) Social cost of input (resp. output) of k type product exchanged by enterprise i with outside
C^k_{trait,out}(i) Cost of output treatment of k type product exchanged by enterprise i with outside
C^k_{trans,j}(i), C^k_{trait,j}(i), C^k_{env,j}(i) and C^k_{soc,j}(i) Transportation, treatment, environmental and social costs of k type product transferred by enterprise i from enterprise j

Economic profit of an enterprise i


It considers the resale price of waste and the costs related to the treatment, environ-
ment, transportation, etc. of inputs and outputs. For i = 1, …, N, the economic profit of
an enterprise i in the IS is defined as follows.

G(i) = S_P(i) − C_T(i),    C_T(i) = C_{T,in}(i) + C_{T,out}(i),

S_P(i) = Σ_{k=1}^{K} [ S^k_{ext}(i) · a^k_{ext,out}(i) + Σ_{j=1,j≠i}^{N} S^k_{int,i}(j) · a^k_{int,i}(j) ],

C_{T,in}(i) = Σ_{k=1}^{K} [ a^k_{ext,in}(i) · C_1^k(i) + Σ_{j=1,j≠i}^{N} a^k_{int,j}(i) · C_{1,j}^k(i) ],

C_1^k(i) = C^k_{ext,in}(i) + C^k_{trans,in}(i) + C^k_{env,in}(i) + C^k_{soc,in}(i),
C_{1,j}^k(i) = C^k_{int,j}(i) + C^k_{trans,j}(i) + C^k_{trait,j}(i) + C^k_{env,j}(i) + C^k_{soc,j}(i),

C_{T,out}(i) = Σ_{k=1}^{K} [ a^k_{ext,out}(i) · C_2^k(i) + Σ_{j=1,j≠i}^{N} a^k_{int,i}(j) · C_{2,i}^k(j) ],

C_2^k(i) = C^k_{trans,out}(i) + C^k_{trait,out}(i) + C^k_{env,out}(i) + C^k_{soc,out}(i),
C_{2,i}^k(j) = C^k_{trans,i}(j) + C^k_{trait,i}(j) + C^k_{env,i}(j) + C^k_{soc,i}(j).

Need (input), waste (output) and threshold of an enterprise i

For i = 1, …, N, k = 1, …, K, j = 1, …, N, we define:

R^k(i) = a^k_{ext,in}(i) + Σ_{j=1,j≠i}^{N} a^k_{int,j}(i),    (1)

Wa^k(i) = a^k_{ext,out}(i) + Σ_{j=1,j≠i}^{N} a^k_{int,i}(j),    (2)

a^k_{int,j}(i) ≥ T^k(i) or a^k_{int,j}(i) = 0.    (3)

The logical constraint (3) ensures that enterprise j is able to satisfy at least the
threshold T^k(i) in order to have an exchange between enterprises i and j. Indeed, a
minimum level of replenishment, denoted T^k(i), is defined according to the needs of
plant i, to represent the fact that if firm j could supply only a small quantity of product k
to firm i, the exchange would not be interesting for firm i. For instance, with R^k(i) = 25 and
T^k(i) = 15 (as for alumina at construction materials plant 1 in Table 5), constraint (3) allows
either no delivery from j or a delivery of at least 15 KT, up to the 25 KT requirement.
Now we are going to propose the mathematical model of our problem with its
internal and external exchanges. It focuses on maximizing the profit of the symbiotic
flow and the economic profit of all enterprises in the IS, defined respectively as:

F_1(a) = Σ_{k=1}^{K} Σ_{i=1}^{N} [ ( Σ_{j=1,j≠i}^{N} a^k_{int,j}(i) ) − ( a^k_{ext,in}(i) + a^k_{ext,out}(i) ) ],

F_2(a) = Σ_{i=1}^{N} G(i),

where the variable is a = ( a^k_{int,j}(i), a^k_{ext,in}(i), a^k_{ext,out}(i) )_{i,j=1,…,N, k=1,…,K}.
Finally, the bi-objective maximization problem is given by:

max (F_1(a), F_2(a))  s.t.  a ∈ R_+^{KN²+2KN}, (1), (2) and (3).    (4)

Obviously, F_1 and F_2 are linear functions, expressed as F_1(a) = ⟨a, C_1⟩ and
F_2(a) = ⟨a, C_2⟩, where the vectors are
C_1 = ( 1^k_j(i), −1^k(i), −1^k(i) )_{i,j=1,…,N, k=1,…,K},
C_2 = ( S^k_{int,j}(i) − C_{1,j}^k(i) − C_{2,j}^k(i), −C_1^k(i), S^k_{ext}(i) − C_2^k(i) )_{i,j=1,…,N, k=1,…,K},
with 1^k(i) = 1 and 1^k_j(i) = 1 if j ≠ i, 0 if j = i, ∀k, i, j. Moreover, the constraints (1) and (2) are linear,
while (3) is a logical constraint, which is difficult to handle.
To solve this bi-objective maximization problem, we propose in what follows two
solutions based on scalarization techniques.

4 Reformulation of (4) and Solution Methods

In this section, we reformulate (4) by dealing with the logical constraint (3). In par-
ticular, by using the binary variable y = ( y^k_j(i) )_{i,j=1,…,N, k=1,…,K} ∈ {0,1}^{KN²} and the
boundedness of a^k_{int,j}(i), the logical constraint (3) is equivalent to the linear constraint:

T^k(i) y^k_j(i) ≤ a^k_{int,j}(i) ≤ R^k(i) y^k_j(i),  for i, j = 1, …, N, k = 1, …, K.    (5)

We see that if y^k_j(i) = 1, then T^k(i) ≤ a^k_{int,j}(i) ≤ R^k(i); otherwise, y^k_j(i) = 0 and a^k_{int,j}(i) = 0.
In this case, the problem (4) can be reformulated as the following problem:

max F(a, y) := (F_1(a), F_2(a))  s.t.  a ∈ R_+^{KN²+2KN}, y ∈ {0,1}^{KN²}, (1), (2), (5).    (6)

It is worth noting that the numbers of continuous variables, binary variables and linear
constraints are, respectively, KN² + 2KN, KN² and 2KN + 2KN².
Next, we propose two approaches, based on the linear scalarization and on the ε-
constraint method, in order to scalarize (6) into a single-objective optimization problem
[7–10].
Solution method 1: linear scalarization. Let us denote by w_1 and w_2 the weights of
the objective functions F_1 and F_2 , respectively. Assume that w_1 > 0, w_2 > 0 and
w_1 + w_2 = 1. In this case, a scalarized reformulation of (6) is defined as:

min −(w_1 F_1(a) + w_2 F_2(a))  s.t.  a ∈ R_+^{KN²+2KN}, y ∈ {0,1}^{KN²}, (1), (2), (5).    (7)

Solution method 2: ε-constraint method. Given the upper-bound parameters
ε_1, ε_2, two reformulations of (6) are expressed as:

min −F_1(a)  s.t.  −F_2(a) ≤ ε_1, a ∈ R_+^{KN²+2KN}, y ∈ {0,1}^{KN²}, (1), (2), (5),    (8)

and

min −F_2(a)  s.t.  −F_1(a) ≤ ε_2, a ∈ R_+^{KN²+2KN}, y ∈ {0,1}^{KN²}, (1), (2), (5).    (9)

Note that the three resulting problems (7), (8) and (9) take the form of mixed-integer linear
programs. In what follows, we propose a numerical application of our solutions based
on a real industrial park located in China; a small modeling sketch is given first.
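The sketch below shows how a miniature instance of the scalarized problem (7), with constraints (1), (2) and (5), could be assembled with a generic MILP modeling library (here PuLP). The toy data (two enterprises, one product type, symmetric needs and wastes, lumped net unit coefficients) are illustrative assumptions, not the MATLAB/CPLEX implementation used in Sect. 5. The ε-constraint problems (8) and (9) are obtained in the same way by keeping one objective and adding the other as a constraint.

```python
import pulp

# Toy index sets and data (illustrative only): two enterprises, one product type.
I, K = range(2), range(1)
R  = {(k, i): 10.0 for k in K for i in I}   # requirement R^k(i)
Wa = {(k, i): 10.0 for k in K for i in I}   # waste Wa^k(i)
T  = {(k, i): 4.0 for k in K for i in I}    # threshold T^k(i)
gain_int, cost_ext_in, gain_ext_out = 3.0, 5.0, 1.0  # lumped entries of C2
w1, w2 = 0.5, 0.5                           # scalarization weights of problem (7)

pairs = [(k, j, i) for k in K for j in I for i in I if j != i]
a_int = {(k, j, i): pulp.LpVariable(f"a_int_{k}_{j}_{i}", lowBound=0) for (k, j, i) in pairs}
a_in  = {(k, i): pulp.LpVariable(f"a_in_{k}_{i}", lowBound=0) for k in K for i in I}
a_out = {(k, i): pulp.LpVariable(f"a_out_{k}_{i}", lowBound=0) for k in K for i in I}
y     = {(k, j, i): pulp.LpVariable(f"y_{k}_{j}_{i}", cat="Binary") for (k, j, i) in pairs}

F1 = pulp.lpSum(a_int.values()) - pulp.lpSum(a_in.values()) - pulp.lpSum(a_out.values())
F2 = (gain_int * pulp.lpSum(a_int.values())
      - cost_ext_in * pulp.lpSum(a_in.values())
      + gain_ext_out * pulp.lpSum(a_out.values()))

prob = pulp.LpProblem("IS_linear_scalarization", pulp.LpMinimize)
prob += -(w1 * F1 + w2 * F2)                                     # objective of (7)
for k in K:
    for i in I:
        inflow  = pulp.lpSum(a_int[k, j, i] for j in I if j != i)
        outflow = pulp.lpSum(a_int[k, i, j] for j in I if j != i)
        prob += a_in[k, i] + inflow == R[k, i]                   # constraint (1)
        prob += a_out[k, i] + outflow == Wa[k, i]                # constraint (2)
        for j in I:
            if j != i:
                prob += a_int[k, j, i] >= T[k, i] * y[k, j, i]   # constraint (5), lower part
                prob += a_int[k, j, i] <= R[k, i] * y[k, j, i]   # constraint (5), upper part

prob.solve()
print(pulp.LpStatus[prob.status], pulp.value(F1), pulp.value(F2))
```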

5 Numerical Experiment

In this section, we conduct an experiment on the real case of Chinese Qijiang industrial
eco-park located at Chongqing (see Fig. 1). Suppose that suppliers outside the IS have
an unlimited production capacity. The costs, the initial input needs, the production
capacity of each enterprise in the IS for each type and the minimum input quantities
Table 2. The costs of types in the industrial symbioses ecosystem.
Type k   C^k_{ext,in}/S^k_{ext}   C^k_{int,j}/S^k_{int,i}   C^k_{trans,in}/C^k_{trans,out}   C^k_{env,in}/C^k_{env,out}   C^k_{soc,in}/C^k_{soc,out}   C^k_{trait,out}
1. Alumina 40000$/T 30000$/T 25$/T 1.2$/T 1.5$/T 0
2. Liquid aluminum 2000$/T 2000$/T 70$/T 1.2$/T 1.5$/T 0
3. Nitamine 300$/U 0 200$/U 1$/U 1$/U 0
4. Slag 10$/T 10$/T 80$/T 9$/T 9$/T 3$/T
5. Desulfurized gypsum 20$/T 0 50$/T 9$/T 11$/T 0
6. Carbon 92$/T – 50$/T 9$/T 11$/T –
7. Aluminum alloy 3000$/T 3000$/T 100$/T 1.2$/T 1.5$/T 0
8. Aluminum waste 0 0 30$/T 1$/T 1$/T 1.3$/T
9. Heat 50000$ 0 0 0 0 0
10. Building material 200$/T – 60$/T 1$/T 1$/T 0

accepted by enterprise i to choose its supplier are given in the following tables. Here
N = 8 and K = 10; the indexes of the enterprises and of the product types are given in
Fig. 1 and in Tables 2, 3, 4, 5 and 6, respectively.

Table 3. The transportation/treatment costs of types between two enterprises (in $/T).
C^k_{trans,j}(i)/C^k_{trait,j}(i)   Elect. alum.   Const. mat. plant 1 & 2   Fert. plant   2nd alum.   Alum. plant 1   Alum. plant 2
Elect. Liquid 30/0 30/0
alu. alum.
Power Alumina 10/0
plant Slag 10/3
Heat 0/0 0/0 0/0
Des. 10/0
gypsum
Fert. Des. 10/0
plant gypsum
2ndary Alum. 50/0 50/0
alum. alloy
Alu. Alum. 10/1
plant 1 waste
Alu. Alum. 10/1
plant 2 waste

Table 4. The environmental/social costs of types between two enterprises (in $/T).
C^k_{env,j}(i)/C^k_{soc,j}(i)   Elect. alum.   Const. mat. plant 1 & 2   Fert. plant   2nd alum.   Alum. plant 1   Alum. plant 2
Elect. Liquid 0.1/0.15 0.1/0.15
alum. alum.
Power Alumina 0.1/0.15
plant Slag 0/0
Heat 0/0 0/0 0/0
Des. 0.1/0.15
gypsum
Fert. Des. 0.1/0.15
plant gypsum
2ndary Alum. 0.1/0.15 0.1/0.15
alum. alloy
Alum. Alum. 0.1/0.15
plant 1 waste
Alum. Alum. 0.1/0.15
plant 2 waste

Table 5. The input need R^k(i) / the threshold T^k(i).


Enterprise Alum Liq. Nita Slag Des. Carbon Alum Alum Heat
(KT) Alum (U) (KT) gyp (KT) alloy waste (U)
(KT) (KT) (KT) (KT)
Elect. alum. 46/25
Const. mat. 25/15 45/25
plant 1
Const. mat. 25/15 50/25
plant 2
Power plant 150/150
Fert. plant 1/1 1/1
2nd alum. 11/5
Alum plant 1 18/8 5/2 1/1
Alum plant 2 20/8 5/2 1/1

Table 6. The waste Wa^k(i)


Enterprise Alum Liq. Nita Slag Des. Alum Alum Heat Building
(KT) Alum (U) (KT) gyp alloy waste (U) material
(KT) (KT) (KT) (KT) (KT)
Elect. alum. 30
Const. mat. 70
plant 1
Const. mat. 90
plant 2
Power plant 46 1 60 60 5
Fert. plant 35
2nd alum. 10
Alum plant 1 8
Alum plant 2 8

This experiment was implemented in MATLAB R2016b and performed on a PC with an
Intel(R) Core(TM) i5-3470 CPU @ 3.20 GHz and 8 GB of RAM. The software CPLEX
12.6 was used for solving the mixed-integer linear programs (7), (8) and (9). The results of
the linear scalarization and of the ε-constraint method for this experiment, in terms of the optimal
solution a*, the values F_1(a*), F_2(a*) and the CPU time (in seconds), are given in Tables 7,
8 and 9.

Table 7. Objective values F_1(a*) and F_2(a*) at the optimal solution a* to (7) and the CPU time
(in seconds) with the different weights w_1 and w_2.

CPU =0.5
0.2 0.8 -125 71264.9 0.012 1 4 4 25
0.4 0.6 -125 71264.9 0.010 1 4 5 35
0.5 0.5 -125 71264.9 0.032 2 3 5 25
0.6 0.4 -125 71264.9 0.020 2 4 4 25
0.8 0.2 -125 71264.9 0.010 2 4 5 25
3 4 9 1
=0.5 5 4 1 46
1 5 10 6 7 8 5
3 3 1 6 8 8 6
4 6 150 7 4 9 1
8 2 8 7 5 2 18
8 9 1 7 6 7 5
1 10 70 8 5 2 12
2 10 90 8 6 7 5
3 5 10
4 3 1
4 4 10
4 9 3
7 8 3
8 8 2

We observe from Tables 7, 8 and 9 that the two approaches, based on linear scalar-
ization and on the ε-constraint method, are very efficient for the proposed model. In particular,
the values of the objective functions at the optimal solution a* are the same for the
different parameter values in both the linear scalarization and the ε-constraint method. They
run very fast and the CPU time is less than 0.04 s in all cases. We see that the optimal
solutions a* obtained by the two methods are slightly different. There are 27 exchanges
between enterprises for the linear scalarization and 26 exchanges for the ε-constraint
method. However, the total quantities a^k_{int,j}(i) exchanged inside the IS are the same for
both methods (= 234), and similarly for the total quantities a^k_{ext,in/out}(i) exchanged outside the IS (= 359).

Table 8. Objective values F_1(a*) and F_2(a*) at the optimal solution a* to (8) and the CPU time
(in seconds) with the different values of ε_1.

CPU =-70000
-5000 -125 71264.9 0.008 1 3 5 35
-10000 -125 71264.9 0.008 1 4 4 25
-50000 -125 71264.9 0.007 2 4 4 25
-70000 -125 71264.9 0.007 2 4 5 50
-80000 * * * 3 4 9 1
-100000 * * * 5 4 1 46
6 7 8 5
=-70000 6 8 8 6
1 5 10 7 4 9 1
3 3 1 7 5 2 18
4 6 150 7 6 7 5
8 2 8 8 5 2 12
8 9 1 8 6 7 5
1 10 70
2 10 90
4 3 1
4 4 10
4 5 10
Here the symbol “*” means that no feasible point is found.
4 9 3
7 8 3
8 8 2

Table 9. Objective values F_1(a*) and F_2(a*) at the optimal solution a* to (9) and the CPU time
(in seconds) with the different values of ε_2. Here the optimal solution to (9) when ε_2 = 130 is the
same as the one in Table 8.
ε_2 F_1(a*) F_2(a*) CPU
100 * * *
130 −125 71264.9 0.007
150 −125 71264.9 0.007
200 −125 71264.9 0.017

6 Conclusion

In this paper, we explore the possibilities of a symbiotic relationship between several
enterprises. More precisely, we want to know whether the two main objectives, which are
maximum economic profit and waste reuse, are compatible. To check this and be able
to obtain some analyzable results, we first model an industrial symbiosis ecosystem,

using a bi-objective mathematical model in which the constraints are mainly link to
amounts of products that can be exchanged inside the IS, considering the involved
enterprises requirements and their waste production. It can be expressed as a bi-
objective maximization problem with logical constraints. Then, secondly, we refor-
mulate this problem by converting these logical constraints into linear constraints and
propose two solution methods based on the linear scalarization and the e-constraint
method for the resulting problem. Finally, we use values inspired by the real case of
Chinese Qijiang eco-park located at Chongqing to numerically solve the problem
following the two proposed resolution methods and then compare the results.
As a matter of perspective, we of course can speak about the opportunity of taken in
account the limited capacities of many things we do not consider here. More specifi-
cally: the transportation means (limited in charge/volume, but also in tour frequency),
the treatment devices/installation (which may severely reduce the availability or
profitability of a specific waste if the process is expensive and/or long) and of course
enterprises themselves which could produce waste at a rhythm which may be non-
compatible with other enterprises’ needs. Then, adding these constraints about time and
capacity will improve the model accuracy.

Intelligent Solution System Towards Parts
Logistics Optimization

Yaoting Huang1(B), Boyu Chen1, Wenlian Lu1,2,3,4, Zhong-Xiao Jin5, and Ren Zheng5
1 Fudan University, Shanghai, China
huangyt16@fudan.edu.cn
2 Key Laboratory for Contemporary Applied Mathematics, Shanghai, China
3 Key Laboratory of Mathematics for Nonlinear Science, Ministry of Education, Fudan University, Shanghai, China
4 Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Fudan University, Shanghai, China
5 SAIC Motor Artificial Intelligence Laboratory, Shanghai, China

Abstract. Due to the complexity of the presented problem, intelligent algorithms show great power in solving the parts logistics optimization problem related to the vehicle routing problem (VRP). However, most of the existing research on VRPs is not comprehensive enough to solve a real-world parts logistics problem. In this work, for the SAIC logistics problem, we propose a systematic solution to this 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous VRP with Time Windows by integrating diverse types of intelligent algorithms, including a heuristic algorithm that initializes feasible logistics planning schemes by imitating manual planning, a core Tabu Search algorithm for global optimization accelerated by a novel bundle technique, heuristic algorithms for the associated routing, packing and queuing subproblems, and a heuristic post-optimization process to improve the final solution. Based on these algorithms, SAIC Motor has successfully established an intelligent management system that provides a systematic solution for parts logistics planning, superior to manual planning in performance, customizability and expandability.

Keywords: Parts logistics · Vehicle routing problems

1 Introduction
Parts logistics optimization, aiming at minimizing the cost of transporting all the required parts to their destinations, is vital to the modern industrial activities of all manufacturing enterprises; it is a key link of supply chain management (SCM) and has long been studied [17]. In SCM research, supply chain integration is considered a key factor in achieving improvement [15,16] and has already become a powerful tool in real-world economic activities [1,14].
The main component of the parts logistics problem is the vehicle routing problem (VRP), introduced by [2,8], that is, designing optimal delivery and collection routes from depots to a number of cities or customers, subject to side constraints [12]. There is a body of research on VRP variants. The VRP with multiple depots, from which vehicles can choose their starting points, is studied in [3]. A replenishment concept for VRPs is introduced in [6], where vehicles can replenish their capacity at several stations. VRPs with time-window constraints (VRPTW) are surveyed in [5]. Multiple use of vehicles (VRPMU) is considered in [7]. The VRP with 2-dimensional loading constraints (2L-CVRP) is considered in [11], which uses branch-and-bound for checking loading feasibility. Tabu Search is utilized to solve the 2L-CVRP in [9]. However, the above studies only consider individual constraints or partial combinations of them; for this reason, SAIC Motor was still using an old-fashioned manual planning scheme.
This work incorporates the above constraints and proposes the 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous Vehicle Routing Problem with Time Windows.
(This work was jointly supported by the Natural Sciences Foundation of China under Grant No. 61673119, the key project of Shanghai Science & Technology (No. 16JC1420402), the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01) and ZHANGJIANG LAB, and the Shanghai Committee of Science and Technology (Grant No. 14DZ1118700).)

2 Problem Formulation
2.1 Problem Scenario Description

Fig. 1. An example route: a truck departs from a vehicle yard, visits suppliers, warehouses and the hub in an intermittent way, and finally ends its daily task by returning to the truck yard.

Typical parts logistics can be briefly described as follows: the supply chain system requests parts from various suppliers to be delivered to plant warehouses for assembly. Herein, the basic transportation unit is called a shipment. Each shipment is composed of a bunch of bins of the same type, which should be delivered from a specific supplier to a specific plant warehouse. This transportation is carried out by the logistics department with an adequate fleet of trucks of different models, starting from and ending at a specific vehicle yard. The pickup and loading at the plant warehouse occur at its limited number of docks, each of which allows several trucks to pick up or load within specific given time windows. These loading and unloading activities take fixed lengths of time to complete. A hub is also considered in this problem, which is able to receive scattered shipments to be consolidated. Figure 1 illustrates a typical route of a truck. The 3D packing problem is simplified to a 2D packing problem, as the bins are initially stacked into columns following the stacking rules or placed on pallets. The information of a shipment contains: supplier, plant warehouse, bin quantity, bin size (length, width and height), stacking layer limitation, pallet requirement, pickup time interval and delivery time interval.
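As one possible way to hold the shipment information listed above, the following minimal sketch defines an illustrative record type; the field names, types and units are assumptions, not the schema actually used by the SAIC system.

```python
# Hedged sketch: an illustrative in-memory representation of a shipment record.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Shipment:
    supplier: str                          # pickup location
    warehouse: str                         # delivery location (plant warehouse)
    bin_quantity: int                      # number of identical bins
    bin_size: Tuple[float, float, float]   # (length, width, height), e.g. in mm
    stack_limit: int                       # maximum stacking layers
    needs_pallet: bool                     # whether bins must be loaded on pallets
    pickup_window: Tuple[int, int]         # earliest/latest pickup time (minutes)
    delivery_window: Tuple[int, int]       # earliest/latest delivery time (minutes)

s = Shipment("supplier_A", "plant_1", 40, (600.0, 400.0, 300.0),
             4, True, (480, 600), (720, 840))
print(s)
```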
The complexity of this problem is due to the numerous constraints. The main constraints are the time-window constraints (TW).

A1. Suppliers and plants have working time windows, so the whole time interval of a truck's visit, including the loading or unloading, must be contained in this working time window.
A2. Owing to the limited docks of each supplier and warehouse, only a limited number of trucks can visit a specific depot at the same time. More trucks than this number leads to queueing.
A3. Each shipment has its pickup and delivery time intervals.

Another group of important constraints is the loading/unloading constraints.

B1. Bin stacking should follow the given rules: bins of the same type from the same supplier can be stacked together, and some specific types must be loaded on pallets, which can also be stacked. The stacking layers are limited.
B2. The bins must be contained within the loading surface of the truck, and no two bins can overlap.
B3. Because the shipments are handled by forklift, a loading sequence constraint should be considered [10]: when a location is visited, the bins of the corresponding lot must be unloadable or loadable through a sequence of straight movements parallel to the width of the loading area.

There are additional essential constraints: according to the company's rules, shipments from different cities cannot be loaded on a single truck; the docks of some suppliers restrict the truck lengths; some suppliers restrict the number of visits; a few suppliers request to be the first site visited on a route and a few warehouses request to be the last. We handle these with judgement statements in the code and will not discuss them further.
In overview, the following decisions determine a feasible plan:
– Select which warehouses and suppliers a truck visits;
– Plan the route for this truck;
– Pack the specific bins subject to the constraints.

We use the total mileage as the cost function, and our objective is to minimize it.

2.2 Mathematical Model


We define $X = (x_{ij})$ as the truck–shipment relation, where $x_{ij} = 1$ indicates that shipment j will be handled by truck i and $x_{ij} = 0$ otherwise, and we define $y_{ik}$ as the kth station that truck i will visit. The other symbols of the mathematical model are defined in Table 1, and the constraints are listed in Table 2. Thus, following the description above, the parts logistics planning is formulated by the following optimization problem:

$\text{Minimize}\quad \sum_{i=1}^{N}\sum_{k=0}^{n_i} C_i\, D\big(y_{ik},\, y_{i(k+1)}\big)$    (1)

Table 1. The parameters and decision variables

Parameter/decision variable — Definition
$C_i$ — Cost per mileage unit of truck i
$t_{ik}$ — Departure time of truck i at the kth location
$w_{ik}$ — Waiting time of truck i at the kth location
$n_i$ — Number of locations truck i should visit
$N^S$ — Number of shipments
$d_j$ — Location order on a route at which shipment j will be delivered
$p_j$ — Location order on a route at which shipment j will be picked up
$P_j$ — Equals 1 if shipment j needs a pallet, otherwise 0
$u_{j\theta}, v_{j\theta}$ — Width and length coordinates of the θth column of shipment j on a truck
$u_{j\theta\gamma}, v_{j\theta\gamma}$ — Width and length coordinates of the γth column on the θth pallet, only valid when $P_j = 1$
$W_j, L_j$ — Bin width and length of shipment j
$W^P, L^P$ — Pallet width and length
$W^V_i, L^V_i$ — Width and length of truck i
$n^s_j$ — Number of columns stacked by shipment j
$b^p_{j\theta}$ — Number of columns in pallet θ, only valid when $P_j = 1$
$N^B_j$ — Number of bins of shipment j
$L_j$ — Bin stack layer limitation of shipment j
$h_{jj'}$ — Equals 1 if shipments j, j′ are a divided shipment, where the destination of j and the source of j′ are the hub, otherwise 0
$D(\zeta, \eta)$ — Distance between two locations ζ and η
$TW(\iota)$ — Working time interval of location ι
$T(\zeta, \eta)$ — Travel time between two locations ζ and η
$TH(\iota)$ — Handling time at location ι
$\Psi(n, A)$ — The smallest of the n largest numbers in set A
$DC(\iota)$ — Dock number of location ι
$TP(\omega)$ — Pickup time interval of shipment ω
$TD(\omega)$ — Delivery time interval of shipment ω

3 Solution

3.1 Algorithm Architecture

We divide the whole algorithm into three procedures: initialization, optimization and post-optimization. Figure 2 illustrates the work flow of the underlying algorithms: the initialization generates a feasible solution; the optimization, which in our case is tabu search, takes this initial solution and optimizes at the shipment level towards minimizing the total mileage; and the post-optimization further improves the solution at the route level. In addition, there is an auxiliary 2-Dimensional Loading Capacitated VRP with Time Windows (2L-VRPTW) solver, which generates the route of a given truck and its shipments together with the associated packing scheme.

3.2 2L-VRPTW Solver

Given an X, the 2L-VRPTW solver generates a route and a packing scheme for each truck satisfying the A1, A3 and B-series constraints. The algorithm proceeds as follows: an asynchronous route-search procedure explores the whole route space, which is built as a tree. A branch is pruned if it violates A1 or A3, or if its partial mileage is larger than that of an already-found feasible leaf node. When a leaf node is reached (representing a complete route), feasibility with respect to the B-series constraints is judged. This packing procedure is conducted in the following steps: first, stack the bins and pallets into columns; second, pre-judge whether packing at this site is possible using a preassigned threshold on the ratio of the coverage area of the columns over the total loading area. If the pre-judgement is passed, the 2-D Packing Problem (2PP) is solved by search, following a heuristic function based on the wasted area, convexity and covered area. We reduce the search space by limiting the search width.
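The following minimal sketch illustrates the route-search idea just described under simplifying assumptions: a serial depth-first search over visit orders, pruning on partial mileage and on a time-window check, with a packing feasibility test applied only at complete routes. The distance function and the two feasibility callbacks are placeholders, not the solver actually used in the system.

```python
# Hedged sketch of a route tree search with pruning; all checks are stubs.
def search_routes(stops, dist, tw_ok, packing_ok):
    best = {"mileage": float("inf"), "route": None}

    def dfs(route, remaining, mileage):
        if mileage >= best["mileage"]:      # prune: partial cost already too high
            return
        if not remaining:                   # leaf: a complete route candidate
            if packing_ok(route):           # B-series constraints checked here
                best["mileage"], best["route"] = mileage, route
            return
        for s in list(remaining):
            step = dist(route[-1], s) if route else 0.0
            if tw_ok(route + [s]):          # prune on A1/A3-style time windows
                dfs(route + [s], remaining - {s}, mileage + step)

    dfs([], set(stops), 0.0)
    return best

# Toy usage with stub checks that accept everything.
dist = lambda a, b: abs(hash(a) - hash(b)) % 10 + 1
print(search_routes(["s1", "s2", "w1"], dist, lambda r: True, lambda r: True))
```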

Table 2. Constraints

Working TW (A1): for $\forall i \in \{1,2,\cdots,N\}$, $\forall k \in \{0,1,\cdots,n_i\}$:
$y_{i0} = y_{i n_i} = y_0$, $t_{i0} = 0$;
$t_{i(k+1)} = t_{ik} + T(y_{ik}, y_{i(k+1)}) + w_{i(k+1)}$;
$t_{ik} + T(y_{ik}, y_{i(k+1)}) \in TW(y_{i(k+1)})$.

Queue and dock (A2): for $\forall i \in \{1,2,\cdots,N\}$, $\forall k \in \{0,1,\cdots,n_i\}$:
$w_{i(k+1)} = TH(y_{i(k+1)}) + \big(\Psi\big(DC(y_{i(k+1)}),\ \{t_{i'(k'+1)} \mid t_{i'k'} + T(y_{i'k'}, y_{i'(k'+1)}) \le t_{ik} + T(y_{ik}, y_{i(k+1)})\}\big) - (t_{ik} + T(y_{ik}, y_{i(k+1)}))\big)$.

Shipment TW (A3): for $\forall j \in \{0,1,\cdots,N^S\}$:
$\sum_{i=1}^{N} t_{i p_j} x_{ij} \in TP\big(\sum_{i=1}^{N} y_{i p_j} x_{ij}\big)$, $\sum_{i=1}^{N} t_{i d_j} x_{ij} \in TD\big(\sum_{i=1}^{N} y_{i d_j} x_{ij}\big)$;
$p_j < d_j$.

Stacking Rule (B1): for $\forall j \in \{0,1,\cdots,N^S\}$, $\forall \theta \in \{0,1,\cdots,n^s_j\}$:
$\sum_{\theta} l_{j\theta} = N^B_j$, $l_{j\theta} \le L_j$;
for $\forall j, \theta$, and $\gamma \in \{0,1,\cdots,b^p_\theta\}$ with $P_j = 1$:
$0 < u_{j\theta\gamma} \le W^P - W_j$, $0 < v_{j\theta\gamma} \le L^P - L_j$;
$N^B_j = \sum_{\theta}\sum_{\gamma} l_{j\theta\gamma}$, $l_{j\theta\gamma} \le L_j$;
$\forall \gamma \ne \gamma'$ one of the following must be true:
$u_{j\theta\gamma'} \ge u_{j\theta\gamma} + W_j$, $u_{j\theta\gamma} \ge u_{j\theta\gamma'} + W_j$,
$v_{j\theta\gamma'} \ge v_{j\theta\gamma} + L_j$, $v_{j\theta\gamma} \ge v_{j\theta\gamma'} + L_j$.

2-D loading (B2): for $\forall j, j' \in \{0,1,\cdots,N^S\}$, $\forall \theta, \theta' \in \{0,1,\cdots,n^s\}$:
$0 < u_{j\theta} \le \sum_i (x_{ij'} W^V_i x_{ij} - W_j)$, $0 < v_{j\theta} \le \sum_i (x_{ij'} L^V_i x_{ij} - L_j)$;
one of the following must be true:
$\sum_i x_{ij'} u_{j'\theta'} \ge \sum_i x_{ij} (u_{j\theta} + W_j(1-P_j) + P_j W^P)$,
$\sum_i x_{ij} u_{j\theta} \ge \sum_i x_{ij'} (u_{j'\theta'} + W_{j'}(1-P_{j'}) + P_{j'} W^P)$,
$\sum_i x_{ij'} v_{j'\theta'} \ge \sum_i x_{ij} (v_{j\theta} + L_j(1-P_j) + P_j L^P)$,
$\sum_i x_{ij} v_{j\theta} \ge \sum_i x_{ij'} (v_{j'\theta'} + L_{j'}(1-P_{j'}) + P_{j'} L^P)$.

Loading sequence (B3): for all $j \ne j'$ with $d_{j'} < d_j$ and $p_{j'} \le p_j$, one of the following must be true:
$\sum_i x_{ij} u_{j\theta} \ge \sum_i x_{ij'} (u_{j'\theta'} + W_{j'}(1-P_{j'}) + P_{j'} W^P)$,
$\sum_i x_{ij'} v_{j'\theta'} \ge \sum_i x_{ij} (v_{j\theta} + L_j(1-P_j) + P_j L^P)$,
$\sum_i x_{ij} v_{j\theta} \ge \sum_i x_{ij'} (v_{j'\theta'} + L_{j'}(1-P_{j'}) + P_{j'} L^P)$.

Hub shipments: for $\forall j \ne j'$: $0 \le \big[\sum_i t_{i p_{j'}} x_{ij'} - \sum_i t_{i p_j} x_{ij}\big]\, h_{jj'}$.

Shipments must be loaded: for $\forall j \in \{0,1,\cdots,N^S\}$: $\sum_{i=1}^{N} x_{ij} = 1$.

Fig. 2. An overview of the algorithm structure. Initialization and optimization only configure the shipment–truck relations X, from which routes are generated subject to the A1, A3 and B-series constraints; after that, global time window (GTW) feasibility (the A2 constraint) is checked. The post-optimization process configures Y for further improvement.

3.3 Initialization

A good initialization scheme is generally believed to be essential for the optimization algorithm. Herein, we initialize by imitating manual planning: first, we cluster all the suppliers into several disjoint areas by a community detection algorithm [13], and then subgroup all shipments into several subsets according to this supplier-area clustering. Within a shipment subset, we sort the shipments in descending order of the distance between supplier and warehouse. Then we assign trucks to load these shipments one by one, unless some constraint would be violated. To comply with the A2 constraint, whenever a violating truck is found we delete that truck and add all its assigned shipments to a sequence; after all other shipments are assigned, we deal with this sequence using the same algorithm. This process is repeated several times until all trucks satisfy A2. A sketch of the greedy assignment step is given below.
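The sketch below illustrates the greedy assignment under simplifying assumptions: shipments are grouped by a caller-supplied area function (standing in for the community detection of [13]), sorted by distance, and packed onto trucks while a caller-supplied feasibility check passes. It omits the A2 repair loop described above, and all callbacks are illustrative.

```python
# Hedged sketch of the initialization heuristic: group, sort, assign greedily.
def initialize(shipments, area_of, distance, fits):
    groups = {}
    for s in shipments:                          # subgroup shipments by supplier area
        groups.setdefault(area_of(s), []).append(s)

    trucks = []
    for area, group in groups.items():
        group.sort(key=distance, reverse=True)   # farthest shipments first
        for s in group:
            for load in trucks:
                if fits(load + [s]):             # try an existing truck first
                    load.append(s)
                    break
            else:
                trucks.append([s])               # otherwise open a new truck
    return trucks

# Toy usage: two areas, capacity of three shipments per truck.
ships = [("A", 5), ("A", 9), ("B", 4), ("A", 2), ("B", 7)]
plan = initialize(ships, area_of=lambda s: s[0], distance=lambda s: s[1],
                  fits=lambda load: len(load) <= 3)
print(plan)
```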

3.4 Optimization—Tabu Search

We implement a tabu search (TS) algorithm for the optimization. TS can be described as repeatedly choosing the best neighborhood solution and declaring the reverse move tabu. The neighborhood is given as follows: a new solution X′ is generated from a given X by moving a shipment from one truck to another. This operation is costly when there are numerous shipments. To reduce the complexity, the shipments with the same supplier and the same warehouse destination are bundled, and movements are restricted to whole bundles. We denote the bundle movement from X to X′ by (X → X′) (Fig. 3).

Fig. 3. Tabu search

The optimization process fetches the initial solution and then bundles all the shipments. At each TS step, all neighborhood solutions are evaluated asynchronously, and the reverse bundle move is declared tabu. We also keep track of the best solution found so far. The overall algorithm is shown in Algorithm 1.

Algorithm 1 TS Optimization
Require: Initial scheme X0, bundle break threshold B
1: Bundle all shipments
2: Initialize tabu list T
3: Set X = X0, best solution keeper X∗ = X0
4: repeat
5:   Evaluate all non-tabu neighbor solutions {Xk} of X, where (X → Xk) ∉ T
6:   Choose X′ ∈ {Xk} with the least mileage
7:   if mileage of X′ ≤ mileage of X∗ then
8:     X∗ ← X′
9:   end if
10:  T = T ∪ {(X′ → X)}
11:  X ← X′
12: until the given computation resources are exhausted
13: return X∗
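A compact, hedged sketch of the loop of Algorithm 1 follows. A solution is represented as a mapping from shipment bundles to trucks, a move reassigns one bundle, and the reverse move is declared tabu for a fixed tenure; the mileage callback is a stand-in for the 2L-VRPTW solver, and the iteration budget plays the role of the computation-resource limit.

```python
# Hedged sketch of tabu search over bundle-to-truck assignments.
def tabu_search(assignment, trucks, mileage, iterations=200, tenure=20):
    best, best_cost = dict(assignment), mileage(assignment)
    current = dict(assignment)
    tabu = {}                                     # (bundle, target truck) -> expiry
    for it in range(iterations):
        candidates = []
        for bundle, truck in current.items():
            for new_truck in trucks:
                if new_truck == truck or tabu.get((bundle, new_truck), -1) > it:
                    continue                      # skip tabu reverse moves
                neighbour = dict(current)
                neighbour[bundle] = new_truck
                candidates.append((mileage(neighbour), bundle, truck, neighbour))
        if not candidates:
            break
        cost, bundle, old_truck, current = min(candidates, key=lambda c: c[0])
        tabu[(bundle, old_truck)] = it + tenure   # forbid moving the bundle back
        if cost < best_cost:
            best, best_cost = dict(current), cost
    return best, best_cost

# Toy usage: mileage = number of distinct trucks used (encourages merging).
start = {"b1": 0, "b2": 1, "b3": 2}
print(tabu_search(start, trucks=[0, 1, 2],
                  mileage=lambda a: len(set(a.values()))))
```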

3.5 Post-optimization

The post-optimization process targets improvements that are hard to consider in the main optimization process and that bring only a small descent in cost. First, we try to replace larger trucks with smaller ones. Then we try to merge routes so that one truck runs the sub-routes one after another and does not return to its depot after each sub-route, at the cost of occupying more time windows. This is done by a heuristic algorithm in which we merge routes until no mergeable route remains. To determine the sequence of the sub-routes, we define the distance from one sub-route to another as the distance from the first sub-route's last non-yard station to the other sub-route's first station. Thus, the problem can be viewed as a standard Traveling Salesman Problem (TSP), and we solve it by dynamic programming [4]; a small sketch of this step follows.
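The sub-route ordering step can be sketched as a standard Held-Karp dynamic program over subsets, as below; the inter-sub-route distance matrix is an invented toy example, and the starting sub-route is fixed to index 0 for simplicity.

```python
# Hedged sketch of the Held-Karp dynamic program used to order merged sub-routes.
from functools import lru_cache

def order_subroutes(dist):
    n = len(dist)

    @lru_cache(maxsize=None)
    def best(visited, last):          # visited is a bitmask over sub-routes
        if visited == (1 << n) - 1:
            return 0.0                # all sub-routes sequenced
        return min(dist[last][nxt] + best(visited | (1 << nxt), nxt)
                   for nxt in range(n) if not visited & (1 << nxt))

    # Start from sub-route 0; complexity O(2^n * n^2), fine for small n.
    return best(1, 0)

dist = [[0, 4, 9], [4, 0, 3], [9, 3, 0]]
print(order_subroutes(dist))          # minimum added distance to chain them
```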

4 Experiment and Result

We tested on a data set of 311 shipments obtained from the SAIC logistics division, containing 54 bin variants, along with the data of 45 suppliers and 8 plant warehouses. The tests were run on an Intel Core i7 CPU at 2.70 GHz (4 cores). We ran TS for 5000 s and compared TS with (TS-WB) and without (TS-NB) shipment bundling. Table 3 shows the results.

Table 3. Post-optimization results

Original After post-opt Diff


Initialization 1131.80 1099.40 −32.40
TS-WB 503.30 482.50 −20.80
TS-NB 1014.60 1014.60 0.00

Fig. 4. Results of the optimization process. (a) Bundle efficiency: the mileage descent with and without bundling. (b) Robustness across 24 hyperparameter combinations (all with bundling); the x-axis indicates the ranking of each hyperparameter combination in descending order, and the y-axis indicates the total mileage of each combination.

In Table 3, we can see that post-optimization has only a limited effect. This is because the optimization and the post-optimization compete for time-window occupation. However, post-optimization is necessary: when a massive number of shipments are required between two very close stations, it is always better to arrange one truck to carry these shipments back and forth without returning to the truck yard, and such a circumstance is difficult to capture in the main optimization process. The mileage curve with respect to time is shown in Fig. 4(a). We can see that the bundle technique, which leads to larger TS steps, makes it easier to escape local minima and results in better convergence. The robustness experiment in Fig. 4(b) shows that the TS algorithm is very robust relative to the initial mileage scale.
SAIC Logistics Management System (SPRUCE): Our research is supported by SAIC Motor and has been adopted into their auto-parts logistics management system. SAIC Motor Co. Limited is the largest auto maker in China and has its own supply chain system that involves more than 500 suppliers and 4 plants and delivers more than 10000 shipments daily. The algorithms above are utilized by SAIC Motor to build up a parts logistics scheme management system consisting of several modules, illustrated in Fig. 5. The Data Maintenance Module maintains station and truck states and packs parts into bins. The Shipment Management Module accepts new shipments and then transfers them to the Global Optimization Module; this module may also use the Manual Planning Module to handle temporary and very important shipments. The results are eventually processed by the Graph Generator into a routing map, a stowage plan, a time plan and a schedule of truck resources. The whole system can handle 2000 shipments in about 10 to 15 min.

Fig. 5. SAIC Logistic Management System.

5 Conclusion
In this paper, we established a parts logistics optimization model, which is mathematically a 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous Vehicle Routing Problem with Time Windows, and presented algorithms for its systematic solution, using TS accelerated by pruning methods and a shipment bundling technique. This systematic solution has proved efficient on the optimization problem and has been used by SAIC to establish its parts logistics scheme management system. We will concentrate on further accelerating the computation: for instance, for the 2PP, a deep Q-learning method is being explored, and another effort is to incorporate genetic algorithms by utilizing parallel computing.

References
1. Akintoye, A., McIntosh, G., Fitzgerald, E.: A survey of supply chain collaboration
and management in the UK construction industry. Eur. J. Purch. Supply Manag.
6, 159–168 (2000)
2. Anbuudayasankar, S.P., Ganesh, K., Mohapatra, S.: Survey of methodologies for
TSP and VRP. In: Models for Practical Routing Problems in Logistics, pp. 11–42.
Springer International Publishing, Cham (2014)
3. Angelelli, E., Grazia Speranza, M.: The periodic vehicle routing problem with
intermediate facilities. Eur. J. Oper. Res. 137(2), 233–247 (2002)
4. Bellman, R.: Dynamic programming treatment of the travelling salesman problem.
J. ACM (JACM) 1, 61–63 (1962)
5. Cordeau, J.F., Desaulniers, G., Desrosiers, J., Solomon, M.M., Soumis, F.: VRP
with Time Windows (1999)
6. Crevier, B., Cordeau, J.F., Laporte, G.: The multi-depot vehicle routing problem
with inter-depot routes. Eur. J. Oper. Res. 176(2), 756–773 (2007)
7. Taillard, É.D., Laporte, G., Gendreau, M.: Vehicle routeing with multiple use of
vehicles. J. Oper. Res. Soc. 47(8), 1065–1070 (1996)
8. Dantzig, G.B., Ramser, J.H.: The truck dispatching problem. Manag. Sci. 6(1),
80–91 (1959)
9. Gendreau, M., Iori, M., Laporte, G., Martello, S.: A Tabu search heuristic for the
vehicle routing problem with two-dimensional loading constraints. Networks 51(1),
4–18 (2008)

10. Iori, M., Salazar-Gonzalez, J.J., Vigo, D.: An exact approach for the vehicle routing
problem with two-dimensional loading constraints. Transp. Sci. 41, 253–264 (2007)
11. Iori, M., Salazar-González, J.J., Vigo, D.: An exact approach for the vehicle routing
problem with two-dimensional loading constraints. Transp. Sci. 41(2), 253–264
(2007)
12. Laporte, G.: The vehicle routing problem: an overview of exact and approximate
algorithms. Eur. J. Oper. Res. 59(2), 231–247 (1991)
13. Martelot, E.L., Hankin, C.: Fast multi-scale community detection based on local
criteria within a multi-threaded algorithm. Comput. Sci. (2013)
14. Olhager, J., Selldin, E.: Supply chain management survey of Swedish manufactur-
ing firms. Int. J. Prod. Econ. 89(3), 353–361 (2004)
15. Romano, P.: Co-ordination and integration mechanisms to manage logistics pro-
cesses across supply networks. J. Purch. Supply Manag. 9(3), 119–134 (2003)
16. Tan, K., Kannan, V.R., Handfield, R.B., Ghosh, S.: Supply chain management: an
empirical study of its impact on performance. Int. J. Oper. Prod. Manag. 19(10),
1034–1052 (1999)
17. van der Vaart, T., van Donk, D.P.: A critical review of survey-based research in
supply chain integration. Int. J. Prod. Econ. 111(1), 42–55 (2008)
Optimal Air Traffic Flow Management
with Carbon Emissions Considerations

Sadeque Hamdan1,4(B), Oualid Jouini1, Ali Cheaitou2,4, Zied Jemai1,6, Imad Alsyouf2,4, and Maamar Bettayeb3,5
1 Laboratoire Genie Industriel, CentraleSupélec, Université Paris-Saclay, 91192 Gif-sur-Yvette, France
sadeque.hamdan@supelec.fr, {oualid.jouini,zied.jemai}@ecp.fr
2 Industrial Engineering and Engineering Management Department, University of Sharjah, Sharjah, United Arab Emirates
{acheaitou,ialsyouf}@sharjah.ac.ae
3 Electrical and Computer Engineering Department, University of Sharjah, Sharjah, United Arab Emirates
maamar@sharjah.ac.ae
4 Sustainable Engineering Asset Management (SEAM) Research Group, University of Sharjah, Sharjah, United Arab Emirates
5 Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
6 OASIS – ENIT, University of Tunis Elmanar, Tunis, Tunisia

Abstract. Air transportation contributes more than 2% of total global emissions. In this paper, we formulate the air traffic flow management (ATFM)
problem as a bi-objective mixed integer linear programming model that mini-
mizes the carbon dioxide (CO2) emissions and the total delay cost. The model is
solved using a Pareto-based scalarization technique called the weighted com-
prehensive criterion method. A numerical example is used to illustrate the effect
of considering CO2 emissions on the ATFM network. A Pareto front is devel-
oped to illustrate the trade-off between CO2 emissions and the total delay costs.
The results showed that reducing 1 kg of CO2 emissions comes at a delay cost
of 1.22 €. This result can be beneficial for decision makers in determining
penalty values and setting aviation emission policies.

Keywords: Air holding · Ground holding · Air traffic flow management · Environment · Carbon emissions

1 Introduction

Environmental aspects are receiving considerable attention in the transportation sector [1], as transportation is recognized as one of the largest and growing contributors to emissions in industrialized countries [2]. Air transportation contributes more than 2% of total global emissions, and the International Civil Aviation Organization (ICAO) expects that, by 2020, emissions will increase by around 70% compared to 2005 [3]. Since 2012, air transport CO2 emissions have been considered in the European Union

emissions trading system, where all airlines operating in Europe are required to monitor and report their emissions. Within this emissions trading system, each airline is given an annual tradeable emission level for its flights [3]. As a result, in addition to minimizing costs, airlines are concerned about minimizing their emissions.
Fuel burning is the source of transportation emissions. Among the components emitted during fuel burning, carbon dioxide (CO2) has received significant attention in much transportation-related research. Burning one liter of aviation fuel emits around 2.527 kg of CO2 [4]. Fuel burning is related to speed: according to [5], fuel consumption decreases as the speed increases up to a certain speed level, after which it starts to increase with further increases in speed. Speed levels affect the time needed to travel a certain distance and thus have an effect on the network capacities and, consequently, on the delays. Despite the importance of the aircraft speed level and its effect on CO2 emissions, network models for air traffic flow management have so far only used the speed to control the network capacities. This paper links the CO2 emissions with the speed level and proposes a bi-objective mixed integer network model that considers both the total network delays and the CO2 emissions in the air traffic flow management (ATFM) problem. We illustrate the effect of CO2 emissions on the total network delays.
This paper is structured as follows. Section 2 presents the relevant work on air traffic flow management network models. Section 3 provides the bi-objective model formulation. Section 4 illustrates the bi-objective model with a numerical example. Finally, Sect. 5 concludes the paper and provides some suggestions for future work.

2 Literature Review

Several works have addressed different configurations of ATFM models. Bertsimas and Patterson [6] developed a flight-by-flight binary ATFM model that accounts for en-route sector capacities and minimizes the total delay costs. Bertsimas and Patterson [7] proposed a dynamic ATFM model that considers multiple paths. Lulli and Odoni [8] introduced fairness in the delay distribution of an ATFM model. Moreover, Agustín et al. [9] proposed a deterministic ATFM model with multiple cost components in the objective function: cancellation costs, path-changing costs, and ground and air delay costs. They also presented the stochastic version of the model in [10]. Diao and Chen [11] minimized the fuel cost and the delay cost in an ATFM model with abstracted airways instead of airspace sectors. Mukherjee and Hansen [12] developed a stochastic ATFM model with rerouting. Andreatta et al. [13] considered the interaction of hubs in a stochastic ATFM network. Furthermore, Bertsimas and Gupta [14] included slot trading, reversal and overtaking between flights in an ATFM model. Chen et al. [15] proposed a Spark-based optimization technique to solve large-scale integer ATFM models. Hamdan et al. [16] proposed a fair ATFM model with rerouting. Hamdan et al. [17] studied CO2 emissions in ATFM models based on the flight occupancy rate, where the CO2 mainly depends on the number of passengers and the length of the flight. Akgunduz and Kazerooni [18] presented a non-time-segmented ATFM model that minimizes the arrival delays in addition to the fuel consumption cost.

To the best of the authors' knowledge, the main focus in ATFM network models has been the minimization of the total network delay costs, and most ATFM models are single-objective. In addition, CO2 emissions as a function of speed, and their effect on the network delays, have not been studied in the literature. This paper aims to fill this gap by introducing a bi-objective mathematical model that helps in studying the trade-off between CO2 emissions and network delays.

3 Emissions-Delay Network Model

In this section, we present a network optimization model for managing flights that takes into consideration airports' capacities, en-route sector capacities, delays and CO2 emissions. The model proposed in this work considers emissions during the flight cruising stage; due to data unavailability, landing and climbing are assumed to be identical across flights and are ignored in this model. In the upcoming sections, we present the linear approximation of the fuel consumption function used to calculate the CO2 emissions. Then, we provide the model sets, parameters and decision variables. After that, we present the objective functions and the constraints. Finally, we illustrate the solution technique.

3.1 CO2 Emissions Approximation and Calculations


According to the report of Clarke et al. [5], the fuel consumption has a non-linear
relationship with the speed. As the speed increases, the fuel consumption decreases up
to some point after which that fuel consumption starts to increase with the increase in
the speed (see Fig. 1, [18]).

Fig. 1. Fuel consumption (L/h) versus the true airspeed (km/h) (Source [18]).



In the ATFM model, the aircraft speed is not considered as a decision variable; instead, the common decision variable is the time spent in each sector. As a result, the speed is controlled by adjusting the time spent in each sector to travel a known distance. Thus, we plot the relation between the fuel consumption and the inverse of the speed (see Fig. 2), which can be multiplied by the distance to obtain the time.

Fig. 2. Fuel consumption (L/h) versus the inverse of the airspeed (h/km): industrial data and the piecewise linear approximation defined by the points P1, Popt and P2.

The fuel consumption with respect to the inverse of the speed can be approximated using two linear functions. The first linear function connects the first point (P1) and the optimal point (Popt), and the second linear function connects the second point (P2) and the optimal point (Popt). The two functions are defined as follows:

$F_1(t) = F_1 D - S_1 t + S_1 r_1 D$,    (1)

$F_2(t) = F_2 D + S_2 t - S_2 r_2 D$,    (2)

where $F_1(t)$ and $F_2(t)$ are the fuel consumption (in liters, L) as a function of the time needed to travel a distance D, using the first or the second linear function respectively; $F_1$ and $F_2$ are the fuel consumption at P1 and P2 respectively ($F_1 = 25.36225$ L/km, $F_2 = 92.74225$ L/km); D is the distance travelled (km); $S_1$ and $S_2$ are the slopes between P1 and Popt and between P2 and Popt respectively ($S_1 = 14722.2$ L/h, $S_2 = 43228.5$ L/h); t is the time spent (h); and $r_1$ and $r_2$ are the speed inverses at P1 and P2 respectively ($r_1 = 0.0009$ h/km and $r_2 = 0.0027$ h/km).
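For illustration, the short sketch below evaluates the two approximations (1)–(2) for a given sector distance and crossing time and converts the burned fuel to CO2 with the 2.527 kg/L factor; the value used for the optimal speed inverse r_opt is an assumption, since it is not stated explicitly in the text, and the segment choice mirrors the role of the Y variables introduced later in the model.

```python
# Hedged sketch: evaluate approximations (1)-(2) and convert fuel to CO2.
F1_BAR, F2_BAR = 25.36225, 92.74225      # fuel at P1 and P2 (L/km)
S1, S2 = 14722.2, 43228.5                # slopes of the two segments (L/h)
R1, R_OPT, R2 = 0.0009, 0.0018, 0.0027   # speed inverses (h/km); R_OPT assumed
E_CO2 = 2.527                            # kg CO2 per litre of burned fuel

def co2_for_sector(distance_km, time_h):
    r = time_h / distance_km             # realised speed inverse
    if not R1 <= r <= R2:
        raise ValueError("speed outside the approximated range")
    if r <= R_OPT:                       # first segment, Eq. (1)
        fuel = F1_BAR * distance_km - S1 * time_h + S1 * R1 * distance_km
    else:                                # second segment, Eq. (2)
        fuel = F2_BAR * distance_km + S2 * time_h - S2 * R2 * distance_km
    return E_CO2 * fuel

print(round(co2_for_sector(200.0, 0.30), 1))   # 200 km crossed in 18 minutes
```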

3.2 Sets

• F: Set of flights.
• T: Set of time periods.
• K: Set of all airports in the ATFM network.
• $P^f$: Set of all sectors that constitute the path of flight f.
• C: Set of pairs of continuing flights (f, f′), where flights f and f′ are the successor and predecessor flights, respectively (using the same aircraft).
• $T^f_j$: List of possible flight time periods for flight f in resource $j \in P^f \cup K$ (i.e., airport or sector). $T^f_j = \{\underline{T}^f_j, \ldots, \overline{T}^f_j\}$, where $\underline{T}^f_j$ and $\overline{T}^f_j$ are the earliest and the latest possible times for f to enter j.

3.3 Parameters

• $T_D$: The duration of each time period t in hours.
• $C_a$, $C_g$: Delay cost for holding a flight in the air or on the ground for one period t, respectively.
• $E_{CO_2}$: CO2 emission factor in kg per liter of burned fuel.
• $orign_f$, $dest_f$: The departure and arrival airports of flight f, respectively.
• $P^f(j+1)$, $P^f(j-1)$: Subsequent and preceding sectors of the jth sector for flight f, respectively.
• $D_k(t)$, $A_k(t)$: The departure and arrival capacities of airport k at time t, respectively.
• $S_j(t)$: The capacity of sector j at time t.
• $a^f_j$: The arrival time of flight f in resource j as per the published flight schedule.
• $d^f_k$: The departure time of flight f from airport k as per the published flight schedule.
• $s_f$: The minimum turnaround time needed for flight f to take off after the arrival of its predecessor flight f′, in the case of continuing flights.

3.4 Decision Variables

• $w^f_{j,t}$: A binary variable equal to one if flight f arrives at resource j by time t; in other words, if $w^f_{j,t} = 1$ at any time period t, then it equals one for all later periods.
• $R^{f,1}_j$ and $R^{f,2}_j$: Decision variables representing the total time spent in each resource j by flight f using the first and the second linear functions, as described in Sect. 3.1.
• $Y^{f,1}_j$ and $Y^{f,2}_j$: Binary variables used to link $R^{f,1}_j$ and $R^{f,2}_j$ to the first and second linear functions, respectively.

3.5 Bi-objective Mixed Integer Linear Programming Model

$\min\ C = C_1 + C_2$    (3)

where

$C_1 = \sum_{f\in F}\ \sum_{t\in T^f_k,\ k=dest_f} C_a\,(t - a^f_k)\big(w^f_{k,t} - w^f_{k,t-1}\big) \;-\; \sum_{f\in F}\ \sum_{t\in T^f_k,\ k=orign_f} C_a\,(t - d^f_k)\big(w^f_{k,t} - w^f_{k,t-1}\big)$    (3.1)

$C_2 = \sum_{f\in F}\ \sum_{t\in T^f_k,\ k=orign_f} C_g\,(t - d^f_k)\big(w^f_{k,t} - w^f_{k,t-1}\big)$    (3.2)

$\min\ E = E_{CO_2}\Big[\sum_{f\in F}\sum_{j\in P^f}\big(F_1 D^f_j Y^{f,1}_j - T_D S_1 R^{f,1}_j + S_1 r_1 D^f_j Y^{f,1}_j\big) + \sum_{f\in F}\sum_{j\in P^f}\big(F_2 D^f_j Y^{f,2}_j + T_D S_2 R^{f,2}_j - S_2 r_2 D^f_j Y^{f,2}_j\big)\Big]$    (4)

Subject to:

$\sum_{f\in F:\ k=orign_f}\big(w^f_{k,t} - w^f_{k,t-1}\big) \le D_k(t),\quad \forall k\in K,\ t\in T$    (5)

$\sum_{f\in F:\ k=dest_f}\big(w^f_{k,t} - w^f_{k,t-1}\big) \le A_k(t),\quad \forall k\in K,\ t\in T$    (6)

$\sum_{f\in F:\ j\in P^f,\ j'=P^f(j+1)}\big(w^f_{j,t} - w^f_{j',t}\big) \le S_j(t),\quad \forall j\in P^f,\ t\in T$    (7)

$w^f_{orign_f,\overline{T}^f_{orign_f}} = w^f_{dest_f,\overline{T}^f_{dest_f}},\quad \forall f\in F$    (8)

$w^f_{j,t-1} - w^f_{j,t} \le 0,\quad \forall f\in F,\ j\in K\cup P^f,\ t\in T^f_j$    (9)

$w^f_{j,t} - w^f_{j',t-l_{jj'}} \le 0,\quad \forall f\in F,\ t\in T^f_j,\ j\in P^f\cup dest_f:\ j\ne orign_f,\ j'=P^f(j-1)$    (10)

$w^f_{orign_f,t} - w^{f'}_{k,t-s_f} \le 0,\quad \forall (f,f')\in C,\ t\in T^f_{orign_f}$    (11)

$\sum_{t\in T}\ \sum_{f\in F:\ j\in P^f,\ j'=P^f(j+1)}\big(w^f_{j,t} - w^f_{j',t}\big) = R^{f,1}_j + R^{f,2}_j,\quad \forall j\in P^f$    (12)

$Y^{f,1}_j + Y^{f,2}_j = 1,\quad \forall f\in F,\ j\in P^f$    (13)

$D^f_j\, r_1\, Y^{f,1}_j \le T_D R^{f,1}_j \le D^f_j\, r_{opt}\, Y^{f,1}_j,\quad \forall f\in F,\ j\in P^f$    (14)

$D^f_j\, r_{opt}\, Y^{f,2}_j \le T_D R^{f,2}_j \le D^f_j\, r_2\, Y^{f,2}_j,\quad \forall f\in F,\ j\in P^f$    (15)

$w^f_{j,t} \in \{0,1\},\quad \forall f\in F,\ j\in K\cup P^f,\ t\in T$    (16)

$Y^{f,1}_j, Y^{f,2}_j \in \{0,1\};\quad R^{f,1}_j, R^{f,2}_j \ge 0,\quad \forall f\in F,\ j\in P^f$    (17)

The objective function (3) minimizes the total network delays, which consist of the air delays in Eq. (3.1), from which the ground delays are subtracted, and the total ground delays in Eq. (3.2). The objective function (4) minimizes the total CO2 emissions as a function of the time needed to travel through each sector. Constraints (5)–(7) are the network capacity constraints for the airport departures, airport arrivals and airspace sectors, respectively; they ensure that at each period the number of flights does not exceed the resource capacity. Constraint (8) ensures that if a flight departs, it arrives within its allowable time. Constraints (9) and (10) are the time and path connectivity constraints. Constraint (11) ensures that at least the minimum turnaround time between any two connected flights is respected. Constraint (12) defines the actual time spent in each sector by each flight and assigns it to either $R^{f,1}_j$ or $R^{f,2}_j$, which is controlled by Constraints (13)–(15). Constraint (13) ensures that either $Y^{f,1}_j$ or $Y^{f,2}_j$ is selected. Constraint (14) links $Y^{f,1}_j$ with $R^{f,1}_j$ and the first linear function. Constraint (15) links $Y^{f,2}_j$ with $R^{f,2}_j$ and the second linear function. Constraints (16) and (17) ensure that $w^f_{j,t}$, $Y^{f,1}_j$ and $Y^{f,2}_j$ are binary variables and that $R^{f,1}_j$ and $R^{f,2}_j$ are nonnegative variables.
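To make the segment-linking constraints concrete, the following hedged sketch writes constraints (13)–(15) for a single flight and sector with a generic MILP library (PuLP/CBC); the distance, the period length and the value of r_opt used here are illustrative assumptions, and the dummy objective simply asks for the fastest feasible crossing.

```python
# Hedged sketch: constraints (13)-(15) for one flight f and one sector j.
import pulp

D, TD = 200.0, 0.05                      # sector distance (km), period length (h)
r1, r_opt, r2 = 0.0009, 0.0018, 0.0027   # r_opt assumed for illustration

m = pulp.LpProblem("speed_segment_link", pulp.LpMinimize)
Y1 = pulp.LpVariable("Y1", cat="Binary")
Y2 = pulp.LpVariable("Y2", cat="Binary")
R1 = pulp.LpVariable("R1", lowBound=0)   # periods spent if segment 1 is used
R2 = pulp.LpVariable("R2", lowBound=0)   # periods spent if segment 2 is used

m += Y1 + Y2 == 1                        # constraint (13): pick one segment
m += D * r1 * Y1 <= TD * R1              # constraint (14), lower bound
m += TD * R1 <= D * r_opt * Y1           # constraint (14), upper bound
m += D * r_opt * Y2 <= TD * R2           # constraint (15), lower bound
m += TD * R2 <= D * r2 * Y2              # constraint (15), upper bound
m += R1 + R2                             # dummy objective: fastest crossing
m.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(Y1), pulp.value(R1), pulp.value(Y2), pulp.value(R2))
```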

3.6 Solution Approach


To facilitate developing Pareto front, the weight comprehensive criterion method
(WCCM) is used. The WCCM is one of the scalarization techniques used to solve
multiple objective functions [19]. The WCCM combines all the objective functions
Copt : is the optimal total delay cost when the model is solved considering only objective
function (3) into one single objective that minimizes the total variation from the
optimal values of each objective function according to the following Equation:
   
C  Copt E  Eopt
min V ¼ a þb ; ð18Þ
Copt Eopt
Optimal Air Traffic Flow Management with Carbon Emissions 1085

where C and E correspond to the total delay cost and the total CO2 emissions objective
functions defined by Eqs. (3) and (4) respectively, Copt : is the optimal total delay cost
when the model is solved considering only Eq. (3). Eopt : is the optimal total CO2
emissions obtained by solving the model using the objective function defined in
Eq. (4). The parameters a and b are the total cost and the total CO2 emissions
importance values defined by decision-makers, where a þ b ¼ 1.Varying the values of
the importance weights provides Pareto solutions.
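The sweep over the importance weights can be sketched as below; solve_model is a placeholder for solving the full MILP of Sect. 3.5 with the combined objective (18), and the toy stand-in used for the demonstration merely mimics a linear trade-off between the two extreme solutions rather than the real model.

```python
# Hedged sketch of the WCCM sweep of Eq. (18) to trace a Pareto front.
def wccm_pareto(solve_model, c_opt, e_opt, step=0.1):
    front = []
    a = 0.0
    while a <= 1.0 + 1e-9:
        b = 1.0 - a
        # solve: min a*(C - c_opt)/c_opt + b*(E - e_opt)/e_opt
        c_val, e_val = solve_model(a, b, c_opt, e_opt)
        front.append((round(a, 3), c_val, e_val))
        a += step
    return front

# Toy stand-in: pretend the trade-off is linear between the two extreme points.
toy = lambda a, b, c_opt, e_opt: (c_opt + b * 166350, e_opt + a * 135974)
for point in wccm_pareto(toy, c_opt=24840, e_opt=12715558, step=0.25):
    print(point)
```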

4 Delay and CO2 Emissions Trade-Off

In this section, we study the trade-off between minimizing the total delay cost and minimizing the total CO2 emissions by presenting all the non-dominated Pareto optimal solutions. To achieve this goal, we assume a grid of twenty-five airspace sectors and fifteen airports (see Fig. 3). Two hundred flights are assigned to the airports randomly, and their paths include all the sectors crossed by the straight line connecting their departure and arrival airports. The departing time of each flight is set randomly, the planning horizon is three and a half hours, and each period accounts for three minutes ($T_D = 3$ min), resulting in T = 70 periods. In this example, the capacities of the airspace sectors, airport departures and arrivals are assumed to be nine flights per period. All flights are assumed to be performed using the same aircraft model owing to data availability; thus, the fuel consumption approximation functions described in Sect. 3.1 are used in this example. The air delay ($C_a$) and ground delay ($C_g$) costs are 348 and 270 €/period, respectively. The CO2 emissions factor ($E_{CO_2}$) is 2.527 kg/L.
Solving the model for the total delay cost function in (3) only, subject to the constraints (5)–(17), results in an optimal total delay cost ($C_{opt}$) of 24840 €. This solution produces a total of 12851533 kg of CO2. Then, solving the model for the total CO2 emissions only (Eq. (4)), subject to constraints (5)–(17), results in an optimal amount of CO2 emissions ($E_{opt}$) of 12715558 kg, but with a total delay cost of 191190 €. The differences between the two extreme cases are 166350 € and 135974 kg.
A Pareto front (see Fig. 4) was developed using $C_{opt}$ and $E_{opt}$ while solving the model using Eq. (18) subject to constraints (5)–(17). The value of a was varied from 0 to 1 with an increment of 0.001, and the value of b was calculated as 1 − a. As can be noticed from this Pareto set, on average, reducing 135974 kg of the emitted CO2 comes at a delay cost of 166350 € (reducing 1 kg of CO2 costs 1.22 € of delay).

Fig. 3. The network used in the illustrative example. The twenty-five square grids represent the airspace sectors. The fifteen circles represent the locations of the fifteen airports.

Fig. 4. Non-dominated Pareto optimal solutions: total CO2 emissions [kg] versus total delay cost [€].

5 Conclusions

Despite their increasing importance, existing ATFM models that account for CO2 emissions are limited. This paper presents two linear approximation functions for the fuel consumption–speed relation and includes them in an ATFM model under a bi-objective configuration. The bi-objective model was solved using the WCCM technique. A numerical example was developed to study the trade-off between the network delays and the CO2 emissions. The results showed that an average reduction of 1 kg in the emitted CO2 is equivalent to an increase in the delay costs of 1.22 €. This trade-off can be further utilized in developing emission reduction policies and strategies that take the network delays into consideration. As future work, with additional datasets, a better approximation can be found for the function relating the fuel consumption to the speed. Climbing and landing emission functions can be developed if more data become available, and the different landing techniques can be studied and compared.

Acknowledgement. The authors would like to thank Prof. Ali Akgunduz, Professor of
Mechanical, Industrial and Aerospace Engineering at Concordia University – Canada, for pro-
viding the fuel consumption data used in this study.
This work was supported by the University of Sharjah [grant number 1702040585].

References
1. Dekker, R., Bloemhof, J., Mallidis, I.: Operations Research for green logistics - an overview
of aspects, issues, contributions and challenges. Eur. J. Oper. Res. 219, 671–679 (2012)
2. Gössling, S., Broderick, J., Upham, P., Ceron, J.P., Dubois, G., Peeters, P., Strasdas, W.:
Voluntary carbon offsetting schemes for aviation: efficiency, credibility and sustainable
tourism. J. Sustain. Tour. 15, 223–248 (2007)
3. European Commission: Reducing emissions from aviation. https://ec.europa.eu/clima/
policies/transport/aviation_en. Accessed 4 Feb 2019
4. Hayward, J.A., O’Connell, D.A., Raison, R.J., Warden, A.C., O’Connor, M.H., Murphy, H.
T., Booth, T.H., Braid, A.L., Crawford, D.F., Herr, A., Jovanovic, T., Poole, M.L.,
Prestwidge, D., Raisbeck-Brown, N., Rye, L.: The economics of producing sustainable
aviation fuel: a regional case study in Queensland Australia. GCB Bioenergy 7, 497–511
(2015)
5. Clarke, J.-P., Lowther, M., Ren, L., Singhose, W., Solak, S., Vela, A., Wong, L.: En route
traffic optimization to reduce environmental impact. http://web.mit.edu/aeroastro/partner/
reports/proj5/proj5-enrouteoptimiz.pdf (2008)
6. Bertsimas, D., Patterson, S.S.: The air traffic flow management problem with enroute
capacities. Oper. Res. 46, 406–422 (1998)
7. Bertsimas, D., Patterson, S.S.: The traffic flow management rerouting problem in air traffic
control: a dynamic network flow approach. Transp. Sci. 34, 239–255 (2000)
8. Lulli, G., Odoni, A.: The European air traffic flow management problem. Transp. Sci. 41,
431–443 (2007)
9. Agustín, A., Alonso-Ayuso, A., Escudero, L.F., Pizarro, C.: On air traffic flow management
with rerouting. Part I: deterministic case. Eur. J. Oper. Res. 219, 156–166 (2012)
10. Agustín, A., Alonso-Ayuso, A., Escudero, L.F., Pizarro, C.: On air traffic flow management
with rerouting. Part II: stochastic case. Eur. J. Oper. Res. 219, 167–177 (2012)
11. Diao, X., Chen, C.H.: A sequence model for air traffic flow management rerouting problem.
Transp. Res. Part E Logist. Transp. Rev. 110, 15–30 (2018)
12. Mukherjee, A., Hansen, M.: A dynamic rerouting model for air traffic flow management.
Transp. Res. Part B Methodol. 43, 159–171 (2009)
13. Andreatta, G., Dell’olmo, P., Lulli, G.: An aggregate stochastic programming model for air
traffic flow management. Eur. J. Oper. Res. 215, 697–704 (2011)
14. Bertsimas, D., Gupta, S.: Fairness and collaboration in network air traffic flow management:
an optimization approach. Transp. Sci. 50, 57–76 (2015)
15. Chen, J., Cao, Y., Sun, D.: Modeling, optimization, and operation of large-scale air traffic
flow management on spark. J. Aerosp. Inf. Syst. 14, 504–516 (2017)
16. Hamdan, S., Cheaitou, A., Jouini, O., Jemai, Z., Alsyouf, I., Bettayeb, M.: On fairness in the
network air traffic flow management with rerouting. In: 2018 9th International Conference on
Mechanical and Aerospace Engineering (ICMAE), pp. 100–105. IEEE, Budapest, Hungary
(2018)

17. Hamdan, S., Cheaitou, A., Jouini, O., Jemai, Z., Alsyouf, I., Bettayeb, M.: An environmental
air traffic flow management model. In: 2019 8th International Conference on Modeling,
Simulation, and Applied Optimization (ICMSAO). IEEE, Bahrain (2019)
18. Akgunduz, A., Kazerooni, H.: A non-time segmented modeling for air-traffic flow
management problem with speed dependent fuel consumption formulation. Comput. Ind.
Eng. 122, 181–188 (2018)
19. Hamdan, S., Larbi, R., Cheaitou, A., Alsyouf, I.: Green Traveling purchaser problem model:
a bi-objective optimization approach. In: 2017 7th International Conference on Modeling,
Simulation, and Applied Optimization, ICMSAO 2017. IEEE, United Arab Emirates (2017)
Scheduling Three Identical Parallel
Machines with Capacity Constraints

Jian Sun1, Dachuan Xu1, Ran Ma2, and Xiaoyan Zhang3(B)
1 Department of Information and Operations Research, College of Applied Mathematics, Beijing University of Technology, Beijing 100124, People's Republic of China
B201806011@emails.bjut.edu.cn, xudc@bjut.edu.cn
2 School of Mathematics and Information Science, Henan Polytechnic University, Jiaozuo 454000, People's Republic of China
sungirlmr@hpu.edu.cn
3 School of Mathematical Science & Institute of Mathematics, Nanjing Normal University, Jiangsu 210023, People's Republic of China
zhangxiaoyan@njnu.edu.cn

Abstract. In many flexible manufacturing systems, it is quite important to balance the number of jobs allocated to each single production facility. Yang, Ye and Zhang (2003) considered the problem of scheduling n jobs on two identical parallel machines, with a capacity constraint on each machine (i.e., the number of jobs that each machine can process is bounded), so as to minimize the total weighted completion time of these jobs, and approached it by semidefinite programming relaxation. In this paper, we further consider the problem of three identical parallel machines with capacity constraints and present a 1.4446-approximation algorithm based on complex semidefinite programming relaxation by extending the previous techniques.

Keywords: Approximation algorithm · Parallel machine scheduling with capacity constraints · Complex semidefinite programming

1 Introduction
In the unrelated parallel machine scheduling problem, we are given a set of n
jobs J = {1, 2, · · · , n} and m parallel machines. Each job Jj has a positive pro-
cessing time pij depending on machine i and must be processed for the respective
amount of time on one of the m machines, and may be assigned to any of them.
Every machine can process at most one job at a time. The completion time of job
$J_j$ in a schedule is denoted by $C_j$. We aim to minimize the total weighted completion time $\sum_{j\in J} w_j C_j$, where $w_j$ denotes a given nonnegative integral weight of job $J_j$, which is a measure of its importance.
(Supported by the Natural Science Foundation of China (Grant Nos. 11531014, 11871081, 11871280, 11471003, 11501171) and the Qinglan Project.) For the sake of convenience,


we denote this problem as Rm|ri,j | wj Cj where ri,j is often called as the job
j’s arrival/release time on machine i. The special case of this problem, identi-
 i.e. pij = pj holds for each job j and all machines i is
cal parallel scheduling,
denoted by P m|ri,j | wj Cj . When machines are identical, uniformly related,
or a special case of unrelated machines, PTASes are known [1,4,11].
For the case of m ≥ 1 and ri,j = 0, Skutella [9] presented a 32 -approximation
algorithm in 1998. Recently, for a small constant  > 0, Bansal et al. [3] gave a
( 32 − )-approximation algorithm improving upon the natural barrier of 32 which
follows from independent randomized rounding. In simplified terms, their result
was obtained by an enhancement of independent randomized rounding via strong
negative correlation properties. In 2017, Kalaitzis et al. [7] took a different app-
roach and proposed to use the same elegant rounding scheme for the weighted
completion time objective as devised by Shmoys and Tardos [8] for optimizing
a linear function subject to makespan constraints. Their main result is a 1.21-
approximation algorithm for the natural special case where the weight of a job
is proportional to its processing time (specifically, all jobs have the same Smith
ratio), which expresses the notion
 that each unit of work has the same weight.
For the problem Rm|ri,j | wj Cj , Skutella gave a 2-approximation algorithm
in 2001 [10]. It has been a long standing open problem if one can improve upon
this 2-approximation. Im and Li answered this question in the affirmative by
giving a 1.8786-approximation [6].
Most of the parallel machine scheduling models assume that each machine
has no capacity constraints, which means every machine can process an arbi-
trary number of jobs. In general, it is quite important to balance the number of
jobs allocated to each single production facility in many flexible manufacturing
systems, for example, in VLSI chip production.
For the case of two identical parallel machines with capacity constraint, Yang,
Ye and Zhang [12] presented a 1.1626-approximation algorithm which has the
first non-trivial ratio to approximate this problem (m = 2) by semidefinite pro-
gramming relaxation. In this paper, we extend the techniques in [9,12] to com-
plex semidefinite programming and approximate the problem when m = 3 with
performance ratio of 1.4446.
The rest of the paper is organized as follows.
 In Sect. 2 we introduce the
approximation preserving reduction of P 3|q| wj Cj to Max-(q; q; n-2q) 3-Cut
and present the translated guarantee. Then, in Sect. 3 we develop a CSDP-based
approximation algorithm for Max-(q; q; n-2q) 3-Cut and present our main results.

2 Preliminaries

In this section, we will show how an approximation algorithm for the Max-(q; q; n − 2q) 3-Cut problem translates into an approximation algorithm for $P3|q|\sum w_j C_j$. This translation was first given by Skutella [9].
We denote the third roots of unity by 1, ω and ω². Let G = (V, E) be an undirected graph with vertex set V = {1, 2, · · · , n} and non-negative weights $w_{ij} = w_{ji}$ on the edges (i, j) ∈ E. The Max-(q; q; n−2q) 3-Cut problem is to find

a partition S = {S1, S2, S3} of G maximizing the total weight of cut edges, subject to the constraints $n - 2q \le |S_i| \le q$ for i = 1, 2, 3, where n = |V|, $n/3 \le q \le n$ and $S_k = \{i : y_i = \omega^{k-1}\}$, k = 1, 2, 3. When $q = n/3$, it is just the Max 3-Section problem.
In the Max-(q; q; n−2q) 3-Cut problem, we require a cut in which the cardinality of each part of the partition is not greater than q. We may suppose that $|S_1| = x$, $|S_2| = y$ and then $|S_3| = n - x - y$. Thus $\big|\sum_j y_j\big| = |x + y\omega + (n - x - y)\omega^2| = \big|\tfrac{3x-n}{2} + \tfrac{\sqrt{3}}{2}i\,(x + 2y - n)\big| \le 3q - n$.
Then Max-(q; q; n−2q) 3-Cut can be relaxed as follows (Mq3C):

$w^* := \max\ \tfrac{2}{3}\sum_{i<j} w_{ij}\big(1 - \mathrm{Re}(y_i \cdot \bar{y}_j)\big)$
subject to:
$\big|\sum_j y_j\big| \le 3q - n$,
$y_j \in \{1, \omega, \omega^2\}$, j = 1, 2, · · · , n.
The underlying intuition for the reduction from P 3|q| wj C  j to Max-(q; q; n −
2q) 3-Cut is as following. A solution for any instance of P 3|q| wj Cj can be seen
as a two-phases schedule: first assigning the jobs to one of the three machines;
then sequencing the jobs on each machine. It has been proved that once the
jobs are assigned they must be sequenced in the non-descending order of pj /wj .
We say i ≺ j if i = j and pi /wi ≤ pj /wj . Therefore, if i ≺ j, and i and j are
assigned to thesame machine, then i should always be processed earlier than
j. Then P 3|q| wj Cj is simply a partition of the n jobs. Now we consider the
jobs assigned to one of the machines, say jobs 1, 2, · · · , s. The total weighted
completion time of the s jobs is

s 
s 
wj Cj = wj pj + wj pi . (1)
j=1 j=1 i≺j

If the graph is defined on n nodes which correspond to the n jobs, then the assignment of the n jobs can be seen as dividing the vertex set into three subsets. Furthermore, the total weighted completion time Σ_{j=1}^{n} wj Cj will be the total weight of edges within each of the three sub-graphs plus Σ_{j=1}^{n} wj pj. Then, roughly speaking, minimizing Σ_{j=1}^{n} wj Cj is equivalent to minimizing the total weight of edges within each of the three sub-graphs, or to maximizing the total weight of edges across the three sub-graphs.
We associate each instance of P3|q|Σ wj Cj with a complete undirected graph G = (V, E) on the vertex set V = {1, 2, · · · , n} that corresponds to the job set {J1, J2, · · · , Jn}, and the weight wij of the edge (i, j) ∈ E is given by

wij = min{wi pj, wj pi}.   (2)

Each partition S = {S1, S2, S3} of the vertex set V can be interpreted as a feasible schedule of the n jobs on three machines. Note that |Si| ≤ q (i = 1, 2, 3)
corresponds to the capacity constraint that at most q jobs can be processed on each machine. Moreover, the value of a feasible schedule can be represented by the total weight of the edges with both endpoints in the same vertex subset plus the constant term Σ_{j=1}^{n} wj pj. In particular, we get

wTO + Σ_{j=1}^{n} wj pj = Σ_{j=1}^{n} wj Cj + w(S1, S2, S3),   (3)

where Cj is the completion time of job Jj in the schedule corresponding to the partition S = {S1, S2, S3}; wTO = Σ_{i<j} wij denotes the total weight of all the edges of G; and w(S1, S2, S3) denotes the cut value of the partition S = {S1, S2, S3}.
It is worth noting that for any given instance of P3|q|Σ wj Cj, wTO + Σ_{j=1}^{n} wj pj is a constant. Therefore, by equality (3), minimizing the total weighted completion time Σ_{j=1}^{n} wj Cj is equivalent to maximizing the cut value w(S1, S2, S3). Let the minimum value of P3|q|Σ wj Cj be Z* and the maximum value of the corresponding Max-(q; q; n−2q) 3-Cut be w*. By generalizing the technique, we have the following result of this section.

Lemma 1. For any ρ(k) ≤ 1, a ρ(k)-approximation algorithm for Max-(q; q; n−2q) 3-Cut can be translated to an algorithm for P3|q|Σ wj Cj with a performance guarantee of 1 + (1 − ρ(k))/(2 − k).
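To make the reduction concrete, the following sketch builds the edge weights of Eq. (2) for a hypothetical random instance and numerically checks identity (3) for a balanced three-way partition, assuming each machine sequences its jobs in non-descending order of pj/wj; all names and data here are illustrative and are not taken from the paper.

```python
import itertools
import random

# Hypothetical small instance: processing times p[j] and weights w[j] of n jobs.
n = 9
random.seed(0)
p = [random.randint(1, 10) for _ in range(n)]
w = [random.randint(1, 10) for _ in range(n)]

# Edge weights of the complete graph, Eq. (2): w_ij = min(w_i * p_j, w_j * p_i).
W = {(i, j): min(w[i] * p[j], w[j] * p[i])
     for i, j in itertools.combinations(range(n), 2)}

def schedule_value(parts):
    """Total weighted completion time when each part is sequenced by p_j / w_j."""
    total = 0
    for part in parts:
        elapsed = 0
        for j in sorted(part, key=lambda j: p[j] / w[j]):
            elapsed += p[j]
            total += w[j] * elapsed
    return total

def cut_value(parts):
    """Total weight of edges whose endpoints lie in different parts."""
    label = {j: k for k, part in enumerate(parts) for j in part}
    return sum(wij for (i, j), wij in W.items() if label[i] != label[j])

# A balanced partition with q = n/3 jobs per machine.
parts = [set(range(0, 3)), set(range(3, 6)), set(range(6, 9))]
w_total = sum(W.values())
lhs = w_total + sum(w[j] * p[j] for j in range(n))
rhs = schedule_value(parts) + cut_value(parts)
print(lhs, rhs)  # identity (3): both sides coincide
```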

3 The Scheduling Algorithm and Our Main Result


In this section, we will present the scheduling algorithm (Algorithm 1) for the P3|q|Σ wj Cj problem and show our main results. The complex semidefinite programming relaxation of (Mq3C) is:

wCSDP := max (2/3) Σ_{i<j} wij (1 − Re(yi · yj))
s.t.   ||yi|| = 1, ∀i ∈ V,   (4)
       Y ⪰ 0,
       A^k_{ij} · Y ≥ −1,  i, j = 1, 2, · · · , n,  k = 0, 1, 2,
       Σ_{i,j} Yij ≤ (3q − n)²,

where Yij = (yi · yj) and A^k_{ij} = ω^k e_i e_j^T + ω^{−k} e_j e_i^T.
In Algorithm 1, without loss of generality, we may suppose that |S1| ≥ |S2| ≥ |S3|; I denotes the identity matrix. The algorithm uses the same rounding technique which refines the one in [5]. Now, we construct the function rebalance(S). The detailed size-adjustment operation of the function rebalance(S) is as follows:
Initialize Ŝl = Sl (l = 1, 2, 3) and denote the final partition by S̃ = {S̃1, S̃2, S̃3}.
1. If |S1| ≥ |S2| ≥ q, then iteratively perform the following operations (i)–(ii) until |Ŝl| = q for each l = 1, 2:

Algorithm 1. Algorithm Scheduling Partition

Input: A set of n jobs J, and processing time set P.
Output: A three-partition of J.
1: Translate the scheduling problem to the Max-(q; q; n − 2q) 3-Cut problem;
2: Solve the CSDP problem and let Y be the solution matrix;
3: Fix a value θ with 0 ≤ θ < 1;
4: Let Ŷ = θY + (1 − θ)I;
5: S̃ = ∅;
6: Generate a vector ξ ∼ N(0, Ŷ);
7: Compute Arg(ξi), i = 1, 2, · · · , n;
8: Assign yi for i = 1, 2, · · · , n to get a vector y ∈ {1, ω, ω²}^n identifying a cut S = {S1, S2, S3};
9: If |S1| ≤ q and |S2| ≤ q /* the cut is feasible for Mq3C */ then let S̃ = S, else let S̃ = rebalance(S);
10: Return S̃;
11: Assign the jobs to three machines according to the partition.


(i) Sort the vertices in Ŝl such that δ(i1) ≥ · · · ≥ δ(i_{|Ŝl|}), where δ(i) = Σ_{j∉Ŝl} wij for i ∈ Ŝl.
(ii) Move the point i_{|Ŝl|} from Ŝl to Ŝ3, namely Ŝl = Ŝl \ {i_{|Ŝl|}} and Ŝ3 = Ŝ3 ∪ {i_{|Ŝl|}}.
2. If |S1| ≥ q ≥ |S2|, then iteratively perform the following operations (i)–(ii) until |Ŝ1| = q and |Ŝl| ≤ q for each l = 2, 3:
(i) Sort the vertices in Ŝ1 such that δ(i1) ≥ · · · ≥ δ(i_{|Ŝ1|}), where δ(i) = Σ_{j∉Ŝ1} wij for i ∈ Ŝ1.
(ii) Move the point i_{|Ŝ1|} from Ŝ1 to Ŝl, namely Ŝ1 = Ŝ1 \ {i_{|Ŝ1|}} and Ŝl = Ŝl ∪ {i_{|Ŝ1|}}.
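A minimal sketch of the random-projection rounding of lines 4 and 6–8 of Algorithm 1 is given below; sampling the complex Gaussian through an eigen-factorization of Ŷ is our own implementation choice, since the paper does not prescribe one.

```python
import numpy as np

def round_csdp_solution(Y, theta, rng=np.random.default_rng(0)):
    """Round a Hermitian PSD matrix Y to a cut S = {S1, S2, S3} (lines 4, 6-8)."""
    n = Y.shape[0]
    Y_hat = theta * Y + (1 - theta) * np.eye(n)           # line 4
    # Sample a complex Gaussian vector xi ~ N(0, Y_hat).
    vals, vecs = np.linalg.eigh(Y_hat)
    L = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None)))
    g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    xi = L @ g                                             # line 6
    args = np.mod(np.angle(xi), 2 * np.pi)                 # line 7: Arg(xi_i) in [0, 2*pi)
    k = np.floor(3 * args / (2 * np.pi)).astype(int)       # line 8: y_i = omega**k
    return [set(np.flatnonzero(k == c)) for c in range(3)]
```

The rebalance(S) step would then repeatedly move the vertex with the smallest δ(i) out of any oversized part, exactly as described above.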
We are now ready to present our main results.
For the sake of analyzing the solution returned by the algorithm, we define a real function:

f(x) = (9/(8π²)) (arccos²(−x) − arccos²(x/2)),   x ∈ [−1, 1].

For a given 0 ≤ θ ≤ 1, let

α(θ) = min_{−1/2 ≤ x < 1} (1 − f(θx))/(1 − x),
b(θ) = 1 − f(θ),
c(θ) = min_{−1/2 ≤ x < 1} (f(θ) − f(θx))/(1 − x).

Then we let

d(θ) = max{α(θ), (2k/3)(b(θ) + c(θ))},

and for q ∈ [n/3, 2n/3),

β(θ) = b(θ) + ((n² − n)/(3q(2n − 3q))) c(θ).
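The quantities α(θ) and c(θ) (and hence d(θ) and β(θ)) are defined through one-dimensional minimizations with no closed form; the sketch below evaluates them numerically with a simple grid search (the grid and the example values of θ, n, q and k are our own illustrative choices, not prescribed by the paper).

```python
import numpy as np

def f(x):
    # f(x) = 9/(8*pi^2) * (arccos(-x)^2 - arccos(x/2)^2), x in [-1, 1]
    return 9.0 / (8 * np.pi**2) * (np.arccos(-x)**2 - np.arccos(x / 2)**2)

def alpha_b_c(theta, grid=np.linspace(-0.5, 0.999, 10_000)):
    alpha = np.min((1 - f(theta * grid)) / (1 - grid))
    b = 1 - f(theta)
    c = np.min((f(theta) - f(theta * grid)) / (1 - grid))
    return alpha, b, c

# Illustrative evaluation (theta = 0.85, n = 300, q = 100, k = 1.2 are arbitrary examples).
theta, n, q, k = 0.85, 300, 100, 1.2
alpha, b, c = alpha_b_c(theta)
d = max(alpha, 2 * k / 3 * (b + c))
beta = b + (n**2 - n) / (3 * q * (2 * n - 3 * q)) * c
print(alpha, b, c, d, beta)
```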
Then, we have

Lemma 2. For any given θ ∈ [0, 1] and q ∈ [n/3, 2n/3), Algorithm 1 yields a cut S at line 8 satisfying the following inequalities:

E[M] ≥ β(θ) M*,

where M = |S1||S2| + |S1||S3| + |S2||S3| and M* = q(2n − 3q), and

E[w(S)] ≥ d(θ) w*.

We construct the following artificial random variable. For a given γ ≥ 0, define

Z = w(S)/w* + γ M/M*.   (5)

Then we have that

E[Z] ≥ d(θ) + γβ(θ).   (6)
Hence the variable Z is bounded above, so that for any ε > 0, we have that Algorithm Scheduling Partition generates a cut S for which

Z ≥ [d(θ) + γβ(θ)](1 − ε).   (7)

Theorem 1. For any γ > 0, if the random variable Z satisfies inequality (7), then for the corresponding cut S̃ computed by Algorithm 1, we have

w(S̃) ≥ g · w*,   (8)

where

g = γ − γ² / ([d(θ) + γβ(θ)](1 − ε)).   (9)

Proof. When the algorithm finds a cut S satisfying inequality (7), we let xi = |Si|/n for i = 1, 2, 3 and λ = w(S)/w*. Then we have

λ ≥ [d(θ) + γβ(θ)](1 − ε) − 3γ[x1x2 + (x1 + x2)(1 − x1 − x2)].   (10)

There are two cases for the cut S̃: either S̃ = rebalance(S) or S̃ = S. In the first case, it is easy to see that w(S̃) ≥ (1/(9x1x2)) λ w*, whereas in the second case we obviously have w(S̃) ≥ λ w*. Hence w(S̃) ≥ (1/(9x1x2)) λ w*; then, using the inequality for λ, we get

w(S̃) ≥ f · w*,   (11)

where

f = [d(θ) + γβ(θ)](1 − ε)/(9x1x2) − γ (x1x2 + (x1 + x2)(1 − x1 − x2))/(3x1x2).   (12)

In order to simplify the above expression and to remove the dependence on x = (x1, x2), we consider the function f for x1, x2 ≥ 0. It is easy to calculate that f attains its minimum at x = (x1*, x2*), where x1* = x2* = [d(θ) + γβ(θ)](1 − ε)/(3γ), and that the minimum value is γ − γ²/([d(θ) + γβ(θ)](1 − ε)), which is, by definition, the value of the function g.
It is easily seen that the function g is concave and has a maximum at

γ = (d(θ)/β(θ)) (1/√(1 − β(θ)(1 − ε)) − 1).   (13)

For the sake of simplicity of the analysis, we may suppose that ε = 0 in the rest of the proof. In the worst case, i.e. q = n/3, we have the following result:
Lemma 3. Algorithm 1, with probability almost 1, generates a schedule for P3|q = n/3|Σ wj Cj (the most capacitated case) whose performance guarantee is 1.4446.

Proof. Let

γ = (d(θ)/β(θ)) (1/√(1 − β(θ)) − 1);

then we have

ρ(k) ≥ d(θ) / (1 + √(1 − β(θ)))².

Consider the maximal translated guarantee for P3|q = n/3|Σ wj Cj:

max_{k∈[1, 3/2]} 1 + (1 − ρ(k))/(2 − k)
= max_{k∈[1, 3/2]} min{ 1 + [1 − (2k/3)(b + c)/(1 + √(1 − β(θ)))²]/(2 − k),  1 + [1 − α(θ)/(1 + √(1 − β(θ)))²]/(2 − k) }.   (14)

One can easily verify that 1 + [1 − (2k/3)(b + c)/(1 + √(1 − β(θ)))²]/(2 − k) is a decreasing function of k ∈ [1, 3/2], and that 1 + [1 − α(θ)/(1 + √(1 − β(θ)))²]/(2 − k) is an increasing function of k ∈ [1, 3/2]. Then we can see that the maximum value of equality (14) occurs at one of two possible points: k = 3(α(θ) − c(θ))/(2b(θ)) if 1 ≤ 3(α(θ) − c(θ))/(2b(θ)) ≤ 3/2, where the two functions have identical values; or k = 3/2 if 3(α(θ) − c(θ))/(2b(θ)) ≥ 3/2. In this particular case,

k = 3(α(θ) − c(θ))/(2b(θ))

yields a maximal value less than 1.4446, for sufficiently large n.

Since it is easy to verify that the approximation ratio is always better than that of the worst case (q = n/3), which is 1.4446, we have

Theorem 2. Algorithm 1 generates a schedule for P3|q|Σ wj Cj (n/3 ≤ q ≤ n) whose performance guarantee is 1.4446 with probability almost 1.

4 Conclusions
In this paper, we have presented a CSDP-based approximation algorithm for scheduling on three identical parallel machines with a capacity constraint on each machine. It remains open whether this approach could be applied to approximating the more general scheduling problem on m machines with capacity constraints.

References
1. Afrati, F., Bampis, E., Chekuri, C., Karger, D. et al.: Approximation schemes for
minimizing average weighted completion time with release dates. In: Proceedings
of the 40th Annual IEEE Symposium on Foundations of Computer Science, pp.
32–43 (1999)
2. Andersson, G.: An approximation algorithm for max p-section. In: Meinel, C., Tison, S. (eds.) STACS 1999. LNCS, vol. 1563, pp. 237–247. Springer, Heidelberg (1999)
3. Bansal, N., Srinivasan, A., Svensson, O.: Lift-and-round to improve weighted com-
pletion time on unrelated machines. In: Proceedings of the 48th Annual ACM
Symposium on Theory of Computing, pp. 156–167 (2016)
4. Chekuri, C., Khanna, S.: A PTAS for minimizing weighted completion time on
uniformly related machines. In: Proceedings of 28th International Colloquium on
Automata, Languages, and Programming, pp. 848–861. Springer, Berlin (2001)
5. Goemans, M.X., Williamson, D.P.: Approximation algorithms for MAX-3-CUT
and other problems via complex semidefinite programming. J. Comput. Syst. Sci.
68, 442–470 (2004)
6. Im, S., Li, S.: Better unrelated machine scheduling for weighted completion time
via random offsets from non-uniform distributions. In: Proceedings of the 57th
Annual Symposium on Foundations of Computer Science, pp. 138–147 (2016)
7. Kalaitzis, C., Svensson, O., Tarnawski, J.: Unrelated machine scheduling of jobs
with uniform smith ratios. In: Proceedings of the 28th Annual ACM-SIAM Sym-
posium on Discrete Algorithms, pp. 2654–2669 (2017)
8. Shmoys, D.B., Tardos, É.: An approximation algorithm for the generalized assign-
ment problem. Math. Program. 62(1–3), 461–474 (1993)
9. Skutella, M.: Semidefinite relaxations for parallel machine scheduling. In: Proceed-
ings of the 39th Annual IEEE Symposium on Foundations of Computer Science,
pp. 472–481 (1998)
10. Skutella, M.: Convex quadratic and semidefinite programming relaxations in
scheduling. J. ACM 48(2), 206–242 (2001)
11. Skutella, M., Woeginger, G.J.: A PTAS for minimizing the total weighted comple-
tion time on identical parallel machines. Math. Oper. Res. 25(1), 63–75 (2000)
12. Yang, H., Ye, Y., Zhang, J.: An approximation algorithm for scheduling two parallel
machines with capacity constraints. Discret. Appl. Math. 130(3), 449–467 (2003)
Solving the Problem of Coordination
and Control of Multiple UAVs by Using
the Column Generation Method

Duc Manh Nguyen1(B) , Frédéric Dambreville2 , Abdelmalek Toumi2 ,


Jean-Christophe Cexus2 , and Ali Khenchaf2
1
Hanoi National University of Education, 136 Xuan Thuy, Cau Giay,
Hanoi, Vietnam
nguyendm@hnue.edu.vn
2
Lab-STICC UMR CNRS 6285, ENSTA Bretagne, 2 rue Francois Verny,
29806 Brest Cedex 9, France
{dambrefr,toumiab,cexusje,ali.khenchaf}@ensta-bretagne.fr

Abstract. In this paper, we consider the problem of autonomous task


allocation and trajectory planning for a set of UAVs. This is a bi-level
problem: the upper-level is a task assignment problem, subjected to UAV
capability constraints; the lower-level constructs the detailed trajectory
of UAVs, subjected to dynamics, avoidance and dependency constraints.
Although the entire problem can be formulated as a mixed-integer lin-
ear program (MILP), and thus it can be solved by available software, the
computational time increases intensively. For solving more efficiently this
problem we propose an efficient approach based on the column genera-
tion method in which the modified dependency constraint will be added
into the sub-problem. The performance of our approach is evaluated by
comparing with solution given by the CPLEX on different scenarios.

Keywords: Unmanned aerial vehicle · Task allocation ·


Trajectory design · Column generation ·
Mixed integer linear programming

1 Introduction
UAVs (unmanned aerial vehicles) nowadays can be used for various civilian and
military tasks. There has been considerable interest in making these unmanned
vehicles completely autonomous, giving rise to the research area of UAVs. These
are usually seen as rather simple vehicles, acting cooperatively in teams to accom-
plish difficult missions in dynamic, poorly known or hazardous environments
[2–4,9,10,12].
In this paper, we consider a UAV coordination problem that can be described
as follows: we have a fleet of UAVs, a set of waypoints (i.e., missions), and
obstacles (i.e., No-Fly-Zones). The objective is to design the trajectories for

© Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 1097–1108, 2020.
https://doi.org/10.1007/978-3-030-21803-4_108
UAVs visiting the waypoints while avoiding the No-Fly-Zones in order to maxi-
mize the overall performance. There are several constraints in our problem, such
as capability constraints, capacity constraints, dynamics constraints, avoidance
constraints and dependency constraints. In fact, our considered problem orig-
inates from the context presented in the papers [2,9]. The difference is that
instead of just minimizing the completion time, our objective is to maximize
the overall performance. The reason for choosing this objective function is that performing missions is usually the most important concern in military applications [5,6,8,11]; moreover, since the set of UAVs is sometimes smaller than the set of missions, one should choose to perform only the missions with higher priority, not all of them. Actually, this is an extension of our previous work [8]. The problem can be reformulated as a mixed-integer linear program (MILP) as in [2,9] and thus it can be solved by some available solvers. However, the computational time increases dramatically with this approach. Another approach is an approximate method that simplifies the coupling between the assignment and trajectory design problems by calculating and communicating only the key information that connects them [2].
In this work, we investigate an efficient approach based on the column generation method [1,7] for solving this problem. Column generation is nowadays a prominent method to cope with a huge number of variables, and numerous applications based on column generation have been developed [7]. In applications, constraint matrices of (integer) linear programs are typically sparse and well structured. Subsystems of variables and constraints appear in independent groups, linked by a distinct set of constraints and/or variables. We will reformulate this problem in the form of column generation, where the dependency constraint is handled in the Master Problem (MP). Since the effect of the dual variables is not strong enough to drive the sub-problem towards trajectories that lead to feasible solutions, we propose a modified dependency constraint for the sub-problem to prevent it from generating infeasible trajectories.
The rest of the paper is organized as follows. In Sect. 2, we introduce the con-
sidered UAV coordination problem and its formulation. Our column generation
approach for solving this problem is presented in Sect. 3. Numerical experiments
are reported in Sect. 4 while some conclusions and perspectives are discussed in
Sect. 5.

2 Problem Formulation

We report below the problem statement described in [2].


Let there be NV vehicles, NW waypoints (corresponding to missions) and
NZ No-Fly-Zones. A total of NT time steps are used for planning, although the
mission will typically not require all of this time.
The UAVs are modeled as point masses moving in two dimensions with limited speed and turning rate. We denote by vmax,p and ωp the maximum speed and turning rate of vehicle p, respectively. Initial states are also included in a matrix S such that row p is the initial state vector (x0, y0, ẋ0, ẏ0) of vehicle p, forming an NV × 4 matrix.

The waypoints are specified by the NW × 2 matrix W where (Wi1 , Wi2 ) is


the position of the waypoint i.
The No-Fly-Zones are modeled as rectangles and are defined by the NZ × 4
matrix Z where (Zj1 , Zj2 ) is the bottom left vertex of the No-Fly-Zone j and
(Zj3 , Zj4 ) is the top right vertex.
The vehicle capabilities are included in the matrix K, in which Kpi = 1 if
the vehicle p can visit the waypoint i and 0 otherwise. The matrix K has size
NV × NW .
Time dependencies, forcing one waypoint to be visited after another, sepa-
rated by some interval, are included in the matrix D. Each row Dk = (ik , jk ) of
the matrix represents a time dependency of two missions ik and jk , where mis-
sion ik has to be visited after the mission jk plus tDk time units. Thus if there are
ND time dependencies, the size of matrix is ND × 2. The corresponding element
in the vector tD is the interval between the two visits.
The gains are given in the NW × NV matrix G in which gip is the gain of
vehicle p when visiting the waypoint i.
Therefore, in summary, the problem can be completely specified by the vec-
tors and matrices denoted by

vmax , ω, S, W, Z, K, D, G, tD .

Our considered problem can be reformulated as an MILP as follows [2].

2.1 The Model of Aircraft Dynamics

Each aircraft p is modeled as a point mass mp moving in 2-D. Let the position of aircraft p at time-step t be given by (xtp, ytp) and its velocity by (ẋtp, ẏtp), forming the elements of the state vector stp. The aircraft is assumed to be acted upon by control forces (f^x_tp, f^y_tp) in the X and Y directions respectively, forming the force vector ftp.
The maximum speed vmax,p is enforced by an approximation to a circular region in the velocity plane given by Eq. (1):

ẋtp sin(2πh/NC) + ẏtp cos(2πh/NC) ≤ vmax,p,   (1)
∀t = 1, ..., NT, ∀p = 1, ..., NV, ∀h = 1, ..., NC,

where NC is the order of discretization of the circle. The maximum turning rate is enforced by limiting the force magnitude, using another circular-region approximation given by Eq. (2):

f^x_tp sin(2πh/NC) + f^y_tp cos(2πh/NC) ≤ fmax,p,   (2)
∀t = 0, ..., NT − 1, ∀p = 1, ..., NV, ∀h = 1, ..., NC,
where fmax,p is related to the maximum turn rate, for travel at constant speed vmax,p, by

ωp = fmax,p / (mp · vmax,p).   (3)
The discretized dynamics of the overall system, applied to all NV vehicles
up to NT time-steps, can be written in the linear form

s(t+1)p = Astp + Bftp , (4)


∀t = 0, ...,NT − 1, ∀p = 1, ..., NV ,

where A and B are the system dynamics matrices for a unit point mass. In all
cases, the initial conditions are specified from the initial condition matrix S.
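The paper does not spell out A and B; assuming a unit point mass with the force held constant over a time step Δt, one standard discretization (our assumption) is sketched below.

```python
import numpy as np

def point_mass_dynamics(dt=1.0):
    """Discrete-time matrices for state s = (x, y, xdot, ydot), input f = (fx, fy), unit mass."""
    A = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    B = np.array([[dt**2 / 2, 0],
                  [0, dt**2 / 2],
                  [dt, 0],
                  [0, dt]], dtype=float)
    return A, B

# One simulation step of Eq. (4): s_{t+1,p} = A s_{t,p} + B f_{t,p}
A, B = point_mass_dynamics(dt=1.0)
s0 = np.array([0.0, 0.0, 0.1, 0.0])   # row p of the initial state matrix S
f0 = np.array([0.5, 0.0])
s1 = A @ s0 + B @ f0
print(s1)
```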
The constraints for avoiding a rectangular obstacle¹ were developed in [10] and can be written as

xtp − Zj3 ≥ −R cjpt1,    Zj1 − xtp ≥ −R cjpt2,
ytp − Zj4 ≥ −R cjpt3,    Zj2 − ytp ≥ −R cjpt4,   (5)
Σ_{z=1}^{4} cjptz ≤ 3,   ∀t = 1, ..., NT, ∀p = 1, ..., NV, ∀j = 1, ..., NZ,

where the cjptz are binary decision variables and R is a positive number that is much larger than any position to be encountered in the problem. If cjptz = 0, the vehicle p is clear of the No-Fly-Zone j in the direction z (of the four directions +X, −X, +Y, −Y) at time-step t. If cjptz = 1, the constraint is relaxed. The final inequality ensures that no more than three of the constraints are relaxed at any time-step, so the vehicle must be clear of the obstacle in at least one direction.
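For illustration, the following sketch writes constraints (5) for a single vehicle, time step and No-Fly-Zone with the PuLP modeling library; the zone coordinates, the variable bounds and the value of R are placeholder values, not data from the paper.

```python
import pulp

R = 1e4                                   # big-M constant
Z = [10.0, 10.0, 20.0, 25.0]              # example zone: (x_min, y_min, x_max, y_max)

prob = pulp.LpProblem("avoidance_demo", pulp.LpMinimize)
x = pulp.LpVariable("x", lowBound=-100, upBound=100)
y = pulp.LpVariable("y", lowBound=-100, upBound=100)
c = [pulp.LpVariable(f"c{z}", cat="Binary") for z in range(4)]

prob += x + y                              # dummy objective for the demo

# Constraints (5): clear of the zone in at least one of the four directions.
prob += x - Z[2] >= -R * c[0]
prob += Z[0] - x >= -R * c[1]
prob += y - Z[3] >= -R * c[2]
prob += Z[1] - y >= -R * c[3]
prob += pulp.lpSum(c) <= 3

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(x), pulp.value(y), [pulp.value(v) for v in c])
```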

2.2 Assignment and Dependency

The set of constraints which state that a vehicle visits a waypoint is

xtp − Wi1 ≤ R(1 − bipt ), xtp − Wi1 ≥ −R(1 − bipt ),


ytp − Wi2 ≤ R(1 − bipt ), ytp − Wi2 ≥ −R(1 − bipt ), (6)
∀t = 1, ..., NT , ∀p = 1, ..., NV , ∀i = 1, ..., NW ,

where bipt is a binary decision variable, W is the waypoint location matrix, and
R is the same large, positive number used in (5). It can be seen that bipt = 1
implies that vehicle p visits waypoint i at time-step t.
While the logical constraint enforces that each waypoint without depen-
dency is visited at most once by a vehicle with suitable capabilities, the time

1
The generalization to a polygonal obstacle is easy.

dependencies enforce that each waypoint with a dependency must be visited exactly once. We have the following constraints:

Σ_{t=1}^{NT} Σ_{p=1}^{NV} Kpi bipt ≤ 1, ∀i ∈ NW\D,    Σ_{t=1}^{NT} Σ_{p=1}^{NV} Kpi bipt = 1, ∀i ∈ D,

Σ_{t=1}^{NT} Σ_{p=1}^{NV} t·b_{ik,p,t} − Σ_{t=1}^{NT} Σ_{p=1}^{NV} t·b_{jk,p,t} ≥ tDk, ∀k = 1, ..., ND.   (7)

We also consider the resource constraint

Σ_{t=0}^{NT−1} (|f^x_tp| + |f^y_tp|) ≤ Fp, ∀p = 1, ..., NV,   (8)

where Fp is the capacity of the vehicle p.
The flight completion time tp of the vehicle p, which is the time at which it visits its last waypoint, satisfies

tp ≥ Σ_{t=1}^{NT} t·bipt, ∀p = 1, ..., NV, ∀i = 1, ..., NW.   (9)

The objective is to maximize the total gain and also to minimize the implementation time and the resource expense. Here the gain is more important than the others, and the implementation time is more important than the resource expense. Thus, we have the following optimization problem:

max_{s,f,b,c}  Σ_{i=1}^{NW} Σ_{p=1}^{NV} Σ_{t=1}^{NT} gip bipt − ε1 Σ_{p=1}^{NV} ( tp + ε2 Σ_{t=0}^{NT−1} (|f^x_tp| + |f^y_tp|) )   (10)
subject to: the constraints from (1) to (9).

In this model, the parameters ε1, ε2 > 0 are quite small, to represent the relative importance of the gain, the implementation time and the resource expense. This is a linear mixed 0–1 programming problem, which allows us to use available solvers to find the optimal solution, but the computation can be intensive. The next section is devoted to the description of a column generation approach for solving problem (10).

3 A Column Generation Approach

For using column generation, we will reformulate the problem given in Eq. (10)
in the form of column generation, where the dependency constraint is handled in
the Master Problem. The sub-problem is then treated with careful consideration
of the impact of dependency constraint.

3.1 Reformulation of the Problem

The column generation approach is based on the notion of feasible trajectories for the UAVs. A feasible trajectory of a vehicle p is a trajectory starting from its departure point, satisfying all moving constraints and visiting at least one waypoint i. We denote by Ωp the set of all feasible trajectories for the vehicle p, and by Ω = ∪_{p∈NV} Ωp the set of all feasible trajectories.
Let r = (r1, r2, ..., rNT) ∈ Ωp ⊂ Ω be a trajectory, where rt is the state of the vehicle p at time t. The performance of this trajectory, denoted by g(r), is computed as follows:

g(r) = Σ_{i∈NW∩r} gip − ε1 (tpr + ε2 frp).

In this formula, Σ_{i∈NW∩r} gip is the gain of the trajectory r, tpr is the time when the vehicle p visits its last waypoint in the trajectory r, and frp is the total force used along the trajectory r.
We define the visitation parameter ari for each waypoint i ∈ NW by

ari = 1 if r visits waypoint i, and ari = 0 otherwise.

When the trajectory r visits the waypoint i, the time of visitation is denoted
by tr (i). Note that each couple of dependency waypoints is represented as Dk =
(ik , jk ), where waypoint ik must be visited after the waypoint jk plus a tDk time
units. Thus, the problem (10) can be reformulated as


max  Σ_{p=1}^{NV} Σ_{r∈Ωp} g(r)·θr
s.t.  Σ_{p=1}^{NV} Σ_{r∈Ωp} ari θr ≤ 1, ∀i ∈ NW\D,    Σ_{p=1}^{NV} Σ_{r∈Ωp} ari θr = 1, ∀i ∈ D,
      Σ_{r∈Ωp} θr ≤ 1, ∀p = 1, ..., NV,
      Σ_{p=1}^{NV} Σ_{r∈Ωp} a_{r,ik} tr(ik) θr − Σ_{p=1}^{NV} Σ_{r∈Ωp} a_{r,jk} tr(jk) θr ≥ tDk, ∀k = 1, ..., ND,
      θr ∈ {0, 1}, ∀r ∈ Ω.   (11)
The variable θr ∈ {0, 1} is a decision variable which describes whether a trajectory r is chosen or not. Because of the first and second constraints, the condition θr ∈ {0, 1}, ∀r ∈ Ω can be replaced by θr ∈ N, ∀r ∈ Ω. The linear relaxation of problem (11), i.e., with θr ≥ 0, ∀r ∈ Ω, is called the Master Problem (MP).
The methodology of the column generation approach can be described as follows. Let Ωp^1 ⊂ Ωp, p ∈ NV, and Ω^1 = ∪_{p∈NV} Ωp^1; we consider the Restricted Master Problem (RMP), denoted by MP(Ω^1):

max  Σ_{p=1}^{NV} Σ_{r∈Ωp^1} g(r)·θr
s.t.  Σ_{p=1}^{NV} Σ_{r∈Ωp^1} ari θr ≤ 1, ∀i ∈ NW\D,    Σ_{p=1}^{NV} Σ_{r∈Ωp^1} ari θr = 1, ∀i ∈ D,
      Σ_{r∈Ωp^1} θr ≤ 1, ∀p = 1, ..., NV,
      Σ_{p=1}^{NV} Σ_{r∈Ωp^1} a_{r,jk} tr(jk) θr − Σ_{p=1}^{NV} Σ_{r∈Ωp^1} a_{r,ik} tr(ik) θr ≤ −tDk, ∀k = 1, ..., ND,
      θr ≥ 0, ∀r ∈ Ω^1.   (12)
The dual program of (12), denoted by D(Ω^1), is

min  Σ_{i=1}^{NW} λi + Σ_{p=1}^{NV} μp − Σ_{k=1}^{ND} tDk·αk
s.t.  Σ_{i=1}^{NW} ari λi + μp + Σ_{k=1}^{ND} (tr(jk)·a_{r,jk} − tr(ik)·a_{r,ik}) αk ≥ g(r),  ∀r ∈ Ωp^1, p ∈ NV,
      λi ≥ 0, ∀i ∈ NW\D,   λi ∈ R, ∀i ∈ D,
      μp ≥ 0, ∀p = 1, ..., NV,   αk ≥ 0, ∀k = 1, ..., ND.

In this program, λi is the dual variable related to the visitation constraint of


waypoint i, μp is the dual variable related to the implementation constraint of
vehicle p, and αk is the dual variable corresponding to the dependency constraint.
Now we suppose that

(λ̄, μ̄, ᾱ) = (λ̄1, ..., λ̄NW, μ̄1, ..., μ̄NV, ᾱ1, ..., ᾱND)

is an optimal solution of the dual problem D(Ω^1). Then, we have

Σ_{i=1}^{NW} ari λ̄i + μ̄p + Σ_{k=1}^{ND} (tr(jk)·a_{r,jk} − tr(ik)·a_{r,ik}) ᾱk ≥ g(r),  ∀r ∈ Ωp^1, p ∈ NV.

It is clear that if this condition holds for all r ∈ Ωp, p ∈ NV, then (λ̄, μ̄, ᾱ) is also an optimal solution of the dual program of (MP). Otherwise, we look for a trajectory r ∈ Ωp\Ωp^1, for a vehicle p ∈ NV, such that

Σ_{i=1}^{NW} ari λ̄i + μ̄p + Σ_{k=1}^{ND} (tr(jk)·a_{r,jk} − tr(ik)·a_{r,ik}) ᾱk < g(r).   (13)

This is called the sub-problem.


The column generation-based algorithm is summarized as follows.

Column Generation-Based Algorithm

Step 1. Generate initial sets Ωp^1 for each vehicle p = 1, ..., NV.
Step 2. Solve the problem (12) in order to obtain the optimal solution and its dual solution (λ̄, μ̄, ᾱ).
Step 3. For each vehicle p = 1, ..., NV, solve the sub-problem optimally to find a trajectory r ∈ Ωp\Ωp^1, and update Ωp^1 := Ωp^1 ∪ {r}.
Step 4. Iterate Steps 2–3 until there is no trajectory satisfying condition (13).
Step 5. Solve the integer programming formulation of the final RMP to obtain an approximate solution.
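A schematic implementation of this loop is sketched below. The callables solve_rmp and solve_subproblem stand for the LP (12) and the MILP sub-problem of Sect. 3.2; they are placeholders of our own, not code from the paper.

```python
def column_generation(initial_columns, solve_rmp, solve_subproblem, n_vehicles, max_iter=100):
    """Generic column generation loop (Steps 1-5).

    initial_columns: dict p -> list of trajectories (the sets Omega_p^1).
    solve_rmp: callable returning (primal solution, dual values) of the RMP (12).
    solve_subproblem: callable (p, duals) -> (trajectory, reduced_cost) for vehicle p.
    """
    columns = {p: list(initial_columns[p]) for p in range(n_vehicles)}   # Step 1
    primal = None
    for _ in range(max_iter):
        primal, duals = solve_rmp(columns)                               # Step 2
        improved = False
        for p in range(n_vehicles):                                      # Step 3
            trajectory, reduced_cost = solve_subproblem(p, duals)
            if reduced_cost > 1e-6:      # condition (13): g(r) exceeds the dual price
                columns[p].append(trajectory)
                improved = True
        if not improved:                                                 # Step 4
            break
    return columns, primal   # Step 5: re-solve the final RMP as an integer program on `columns`
```

As described in Sect. 4, the initial sets Ω^1 are obtained from a feasibility run of the compact model restricted to the waypoints with dependencies.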

3.2 The Sub-problem


Solving the sub-problem leads to solving a problem which is quite similar to
(10), but with only one vehicle p so as to find a trajectory r which satisfies the
condition (13). More precisely, when a vehicle p is fixed, for simplicity in the
following we do not use the subscript p as in (10), and suppose that the number
of its compatible waypoints is NW . Then, we have


ari = Σ_{t=1}^{NT} bit,    tr(ik)·a_{r,ik} = Σ_{t=1}^{NT} t·b_{ik,t},    tr(jk)·a_{r,jk} = Σ_{t=1}^{NT} t·b_{jk,t}.

Thus

g(r) := Σ_{i∈NW∩r} gi − ε1 (tr + ε2 fr) = Σ_{i=1}^{NW} Σ_{t=1}^{NT} gi bit − ε1 ( tr + ε2 Σ_{t=0}^{NT−1} (|f^x_t| + |f^y_t|) ),

where

tr = max{ Σ_{t=1}^{NT} t·bit : i = 1, ..., NW }.

Additionally, we have

Σ_{k=1}^{ND} (tr(jk)·a_{r,jk} − tr(ik)·a_{r,ik}) ᾱk = Σ_{k=1}^{ND} Σ_{t=1}^{NT} (b_{jk,t} − b_{ik,t}) t ᾱk.

Therefore, the condition (13) becomes

Σ_{i=1}^{NW} Σ_{t=1}^{NT} (gi − λ̄i) bit − ε1 ( tr + ε2 Σ_{t=0}^{NT−1} (|f^x_t| + |f^y_t|) ) − μ̄p + Σ_{k=1}^{ND} Σ_{t=1}^{NT} (b_{ik,t} − b_{jk,t}) t ᾱk > 0.   (14)

The left hand side of (14) is the objective function of the sub-problem.
An interesting aspect of this model is the dependency constraint. Although the master problem has the proper constraints, including the dependency ones, the effect of the dual variables is not strong enough to drive the sub-problem towards trajectories that lead to feasible solutions. Some numerical investigations, which we do not describe in more detail here (but hopefully will in a longer paper), show this situation. Therefore, we also use the modified dependency constraints for the sub-problem as follows:

Σ_{t=1}^{NT} t·b_{ik,t} + R (1 − Σ_{t=1}^{NT} b_{ik,t}) ≥ tDk + Σ_{t=1}^{NT} t·b_{jk,t},   k = 1, ..., ND,   (15)

where R is the same large, positive number used in (5). If the vehicle p visits both waypoints ik and jk, constraint (15) enforces that the dependency is respected and that the visitation time of waypoint ik is larger than the visitation time of waypoint jk plus tDk time units. If the vehicle p visits ik but not jk, constraint (15) specifies that the visitation time of waypoint ik must be larger than tDk, because only such trajectories may appear in the optimal solution. If the vehicle p visits jk but not ik, constraint (15) is relaxed.

4 Numerical Experiment
The algorithms are written in MATLAB 2015b, and are tested on a MacBook
Air, 1.6 GHz Intel Core i5, 8G of RAM. The solver CPLEX 12.8 is used for
solving the linear program (12), and the sub-problem.
We compare the results obtained by our column generation approach with a
purely CPLEX-based approach for the compact model (10). We have NV = 6
UAVs, NW = 12 waypoints, NZ = 5 No-Fly-Zones, NC = 4, and NT = 25 time
steps corresponding to 25 s. Their positions are presented in the Fig. 1. The
parameters of UAVs are presented in Table 1. The parameters of dependency
are given in Table 2. It requires that the waypoint 6 must be visited after the
waypoint 9 plus one time unit, and the waypoint 1 must be visited after the
waypoint 7 plus two time units. We suppose that each vehicle is compatible with all waypoints, i.e., Kpi = 1, ∀p = 1, ..., NV, ∀i = 1, ..., NW, and set ε1 = ε2 = 10⁻³.
For this data, we have the MILPs with 5760 binary variables, 1446 continuous
variables and 16116 constraints.

Table 1. Parameter of UAVs

UAV Mass (kg) Initial velocity ωmax (◦ /s) vmax (m/s) fmax (N ) F (N )
X(m/s) Y(m/s)
1 5 0.1 0 15 1.5 1.9635 25
2 5 0.1 0 15 1.5 1.9635 25
3 5 0.1 0 15 1.0 1.3090 25
4 5 0.1 0 15 1.5 1.9635 25
5 5 0.1 0 15 1.0 1.3090 25
6 5 0.1 0 15 1.0 1.3090 25

The gains are integer numbers generated uniformly in the interval [1, 20]. The
10 test problems correspond to the 10 samples of generated gains (see Table 3).
One hour is the limit of computational time for solving the MILP by CPLEX.
For starting column generation procedure we generate some initial trajectories
as follows: we consider the problem (10) with only the set of waypoints having
dependency, and use CPLEX for solving this problem until finding a feasible
solution. The trajectories corresponding to this solution become the initial tra-
jectories Ω 1 .

Table 2. The dependency

ik jk tDk
6 9 1
1 7 2

Fig. 1. Results for the test problem 5: the solution of pure CPLEX with the perfor-
mance 179.865901 (left) and the solution of Column Generation with the performance
184.840890 (right)

Table 3 presents the comparative results between the column generation method and CPLEX on the 10 test problems. While in most cases CPLEX could not find the optimal solution within one hour (except for problem 3), our column generation approach produced near-optimal solutions within a shorter time in most cases, and the quality of its solutions is higher than that of CPLEX. Also, as expected, we can see that the new formulation (11) provides a tighter upper bound than the LP relaxation of the compact formulation (10). This means that investigating a branch-and-price scheme would be a promising direction for globally solving this model. Figure 1 illustrates the final trajectories of the UAVs obtained by the two methods for test problem 5 in Table 3.

Table 3. Comparative results

Prob CPLEX Column generation


LP relaxation Obj. value Time (s) Dual bound Obj. value Time (s)
1 211.992952 203.858889 3600 203.858890 203.857890 738
2 196.991954 178.852894 3600 186.247206 185.844877 305
3 200.990944 194.860890 961 194.860892 194.856891 472
4 218.990941 194.866876 3600 206.293988 202.864870 304
5 202.992953 179.865901 3600 184.843890 184.840890 1976
6 194.991954 185.861882 3600 190.051080 186.843879 378
7 197.990951 185.852879 3600 187.854881 187.854878 817
8 215.993944 199.869905 3600 199.958445 199.867897 787
9 215.992949 195.861886 3600 199.105930 198.840886 314
10 219.992940 208.846897 3600 210.065077 208.859872 933

5 Conclusion
In this paper, we have proposed a column generation approach for solving the coordination and control of multiple UAVs, where the dependency constraint is mainly handled in the Master Problem and is also treated in the sub-problem to avoid generating infeasible trajectories. The comparative results have demonstrated the efficiency of our approach in comparison with CPLEX applied to the compact model. In future work, we will develop branching techniques to obtain a Branch-and-Price scheme for the global solution, as well as solve the sub-problems in parallel.

Acknowledgement. The authors would like to thank the DGA (Direction Générale
de l’Armement) for the support to this research.

References
1. Barnhart, C., Johnson, E.L., Nemhauser, G.L., Savelsbergh, M.W.P., Vance, P.H.:
Branch-and-price: column generation for solving huge integer programs. Oper. Res.
46(3), 316–329 (1998)
2. Bellingham, J., Tillerson, M., Richards, A., How, J.P.: Multi-task allocation and
path planning for cooperative UAVs. In: Butenko, S., Murphey, R., Pardalos,
P.M. (eds.) Cooperative Control: Models, Applications, and Algorithms, pp. 23–41.
Kluwer Academic Publishers (2003)
3. Chandler, P., Pachter, M.: Research issues in autonomous control of tactical UAVs.
In: Proceedings of ACC 1998, pp. 394–398 (1998)
4. Chandler, P.R., Pachter, M., Rasmussen, S.R., Schumacher, C.: Multiple task
assignment for a UAV team. In: Proceedings of the AIAA Guidance, Navigation
and Control Conference (2002)

5. Le, T.H.A., Nguyen, D.M., Pham, D.T.: Globally solving a nonlinear UAV task
assignment problem by stochastic and deterministic optimization approaches.
Optim. Lett. 6(2), 315–329 (2012)
6. Le, T.H.A., Nguyen, D.M., Pham, D.T.: A DC programming approach for planning
a multisensor multizone search for a target. Comput. Oper. Res. 41, 231–239 (2014)
7. Lübbecke, M., Desrosiers, J.: Selected topics in column generation. Oper. Res.
53(6), 1007–1023 (2005)
8. Nguyen, D.M., Dambreville, F., Toumi, A., Cexus, J.C., Khenchaf, A.: A column
generation based label correcting approach for the sensor management in an infor-
mation collection process. In: Advanced Computational Methods for Knowledge
Engineering, pp. 77–89 (2013)
9. Richards, A., Bellingham, J., Tillerson, M., How, J.: Co-ordination and control of
multiple UAVs. In: AIAA Guidance, Navigation, and Control Conference (2002)
10. Schouwenaars, T., DeMoor, B., Feron, E., How, J.: Mixed integer programming for
safe multi-vehicle cooperative path planning. In: ECC, Porto, Portugal (2001)
11. Simonin, C., Le Cadre, J.-P., Dambreville, F.: A hierarchical approach for planning
a multisensor multizone search for a moving target. Comput. Oper. Res. 36(7),
2179–2192 (2009)
12. Walker, D.H., McLain, T.W., Howlett, J.K.: Coordinated UAV target assignment
using distributed tour calculation. In: Grundel, D., Murphy, R., Pardalos, P.M.
(eds.) Theory and Algorithms for Cooperative Systems. Series on Computers and
Operations Research, vol. 4, pp. 327–333. Kluwer, Dordrecht (2004)
Spare Parts Management in the Automotive
Industry Considering Sustainability

David Alejandro Baez Diaz1, Sophie Hennequin2,


and Daniel Roy2(&)
1
LARIS Laboratory, Angers University, 62 Avenue Notre Dame Du Lac, 49000
Angers, France
david.baezdiaz@etud.univ-angers.fr
2
LGIPM Laboratory, Lorraine University, 1 Route D’Ars Laquenexy, 57078
Metz, France
{sophie.hennequin,daniel.roy}@univ-lorraine.fr

Abstract. Spare parts are a fundamental part of the automotive industry, even if they are intended for the out-of-series market and aftermarket products. Throughout this study, a process for formulating an inventory model for spare parts is presented: a proposal for an inventory management system applied to automotive spare parts and based on forecasting and simulation. The objective is to improve the implementation of an inventory system in order to reduce transportation, storage and production costs by studying the behavior of demand, the applicability of an inventory management policy and the use of simulations to test the proposed results. This allows correcting parameters to avoid stock shortages and to integrate sustainable paradigms.

Keywords: Inventory management system · Demand forecast · Simulation based on mathematical model

1 Introduction

The development of production strategies is essential for the growth of a company,


especially in automotive industry since it is characterized by a constant evolution:
constant development of new technologies and a continually evolving range of prod-
ucts. When a vehicle is no longer marketed, it is not the end of production of its
components. Indeed, it is necessary to offer customers spare parts for a period of 10
years minimum, (this may change depending on the customer). Therefore, management
of spare parts is very important in the automotive market. Furthermore, the manage-
ment of factories (resources and spaces) is one of the key factors to optimize the
manufacturing.
In this perspective, the objective is to identify end-of-life semi-finished products
and to ensure a physical stock for them allowing the reduction of the supply chain costs
considering sustainability. In this paper, we only consider carbon emission costs, but our proposed mathematical model and resolution method could include other environmental and social costs. The considered semi-finished products are spare parts for
vehicles, i.e. goods that are intended to be mounted in or on a motor vehicle to replace

© Springer Nature Switzerland AG 2020


H. A. Le Thi et al. (Eds.): WCGO 2019, AISC 991, pp. 1109–1118, 2020.
https://doi.org/10.1007/978-3-030-21803-4_109
components of that vehicle, including goods such as lubricants that are necessary for
the use of the motor vehicle. As part of our project, we collaborated with a supplier,
located in France, of car producers. The different collected data show the impact of
spare parts and the importance of inventory management of components and finished
products while ensuring high service levels required by customers (very low delivery
times and very little shortages). For this, it is necessary to know the finished products to
be made in each period, the finished products to be stored, the periods to order com-
ponents, the quantity of components to be stored and the quantity of components to use
in production per period [1].
Therefore, the main purpose of this paper is firstly to propose an efficient demand
forecast. Indeed, demand forecasts are one of the key issues in logistics since many
factors are involved and responsible for obtaining good forecast results [2]. For that, we
start from the idea developed in Hubert’s PhD dissertation, which proposes to test
different models of forecasts in order to retain the best [3]. To measure the performance
of the proposed forecasting system, we choose to develop an Analytical Hierarchical
Process (AHP), which supports good decision-making by structuring the criteria into a hierarchy [4]. Saaty [5] gives a good description of AHP processes. Once
the demand is efficiently estimated, the project has a good basis for starting the eval-
uation of the inventory management system.
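For reference, a minimal AHP priority computation over three hypothetical criteria is sketched below; the pairwise comparison values are purely illustrative and are not the ones used in the project.

```python
import numpy as np

# Pairwise comparison matrix for three hypothetical criteria:
# forecast error, ease of application, adaptability (Saaty's 1-9 scale).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

# Priority vector: principal eigenvector of A, normalized to sum to 1.
vals, vecs = np.linalg.eig(A)
w = np.abs(vecs[:, np.argmax(vals.real)].real)
w /= w.sum()

# Consistency ratio (random index RI = 0.58 for a 3x3 matrix).
lam_max = vals.real.max()
ci = (lam_max - 3) / (3 - 1)
cr = ci / 0.58
print(w, cr)
```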
To define the inventory management policy with components to keep, different
methods can be developed to find the optimal quantity to order when the demand
changes over time and an order is considered per period. This is known as the lot sizes
approach [6]. Several models could be defined based on determinist or stochastic
methods [7] but generally, the main objective is to calculate/estimate an economic lot
size [8]. Therefore, we develop an integer linear mathematical model, which corre-
sponds to the minimization of the sum of transportation, storage and production costs
but also carbon emission costs on the supply chain [9]. This kind of mathematical
models could be complex to solve. The resolution strategy could be conducted based
on a simulation method.
This article breaks down as follows. The second section presents the industrial case
study and the main hypotheses. Then, the mathematical model is presented. The fourth
section develops the proposed resolution algorithm and the chapter five gives some
numerical experiments. Finally, a discussion is proposed and future works are
expressed.

2 Spare Parts Management in Automotive Industry

We consider a supplier of automobile producers. This supplier has several raw material
and component suppliers and several customers. The industrial supplier, like this customer, operates in Europe, with different factories and warehouses of finished products and components around the world. Some warehouses may be dedicated only to raw materials and components, or to finished products, or a mix of both.
As part of a research project, this automobile industrial supplier asked us to provide
a decision support tool allowing to size the need of components to order knowing the
need of finished products to be produced/outsourced over a period of finite time, and, to
deduce the optimal strategy for storing these different elements in its platforms or those
of its suppliers in case of orders.
The major difficulty of this work (apart from the complexity related to the number
of different variables to be considered and the constraints) is to predict the demand for
the sold finished products knowing that these are semi-finished products that can be
used in vehicle manufacturing or maintenance. In the case of maintenance, it is necessary to store these semi-finished products for several years. It is therefore difficult
to estimate future needs knowing that we must consider the maintenance of end-of-life
finished products that are no longer sold, as well as the fact that some semi-finished
products will be reconditioned from several semi-finished products. In addition, the
semi-finished products are made from various components that can be used in several
semi-finished products, the same goes for semi-finished products that can be used in
several vehicles.
To simplify the study and the data recovery, we focus on a particular customer. We
are also interested in a single warehouse with ordered products (which can be
components/raw materials or outsourced finished products) and finished products
manufactured from factories of our industrial supplier.
In parallel, the considerations in terms of sustainable development become more
and more important for customers considering the traceability of products (origin,
ecological footprint, composition, elimination, etc.) and society (taxation, require-
ments, lobbying, public regulations, etc.), and push companies to deploy new strategies
incorporating this concept. For firms (and individuals) the integration of sustainable
development is based on the “triple bottom line” approach where a minimum perfor-
mance must be achieved for each of the three dimensions of sustainable development
(economic, environmental and social).
Two approaches can be taken: (i) an internalization of sustainability that is, the
factors that improve the company’s performance are integrated into the company’s
strategy (this often leads to additional costs); or (ii) the externalization of sustainability,
in which case the company decides to postpone the problem to its subcontractors and
service providers but without making any real effort to improve its internal perfor-
mance. In this work, we consider both aspects according to the requirements of the
industrial supplier. Thus, the factors of improvement of the sustainability are integrated
in the form of economic costs at first in order to simplify the study.
The next sections detail our methodology and our approach.

3 Mathematical Model

The objective of the lot size modeling is to find the optimal quantity to order when the
demand changes over time considering an order per period. So, it is necessary to take
into account the production and inventory variables and decision variables representing
a command. Notably, it is a problem that can be modeled as an integer linear math-
ematical model.

The defined variables and parameters are given in Table 1.

Table 1. Notation.
Variable Description
i A finished product (to be produced by the firm)
j A production period
k A spare part of the product i
l An element of the supply chain, which can be the transport mode, the supplier, the warehouse or the supplier's factory
Xij Quantity of i to be produce in period j
Aij Quantity of i to be stored in period j
Ckj Quantity of k to be used in period j
Bkj Quantity of k to be stored in period j
Elj Quantity of carbon emitted during period j for element l which can be the
transport mode, the supplier, the warehouse or the supplier’s factory
Oij Binary decision variable which represents the fact we should produce i during
period j (Oij=1) or not (Oij=0)
Ykj Binary decision variable which represents the fact we should order k during period
j (Ykj =1) or not (Ykj =0)
Dij Demand of product i for period j
Rkj Need of spare part k during period j
Cok Ordering cost per unit of spare part k
Cf Storage cost per stored part
Wi Production cost for the product i
Ctk Transportation cost per unit of spare part k
Ccl Carbone emission cost for the element l

The objective function is described in what follows. It corresponds to the minimization of transportation, storage and production costs and of the cost related to carbon emissions in the factories and during the transport of components. It is given by:

min Z = Σ_{i=1}^{I} Σ_{j=1}^{J} Oij·Wi + Σ_{i=1}^{I} Σ_{j=1}^{J} Aij·Cf + Σ_{k=1}^{K} Σ_{j=1}^{J} Ckj·Cok + Σ_{k=1}^{K} Σ_{j=1}^{J} Bkj·Cf + Σ_{l=1}^{L} Σ_{j=1}^{J} Elj·Ccl   (1)

The constraints allow us to make a balance period by period. In this way, the problem keeps the coherence between the production of the current period and the storage of the previous period. They are given as follows:

Ai0 = 0, ∀i   (2)
Bk0 = 0, ∀k   (3)
Xij + Ai(j−1) = Dij + Aij, ∀i, j   (4)
Ckj + Bk(j−1) = Σ_{i=1}^{n} Xij·Rki + Bkj, ∀k, j   (5)
Ckj ≤ M·Ykj, ∀k, j   (6)
Xij ≤ M·Oij, ∀i, j   (7)
Xij ≥ 0, integer   (8)
Aij ≥ 0, integer   (9)
Ckj ≥ 0, integer   (10)
Bkj ≥ 0, integer   (11)
Oij ∈ {0, 1}   (12)
Ykj ∈ {0, 1}   (13)

Equation (2) sets the initial inventory of finished product i and Eq. (3) sets the initial inventory of component k. Equation (4) is the inventory balance for finished product i, linking production, demand and storage, and Eq. (5) is the corresponding balance for component k through its requirement Rki per unit of product i. Equations (6) and (7) link the ordered quantity of component k and the produced quantity of product i to the binary decision variables Ykj and Oij through the large constant M. A service level of 95% is chosen for the project.
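A compact sketch of this model with the PuLP library is given below; the instance sizes, costs and demands are placeholder values, and the carbon-emission term of the objective is omitted for brevity.

```python
import pulp

# Placeholder instance: I products, K components, J periods.
I, K, J = 2, 3, 4
M = 10_000
D = {(i, j): 5 + i + j for i in range(I) for j in range(J)}          # demand D_ij
Rki = {(k, i): 1 + (k + i) % 2 for k in range(K) for i in range(I)}  # components per product
Wi, Cf, Cok = [50, 60], 2, [4, 5, 6]

m = pulp.LpProblem("lot_sizing", pulp.LpMinimize)
X = pulp.LpVariable.dicts("X", (range(I), range(J)), lowBound=0, cat="Integer")
A = pulp.LpVariable.dicts("A", (range(I), range(J)), lowBound=0, cat="Integer")
C = pulp.LpVariable.dicts("C", (range(K), range(J)), lowBound=0, cat="Integer")
B = pulp.LpVariable.dicts("B", (range(K), range(J)), lowBound=0, cat="Integer")
O = pulp.LpVariable.dicts("O", (range(I), range(J)), cat="Binary")
Y = pulp.LpVariable.dicts("Y", (range(K), range(J)), cat="Binary")

# Objective (1) without the emission term.
m += (pulp.lpSum(O[i][j] * Wi[i] + A[i][j] * Cf for i in range(I) for j in range(J))
      + pulp.lpSum(C[k][j] * Cok[k] + B[k][j] * Cf for k in range(K) for j in range(J)))

for i in range(I):
    for j in range(J):
        prev = A[i][j - 1] if j > 0 else 0                       # (2): A_i0 = 0
        m += X[i][j] + prev == D[i, j] + A[i][j]                 # (4)
        m += X[i][j] <= M * O[i][j]                              # (7)
for k in range(K):
    for j in range(J):
        prev = B[k][j - 1] if j > 0 else 0                       # (3): B_k0 = 0
        m += C[k][j] + prev == pulp.lpSum(X[i][j] * Rki[k, i] for i in range(I)) + B[k][j]  # (5)
        m += C[k][j] <= M * Y[k][j]                              # (6)

m.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[m.status], pulp.value(m.objective))
```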
An analysis of the programming model shows 1152 variables to be considered (9
finished products, 22 components, 12 periods of horizon and a variable for each type of
component and finished product). The resolution complexity is a real problem even if
the linear and integer programming model is the best option. To simplify our problem
and reduce the quantity of variables to be considered, the project will seek the mini-
mization of costs only for the finished products since they have a greater economic
value than components.
The next section will present our approach to solve the proposed integer linear
mathematical model.

4 Resolution Method

The behavior of the demand conditions the type of system. In this project, we can
consider systems with deterministic or stochastic demands. To find the best demand
prediction, we chose to have a multi-criteria approach to compare different possible
modeling. This multi-criteria approach is based on an AHP model [5] not detailed in
this paper.

To establish the variability of the demand it is necessary to consider the coefficient


of variation. [6] explains that for a coefficient of variation greater than 20%, the
demand is probabilistic with seasonality or probabilistic without seasonality (otherwise
the demand is deterministic). Once the behavior is identified, it is necessary to make an
assessment of the applicability. It considers the risks, weak points and strengths of the
application of a stock policy.
The objective of the simulation is to check each calculated element and to allow an
approximation of reality to reduce risks and correct errors. The simulation will focus on
the level of stock and its evolution over time. A negative inventory represents a
shortage so the objective will be to change some parameters to get a more reliable
system. In what follows, we firstly introduce the Petri net (PN) used to describe our
system and then the proposed algorithm.

4.1 Petri Net Approach


PN is a graphical and mathematical tool for performing simulations and modeling of
event systems. PNs are composed of at least one token to indicate the evolution of the
network, places that contain tokens, transitions that represent the divergence or con-
vergence and arcs that indicate the crossing of the tokens from one place to another,
using transitions [10].
Our proposed Petri net for a semi-finished product is given in Fig. 1.

[Figure 1 (caption below) depicts the proposed Petri net: places P1–P6 linked by transitions T1–T4.]
Legend – P1: pending demand; P2: pending order; P3: stock of finished products; P4: quantity of delivery orders; P5: number of orders; P6: demand level; T1: arrival of a customer request; T2: warehouse order; T3: arrival of a factory request; T4: starting a customer delivery.
Fig. 1. Proposed Petri net for finished products.

It has to be noticed that a timed transition (T4) and two places (P1 and P6) are added to simulate the delivery time and to keep a count of the quantity of orders, and another place is added to follow the evolution of the demand.

A change in modeling (more products or components for example) is possible


because there would never be a queue of requests. The goal is to ensure a level of
service in which there is no shortage, so all the demand is covered. That is, each time a
request of size N arrives in P1 so that T4 is activated, it would be necessary to have at
least the same quantity N available in P3. The initial marking, located in P3, represents
the initial stock.
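A highly simplified discrete-event sketch inspired by this net is shown below; the demand sizes, the delivery delay and the reorder rule are illustrative assumptions of ours (the actual study used Excel VBA, as noted next).

```python
import random

def simulate(initial_stock=50, periods=12, delivery_delay=1, seed=1):
    """Track the marking of P3 (stock) while demands arrive in P1 and deliveries fire T4."""
    random.seed(seed)
    p1_pending, p3_stock, shortages = [], initial_stock, 0
    history = []
    for t in range(periods):
        demand = random.randint(5, 15)           # T1: a customer request puts a token in P1
        p1_pending.append((t + delivery_delay, demand))
        ready = [d for (due, d) in p1_pending if due <= t]
        p1_pending = [(due, d) for (due, d) in p1_pending if due > t]
        for d in ready:                           # T4: deliver only if P3 holds enough tokens
            if p3_stock >= d:
                p3_stock -= d
            else:
                shortages += 1                    # a negative stock would mean a shortage
        if p3_stock < 20:                         # T3/T2: replenishment from the factory
            p3_stock += 30
        history.append(p3_stock)
    return history, shortages

print(simulate())
```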
To simulate our complete Petri net, we used the Excel VBA tool at the request of
the industrial supplier.
In what follows, we give our proposed algorithm.

4.2 Proposed Algorithm


The principle of our algorithm is as follows. First of all, we should identify the request
of all semi-finished products for a given period. Once the request is known (done by
our AHP), a check must be made to minimize the error of the estimated demand
forecast. The error will highlight what is the correct prediction for the behavior of the
demand. It should be noted that some parameters such as mean and mean squared error
[11] are generally used in the formulation of an inventory management policy. Then,
the different inventory policy options (order quantity and time) and their impact on the
objective function are calculated. The goal is to find the best inventory policy for a
period by considering the behavior of the request. The results obtained by the math-
ematical calculations are verified by simulation through our Petri net.
Our proposed algorithm is given in Fig. 2.
The proposed algorithm has been compared with the real system on the basis of
histories. In case of strong evolutions, this algorithm will have to be improved by
considering more options (not realized for the moment).
To highlight our work, and based on collected industrial data, we conduct different
numerical experiments, which are given (in part) in the following section.

5 Numerical Results

To collect the data, it is necessary to make a historical revision of all the needs month
by month (a month is a period of study for the automotive supplier). We consult the
needs of the last two years. The collected data are not given in this paper for the sake of
confidentiality.
The results are given for two different inventory policies: the (Q, R) policy which
consists of order a quantity Q whenever the stock level is below the recommended
point R, and the lot by lot policy which consists in order a lot at each period of study.
The lot could be an economic order quantity [6].
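For the lot-by-lot variant, the lot can be sized with the classical economic order quantity; a small sketch with illustrative parameter values (not the confidential industrial data) follows.

```python
from math import sqrt

def economic_order_quantity(annual_demand, ordering_cost, holding_cost):
    """Classical EOQ: Q* = sqrt(2 * D * Co / Ch)."""
    return sqrt(2 * annual_demand * ordering_cost / holding_cost)

def total_cost(annual_demand, ordering_cost, holding_cost, q):
    """Ordering plus holding cost for lot size q."""
    return annual_demand / q * ordering_cost + q / 2 * holding_cost

D, Co, Ch = 1200, 40.0, 2.5     # illustrative demand, ordering and holding costs
q_star = economic_order_quantity(D, Co, Ch)
print(q_star, total_cost(D, Co, Ch, q_star))
```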
The obtained results are given in Table 2.
The following table contains the comparison of the current policy (the (Q, R)
policy) and the proposal. A lot-by-lot system calculation shows that implementing this
inventory policy saves 23% of total costs (of course this depends on the parameters). At the same time, it gives a view of the behavior of the demand and establishes a safety margin to avoid shortages while maintaining the customer service level. It is a reconciliation of all
Fig. 2. Proposed algorithm.

Table 2. Numerical results – two inventory policies.


Lot by Lot policy vs. (Q, R) policy:
– Total cost Z = 4773.6 MU vs. total cost Z = 3663.95 MU
– No safety stock, so the system is not prepared for a change in the demand, vs. safety stocks covering variation in demand, with consumption during the supply lead time
– Several production launchings vs. a low number of production launchings
– No use of a dedicated storage location vs. use of a specific storage location
– No knowledge of the quantity to be produced, hence variability of the lot size, vs. a well-defined, fixed lot size
– No need to refresh the settings vs. settings corrections when needed

the strong or weak points of the new inventory management system, not only on an
economic side but also in an operative way integrating environmental considerations.

6 Conclusion

The proposed methodology is based on real data from the automotive industry provided
by a supplier of automotive producers as well as its needs and constraints. These
collected values have highlighted the importance of having efficient forecasts of spare
part demands even more for components arriving at the end of their life but having to
be kept for the maintenance of the vehicles of the various customers. Thus, several
forecasting techniques were used and the use of the AHP allowed to choose a pre-
diction technique balanced against error, application and adaptation.
Once a forecasting technique has been selected, we must choose whether the parts
are ordered, the order quantity and the replenishment date (the study horizon is finite
and decomposed into periods). To do this, based on an integer linear mathematical
model, we propose a resolution algorithm based on a Petri net. It has to be noticed that
in this paper, we only present results for semi-finished products to simplify the pre-
sentation. Then, we propose numerical experiments to highlight our results, which
shows how the implementation of an inventory system is possible by considering the
behavior of the application, the industrial application and the monitoring of the results.
The proposal highlights an economic and operational improvement for the company in
the case of the management of the alternative products.
The research project is the result of the use of approximate methods. In practice the
effect of working with an estimate of demand will not ensure an optimal result.
However, the proposal for an inventory management system with continuous revision
allows the possibility to adapt to each situation.
Mathematical modeling highlights the degree of complexity of this industrial
problem. In the paper, we have studied only one segment of the global problem. The
development of inventory policies and forecasts should be a deeper analysis. In the
same way we should include more stochastic cases to be closer to reality.
The perspectives of this work are: on the one hand, to propose resolution methods
closer to the obtained mathematical model. Indeed, our proposed Petri net remains
simple in its definition and therefore does not allow to include also all components. On
the other hand, we should identify all factors and costs induced and integrate them in
our work. Then, we could propose a real time decision-making tool for the complete
inventory management of spare parts for automotive industry (suppliers and
producers).

References
1. Alfarez, H., Turnadi, R.: General model for single-item lot sizing with multiple suppliers,
quantity discounts, and backordering. Proc. CIRP 56, 199–202 (2016)
2. Bussay, A., Van der Velde, M., Fumagalli, D., Seguini, L.: Improving operational maize
yield forecasting in Hungary. Agric. Syst. 141, 94–106 (2015)
3. Hubert, T.: Prévision de la demande et pilotage des flux en approvisionnement lointain. Ph.
D. Thesis. École Centrale Paris (2013). (Chapitre 1. Pages 10–12 – in French)

4. Gupta, S., Dangayach, G., Kumar, A., Rao, P.: Analytic hierarchy process (AHP) model for
evaluating sustainable manufacturing practices in Indian electrical panel industries. Proc.
Soc. Behav. Sci. 189, 208–216 (2015)
5. Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
6. Rahjans, N., Samak, S.: Determination of optimum inventory model for minimizing total
inventory cost. Proc. Eng. 51, 803–809 (2013)
7. Erol, S., Jäger, A. Hold, P. Ott, K., Sihn, W.: Tangible industry 4.0: a scenario-based
approach to learning for the future of production. Proc. CIRP 54, 13–18 (2016)
8. Anderson, D.R., Sweeney, D.J., Williams, T.A., Camm, J.D.: An Introduction to
Management Science: Quantitative Approaches to decision making, p. 912, 15th edn.
Cengage Learning (2018)
9. Hennequin, S., Ramirez Restrepo L.M.: Fuzzy model of a joint maintenance and production
control under sustainability constraints. In: Proceedings of the 8th IFAC MIM 2016
Conference, Troyes, France (2016)
10. Hu, H., Zhou, M.C.: A Petri net-based discrete-event control of automated manufacturing
systems with assembly operations. IEEE Trans. Control Syst. Technol. 23(2), 513–524
(2015)
11. Sanjoy, K.: Determination of exponential smoothing constant to minimize mean square error
and mean absolute deviation. Global J. Res. Eng. 11(3), 31–33 (2011)
The Method for Managing Inventory Accounting

Duisebekova Kulanda, Kuandykov Abu, Rakhmetulayeva Sabina, and Kozhamzharova Dinara

International Information Technology University, Almaty 050054, Kazakhstan
dkulan1@mail.ru, ssrakhmetulayeva@gmail.com

Abstract. In this paper, the processes of managing and controlling warehouse transportation are investigated. At the same time, the optimal location of the warehouse is determined, taking into account its volume and the intensity of cargo receipt. The issues of automating the warehousing and transportation of cargo are considered, and a system automating the business processes of logistics is developed.

Keywords: Logistics · Logistics of warehousing · Transport tasks · Transportation logistics · Area of storage · Capacity of storage

1 Introduction

This article considers the main branches of the logistics industry: the warehouse and transportation departments. It also discusses the capacity of a commodity warehouse, the possibility of storing stock and the ways of determining warehouse capacity. Efficient methods for transportation calculations are presented in order to save time. The sphere of logistics has only just begun to develop, yet automation tools are advancing at a good pace. The main feature of logistics is achieving maximum income at minimum cost. At present, logistics is a major source of competitive advantage. In addition to warehousing, transport, information and production logistics provide the customer with a reliable, high-quality service.

2 Objective

2.1 Creation of a Mathematical Model with the Application of New Methods

The objective of this article is the creation of a mathematical model, with the application of new methods, aimed at facilitating the work in the subject areas of warehouse settlements. Optimization methods are used to reach the maximum source of income with minimal costs. The encountered problems are solved by considering the processes together with the possible ways of delivering the product to the consumer. All received information should be stored on a database server.


2.2 Relevance of the Topic

The relevance of the topic lies in creating a model that automates a part of the logistics area using information technology, achieving the maximum source of revenue by saving distribution time. To achieve this goal, the following objectives must be met:
– analysis of the logistics industry;
– implementation of a program for analysing the reporting results;
– implementation of the work performed through warehouse and transportation calculations in the field of logistics.

2.3 Concept and Evolution of Logistics, Efficiency of Use


The term “logistics” comes from the Greek word “logistic” and translates as making informed decisions, the skill of practical calculation. The experience of applying the logistics approach in various areas, mainly in organizational and managerial work (which requires a minimum amount of costs), shows an increase in economic efficiency of tens, sometimes hundreds of percent.
Logistics methods combine various aspects of economic activity, their various stages, organization and management. Therefore, there are different definitions of logistics, and different authors put forward different aspects of logistics management. Logistics has a specific content and is aimed at solving the issues within its framework. Here is the definition from the terminological dictionary:
“Logistics” is the science of planning, managing and monitoring the operations of warehousing and processing, the transportation and storage of materials supplied to the enterprise, the processing of factory raw materials, materials and semi-finished products, and the provision of finished products to consumers taking into account their wishes and requirements [1].
The scope of logistics covers managing the flow of resources between the start and end points in order to meet the requirements of customers or corporations. Logistics is one of the main functions of many enterprises. The main objectives of logistics can be conditionally divided into efficiency and cost; they include short delivery times, keeping stock at a low level and a high utilization of production capacity.
Supply logistics consists of activities such as market research, needs planning, purchasing decisions and procurement control. Production logistics links supply and distribution logistics. The main purpose of distribution logistics is to deliver the finished product to the customer; this in turn consists of processing, storing and transporting the order.
The main function of disposal logistics is to reduce the logistics costs associated with eliminating the waste that emerges during business processes and to broaden the types of services. One of the main duties of logistics is the timely delivery of products to equipment, sale, production and packaging with minimal costs (Table 1).
During the logistics process, the material flow is delivered to the enterprise, then the
optimal movement is organized through the warehouse and production sites, then the
product is delivered to the customer (see Fig. 1).

Table 1. The main responsibilities of logistics.

Scale:
– Achieving high efficiency with low costs during unstable market conditions
– Modeling conditions for reliable operation of logistics systems

General:
– Construction of an integration system for regulating material and information flows
– Monitoring the movement of material flows
– Identification of technologies and strategies for physical movement of goods
– Standardization of semi-finished products
– The forecast of volumes of production, release, movement and storage in the warehouse
– Forecast of demand for goods produced and moving in the territory of the logistics system
– Distribution of parts of transports
– Organization of services to consumers before and after purchase
– Optimization of technological structures for a complex of automated transport-warehouses

Individual:
– Construction of small funds
– If possible, reducing the warehouse time of products
– Reducing the time of transportation

Fig. 1. Scheme between material and information flows.

The entire path of material transfer presented in the scheme can be divided into two large sections:
– in the first section, products for production and technical purposes are transferred;
– in the second, products intended for personal consumption.

2.4 Warehouse Accounting

Among the issues addressed in warehouse accounting are the storage of slow-moving goods, the increase of warehouse costs and the stoppage of the turnover of goods supplied to the warehouse. These unwanted events can be handled with the help of efficiency and simulation modeling. Efficiency modeling regulates the following two conditions: during the production process the quantity of stock in the warehouse must be sufficient, and the warehouse process must be carried out at minimal cost. The basic goals of the modeling are:
– preventing the degradation of the quality of raw materials and finished products;
– preventing a reduction in the number of goods that are consumed many times in the sales process;
– reducing the number of goods that take up a lot of warehouse space and are not sold;
– producing monthly statistics of the stock stored in the warehouse;
– timely tracking of the stock of finished products;
– increasing the turnover of cargo [2] (Fig. 2).

Fig. 2. Graphical model of warehouse.

Thus, the model is a supporting tool for creating a business process and systems to
ensure that the general requirements that are imposed on the business process are met [3].
The formulation of the problem of building business processes is:
• accelerate the construction of the business process and the creation of systems with respect to time: TB → min or Σi TB(Opi) → min;
• improve the quality indicators of the business process and of the created systems: Kp → max or Σi Kp(Opi) → max;
• reduce the labor effort: Tp → min or Σi Tp(Opi) → min;
where TB and TB(Opi) are time indicators: the total time spent on creating a business process and the time needed to complete each operation Opi; Kp and Kp(Opi) are quality indicators: common for the business process (and/or system) and for the performance of each operation Opi.
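A minimal sketch of how these indicators might be aggregated over the operations Opi of a business process; the operation data below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Operation:
    name: str
    time: float      # T_B(Op_i): time to complete the operation
    quality: float   # K_p(Op_i): quality indicator of the operation
    labour: float    # T_p(Op_i): labour effort of the operation

def indicators(ops):
    """Aggregate the three business-process indicators over its operations."""
    total_time = sum(op.time for op in ops)        # to be minimised
    total_quality = sum(op.quality for op in ops)  # to be maximised
    total_labour = sum(op.labour for op in ops)    # to be minimised
    return total_time, total_quality, total_labour

# Hypothetical operations of a warehouse business process.
ops = [Operation("receive goods", 2.0, 0.9, 1.5),
       Operation("store goods", 1.0, 0.8, 0.5),
       Operation("pick and ship", 3.0, 0.7, 2.0)]
print(indicators(ops))
```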

The business process model must satisfy these requirements. On the other hand, it is very difficult to build a universal model for all industries. Therefore, this paper discusses the construction of a model for a class of business processes in a specific area, namely for the LP (logistic process), which should ensure the creation of logistics business processes and of an automation system for a given business process [4]:
• creation of a model that provides descriptions and constructions of a wide class of business processes in the sector and of the automation system, i.e. KS → max, where KS is the number of business processes and automation systems;
• the list of implemented functions for each generated business process and system should be wide enough for its missions (KF → max), i.e. full functionality for each created system, where KF is the number of implemented functions;
• the completeness level of each function should be sufficient for the mission (ZF → max) of completing the business process and system, where ZF is the level of completeness of each implemented function.

3 Local Business Process Models: Choice, Purpose and Operation

Let us define the purpose and function of each individual local model as follows. A business process is an object of the outside world, and any object of the external world is characterized by a conceptual representation, i.e. the place that this object occupies in the "world of things" among other objects, a set of distinctive properties and the nature of its connections with other objects of the external world. Therefore, the business process as an object of the outside world should be characterized by a concept, i.e. a conceptual representation [5].
As is well known, the conceptual features of an object (that is, of a business process) must be presented in the form of a special model, a conceptual model.
Note that the object is an element of a united information space (UIS); hence, the conceptual model of the business process is an element of the UIS.
It should also be noted that the object is represented conceptually according to its purpose. The business process is intended for production and is a managed object. Therefore, the CM of a business process is characterized by its mission, its targets or purpose and criterion, and by the input and output (result) data.
• In addition, the composition of the input and output depends on the purpose for which we build the CM of the BP. It should be noted that the CM is built to solve the problem of integration. Therefore, the CM needs to ensure that the logistics business process is integrated with the business processes of other organizations, for example, at the top level with the partners (suppliers and consumers of goods, machines and equipment) of the logistic processes;
• at the lower level, with the business processes of other, for example neighboring, local problem areas.
Thus, our business process must be able to integrate with the business processes of
other organizations. Accordingly, the inputs and outputs of the CM at the level of

the logistics business process should be harmonized with the business processes of
other organizations [6]. For integration, the following data are needed:
• the internal structure of the logistics sector and of the local problem areas of the region, their composition and capacity;
• the objects of labor, the sources and flows of goods, the types of goods;
• the means of labor, i.e. the means used for transporting goods between warehouses and customers and for moving goods within the warehouse;
• which outsourcing operations are available, etc.
The field of logistics thus consists of two levels: the general problem area of logistics and the local problem areas that constitute it, each with its own environment.

3.1 External Conceptual Model (CM1)

CM1 serves as a data/information transfer tool for processing procedures. It sets or displays the characteristics of a business process needed for integration with external business processes, the links of the logistics business process showing the goals of the given supersystem at the macro level, and information about the business process of the common problem area as an element of a united information space (UIS).
In addition, CM1 should contain information for external organizations on the internal structure of the logistics business process: a list of local problem areas and a metamodel of the various integrations of different classes of business processes [6].

3.2 Internal Conceptual Model (CM2)

The CM2 of the business process of a local problem area of logistics will be written in the form

CM2 = {CM2j : j = 1, ..., m},   (3.1)

where CM2j for the j-th local problem area contains the following information:
(1) a list of the specialized processes included in the created business process of the local problem area;
(2) the number and types of specialized processes, depending on the problem being solved and on the characteristics of the business process itself and of its specialized processes;
(3) information on the specialized processes of the given business process in the local problem area;
(4) the metamodel (descriptions) of the integration of specialized processes within a business process for a specific purpose within the problem area;
(5) the input and output data characterizing this business process as an element of the second-level united information space (UIS).
The model is designed to automate business processes. Therefore, we construct a
model for classes of business processes that are observable and controlled.
It is generally accepted that a business process model is represented in one of two ways: "as is" or "as it should be". In order to obtain the "as it should be" representation of the business process (which should be observable and manageable), it is necessary to transform the business process from the "as is" model representation into the "as it should be" representation.
This is achieved by reengineering. Business process reengineering (BPR) is defined as a "fundamental rethinking process for all business metrics such as cost, speed, quality, and service", and it has been one of the most widely applied improvement programs [7, 8].
Business process reengineering must be carried out for a specific purpose: for the automation, monitoring and management of the business process. As a result, the organization needs methods and the integration of knowledge management models in order to understand the environment, which includes processes, people, employees, customers and tools.
A managed version of a business process is obtained by introducing a control loop that performs a number of coordinating and controlling functions. These functions are realized by processes consisting of operators, for example processes (consisting of operations) of strategic decisions, logical-operator processes and services (strategic model, logical model, operator model, service model, C = {cijk}, k = 1, ..., Kij), and this holds for each specialized process of the BP: analytics, administration, organization, management, technological processes, provision of resources and services in the local problem area, for example B = {bij}, j = 1, ..., Ji. Manageability is also enabled by the introduction of additional functions, which is achieved by services.
Processes or operators are the controls, and the means (control actions) of management are the specialized processes and the additionally introduced operations [7].
Such a model is built for each local area that we have isolated from the logistic process: receiving goods, storing and picking goods, shipping, receiving at a temporary storage warehouse, etc., A = {ai}, i = 1, ..., I.
Therefore, in the business process model (i.e., in the base model) we introduce additional operations that are created from outside by the model developers themselves, i.e. they are not present in the BP itself. These are the operations of strategies, logic, operators and services.
Thus, the manageability of a business process is gained by introducing a control loop and, first of all, the strategic process of the control loop, which is the beginning of the control process. Therefore, to give the BP a managerial character, we introduce the concept of a strategic process.

3.3 The Strategic Model (SM)

The strategic model (SM) for all local problem areas is written as SM = {SMj : j = 1, ..., m}, where j indexes the local problem areas. Here, just as with {CM2j}, we assume that a local problem area includes: receiving goods, storing and picking goods, shipping, receiving at a temporary storage warehouse, etc.
The strategic model constitutes a strategic level of business process management
which is designed to construct, before performing a business process task, the content
and structure of the business process based on the current situations that have arisen in
the production environment, i.e. in a business process environment.
In all local problem areas of logistics, the strategic model SM of the business process determines the option of jointly performing the specialized processes based on the current production situations arising before the execution of the business process. The variant of joint performance of the operations of the specialized processes may differ [7].
At time t, the production situation for a business process is determined as follows:

SP(t) = ⟨Z(t), Jb(t), Sl(t), EP(t), BP(t)⟩,   (3.2)

• SP(t) - production situation that arises before the execution of the business process,
• Z(t) - the goal or purpose of the business process,
• Jb(t) - the task setting at the current time,
• Sl(t) - the subject of labor at the current time,
• EP(t) - factors and objects of the external environment that have a direct impact on
the implementation of the business process,
• BP(t) - the state of the business process, characterized by the values of the business
process indicators.
To make strategic decisions, the production situations are divided into classes of situations.
If the current production situation satisfies SP(t) ∈ SP1, then the necessary list of specialized processes is selected (the necessary list of types of active specialized processes required to perform the specified task by this business process), together with their priorities. If the current production situation satisfies SP(t) ∈ SP2, the k-th variant of the specialized process is selected.
If SP(t) ∈ SP3, then for the selected (k-th) variant of the specialized process the set of operations is determined, together with the meta-scheme for performing the sequence of operations, in which the allowed combinations of consecutive operations (Oph → Opk) reflect the current situation. An admissible combination of operations is established on the basis of the semantics of a relation, which is determined from the ontological model. The expression (Oph → Opk) has the following meaning: it is a valid combination in the sequence of operations, where Oph is the h-th class of operations, Opk is the k-th class of operations, and → denotes the sequencing relation. This model plays the role of a scheduler that plans the execution of the business process for an upcoming order.
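A minimal sketch, with hypothetical situation classes and process names, of how such a strategic model might map a production situation SP(t) to a decision.

```python
def strategic_decision(situation):
    """Toy strategic model: map a production situation SP(t) = (Z, Jb, Sl, EP, BP)
    to a decision, mimicking the three situation classes SP1, SP2, SP3."""
    if situation["class"] == "SP1":
        # Select the list of active specialized processes and their priorities.
        return {"processes": ["receiving", "storing", "shipping"],
                "priorities": {"receiving": 1, "storing": 2, "shipping": 3}}
    if situation["class"] == "SP2":
        # Select the k-th variant of the specialized process.
        return {"variant": situation.get("k", 1)}
    # SP3: fix the set of operations and an admissible execution sequence (Op_h -> Op_k).
    return {"operations": ["unload", "check", "put_away"],
            "sequence": [("unload", "check"), ("check", "put_away")]}

print(strategic_decision({"class": "SP1", "Z": "fulfil order", "BP": "idle"}))
```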

4 Logical Model of the j-th Specialized Process (LMj)

Consider the purpose and principles of operation of the logical model (decision-making model) of the j-th individual specialized process from the stack of specialized processes of the business process of the local problem area, j = 1, ..., J.
The strategic model determines when and in what sequence the processes are applied and executed; all of these mechanisms form the process of organization and management. The purpose of the logical model is to define the sequence of business operations of each specialized process. Each business operation consists of two parts: an operator and a procedure. Therefore, the logical model contains two levels [8].

At the top level, the model selects the operators, i.e. a single symbolic representation of one specialized process from the stack of specialized processes of the business process of the local problem area (for example, a technological or organizational process, or a process providing resources), and these operators are then composed of operations. The choice is made on the basis of the current situation (i.e., the initial situation of the problem area).
For two production situations, the composition and sequence of operations within a single logical model may differ. For example, for the initial situations Stek(i) ∈ Sst and Stek(j) ∈ Sst, the sequences take the following form.
In the production situation Stek(i) ∈ Sst:

Pri = ⟨Opi1, Opi2, Opi3, Opi4, ..., Opit, Opit+1, ..., Opimi⟩,   (4.1)

In the production situation Stek(j) ∈ Sst:

Prj = ⟨Opj1, Opj2, Opj3, Opj4, ..., Opjk, Opjk+1, ..., Opjmj⟩,   (4.2)

where Pri, Prj are specialized processes of the business process BP;
mi, mj are the numbers of business operations in Pri, Prj, respectively, and in the general case mi ≠ mj;
Opit ∈ Pri, Opjk ∈ Prj are operations in the specialized processes SPpi, SPpj of the business process BP;
Opit(Stek(i)) (or [Opit(Stek(i))] ∈ SPpi) and Opjk(Stek(j)) (or [Opjk(Stek(j))] ∈ SPpj) are the operations performed in the situations Stek(i) and Stek(j), where Stek(i) ∈ Sst and Stek(j) ∈ Sst.
The lower level performs the selection of procedures. Several procedures correspond to each operator; therefore, based on the current situation, one of the procedures is selected for each operator.
Thus, the logical model of the chosen i-th specialized process sets the complete list of operations and their execution sequence based on the admissibility requirement, where the process itself is chosen (by the strategy) by the strategic model.
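A minimal sketch of this two-level selection; the operators, procedures and situation attribute are hypothetical.

```python
# Hypothetical operators of one specialized process and their candidate procedures.
PROCEDURES = {
    "unload": {"pallet": "unload_by_forklift", "parcel": "unload_by_hand"},
    "check":  {"pallet": "scan_pallet_label",  "parcel": "scan_parcel_barcode"},
    "store":  {"pallet": "move_to_rack",       "parcel": "move_to_shelf"},
}

def logical_model(operators, situation):
    """For each operator of the specialized process, pick one procedure
    according to the current situation (here: the type of goods)."""
    return [(op, PROCEDURES[op][situation]) for op in operators]

print(logical_model(["unload", "check", "store"], situation="pallet"))
```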

5 Conclusion

The authors of this work represent a business process as a formalized process in which all types of resources, performers and owners of all types of processes are necessary to achieve the ultimate goal of the process.
Each type of provision is achieved by separate processes, which are called specialized processes of the business process. Each specialized process is modeled by a separate model. To make the business process manageable, a control loop model is introduced, consisting of:
• a strategic model that ensures the adoption and implementation of strategic decisions on the order of execution of the specialized processes,

• a logical model that determines the sequence of execution of the operators of specialized processes after strategic decision-making and its implementation,
• a service model, which is defined by the control functions in the form of services.

Acknowledgments. This work was supported by the Ministry of Education and Science of the Republic of Kazakhstan (Grant No. 0118PК01084).

References
1. Van der Aalst, W.M.P.: On the automatic generation of workflow processes based on product structures. Comput. Ind. 39(2), 97–111 (1999)
2. Völkner, P., Werners, B.: A decision support system for business process planning. Eur. J. Oper. Res. 125(3), 633–647 (2000)
3. Zhang, Y., Feng, S.C., Wang, X., Tian, W., Wu, R.: Object-oriented manufacturing resource modelling for adaptive process planning. Int. J. Prod. Res. 37(18), 4179–4195 (1999)
4. Zhang, F., Zhang, Y.F., Nee, A.Y.C.: Using genetic algorithms in process planning for job shop machining. IEEE Trans. Evol. Comput. 1(4), 278–289 (1997)
5. Duisebekova, K., Serbin, V., Ukubasova, G., Kebekpayeva, Z., Skakova, A., Rakhmetulayeva, S., Shaikhanova, A., Duisebekov, T., Kozhamzharova, D.: Design and development of automation system of business processes in educational activity. J. Eng. Appl. Sci. 4702–4714 (2017). ISSN 1816-949X (Medwell Journals)
6. Dabbas, R.M., Chen, H.-N.: Mining semiconductor manufacturing data for productivity improvement: an integrated relational database approach. Comput. Ind. 45(1), 29–44 (2001)
7. Musa, M.A., Oman, M.S., Al-Rahimi, W.M.: Ontology driven knowledge map for enhancing business process reengineering. J. Comput. Sci. Eng. 3(6), 11 (2013). Academy & Industry Research Collaboration Center (AIRCC)
8. Rao, L., Mansingh, G., Osei-Bryson, K.-M.: Building ontology based knowledge maps to assist business process re-engineering. Decis. Support Syst. 52(3), 577–589 (2012)
The Traveling Salesman Drone Station
Location Problem

Daniel Schermer, Mahdi Moeini, and Oliver Wendt

Chair of Business Information Systems and Operations Research (BISOR),


Technische Universität Kaiserslautern, 67663 Kaiserslautern, Germany
{daniel.schermer,mahdi.moeini,wendt}@wiwi.uni-kl.de

Abstract. In this paper, we introduce the Traveling Salesman Drone


Station Location Problem (TSDSLP). The TSDSLP exhibits features of
the Traveling Salesman, Facility Location, and Parallel Machine Schedul-
ing problems. More precisely, given a truck located at a central depot,
a multitude of possible drone stations, and a set of customer locations,
the TSDSLP seeks a feasible routing of the truck and drone opera-
tions such that all requests are fulfilled, no more than a given number of
drone stations is used, and the makespan (or operational cost) is min-
imized. We formulate the TSDSLP as a Mixed Integer Linear Program
(MILP) and use the state-of-the-art solver Gurobi to obtain solutions for
small- and medium-sized instances. Through our numerical results, we
show that the utilization of drone stations might reduce the makespan
significantly.

Keywords: Traveling salesman problem · Drone station · Drones ·


Last-mile delivery · Logistics

1 Introduction
Drones are on the verge of becoming a proven commercial technology for civil
applications in many public and private sectors. In particular, drones have
already been successfully applied for surveillance or monitoring tasks in agri-
culture, energy, or infrastructure, and for the delivery of packages (see [9] and
references therein).
Murray and Chu introduced two novel NP-hard problems where a truck is
assisted by a drone in last-mile delivery [7]. The first one is called the Flying Side-
kick Traveling Salesman Problem (FSTSP). In this case, if the depot is remotely
located from the demand centers, it can be beneficial to have the drone working
in close collaboration with the truck. To this end, they assume that the drone
is taken along by the truck and might be launched at some locations to initi-
ate an autonomous delivery. After fulfilling the requested order, the drone must
return to the truck in order to be resupplied with a new package. During recent
years, a fast-growing number of research papers have been published, which fol-
low the general concept of the FSTSP such that a high degree of synchronization
between trucks and drones is required (see, e.g., [1,10–13,15]).

By contrast, if most recipients are located in close proximity to the central


depot, as an alternative to the FSTSP, Murray and Chu propose the Parallel
Drone Scheduling Traveling Salesman Problem (PDSTSP). In this case, while
the truck continues a tour and serves remote locations, the drone might be used
to serve targets in close proximity to the depot, continuously departing from it
to initiate a delivery and returning to it for picking up a new package. As trucks
and drones are less dependent in the PDSTSP compared to the FSTSP, a much
smaller degree of synchronization is required. Therefore, given specific techni-
cal constraints such as drone’s limited payload weight and range of operation,
many research studies are only concerned with the routing of drone fleets from
a depot (see, e.g., [3]). Similar considerations have been applied to the cases,
where it might be necessary to determine the best location for multiple drone
depots without taking care of routing trucks (see, e.g., [2]). These considerations
involve, in the first place, a strategic perspective on establishing drone depots.
To overcome the limitations of drones that are restricted to an existing (central)
depot, Kim and Moon introduced the Traveling Salesman Problem with a Drone
Station (TSP-DS) as a generalization of the PDSTSP [4]. In the TSP-DS, there
is a single truck located at a central depot and a single drone station. Once the
drone station has been supplied by the truck, the drones (located at the sta-
tion) are used for distributing parcels. As in the PDSTSP, the objective of the
TSP-DS is to serve all customers with minimal makespan. In particular, Kim
and Moon can show that, under special assumptions, it is possible to determine
a priori which customers should be served by truck or drone, respectively. In
these special cases, it is possible to decompose the problem into an independent
Traveling Salesman Problem in order to determine the route of the truck and
a Parallel Machine Scheduling Problem to generate the operations at the drone
station. To the best of our knowledge, there is no other published research arti-
cle addressing challenges of the location problem in the context of the PDSTSP.
However, the general domain of combined location-routing problems is not new
(see, e.g., [6,8], and references therein).
In this paper, we introduce the Traveling Salesman Drone Station Location
Problem (TSDSLP). In this problem, we address a research gap by considering
the routing of the truck, facility location of the drone stations, and scheduling
of drones in an integrated model. In the TSDSLP, we assume that a single truck
is located at a central depot and a set of demand as well as a multitude of pos-
sible drone station locations are given. We require that the truck is sufficient to
serve all customers with regards to its capacity and the scheduling horizon. The
TSDSLP asks for a feasible route and drone operations, such that all customers
are served, no more than a permitted number of drone stations are used, and
minimal makespan (or operational cost) is achieved. To this end, a subset of
drone stations, that might accommodate a given number of autonomous drones,
can be visited by the truck. If the truck supplies a station with parcels, then the
station is considered as open, and the drones, that are resting there, can begin to
deliver these packages in parallel. In an effort to reduce delivery times and costs,
this concept might be a viable solution for the integration of autonomous units

in last-mile and same-day delivery. In particular, compared to existing spacious


depots at remote locations, drone stations might be a small-sized and low-cost
infrastructure that provides shelter for parcels and drones as well as a recharging
possibility for the latter in close proximity to demand centers [4].
The remainder of this paper is organized as follows. In Sect. 2, we specify the
assumptions of the TSDSLP and formulate it as a Mixed Integer Linear Program
(MILP). Computational experiments and their numerical results are presented
in Sect. 3. Finally, concluding remarks are drawn in Sect. 4.

2 Problem Definition
In this section, we introduce the TSDSLP that covers a more general case than
the PDSTSP and TSP-DS [4,7]. More precisely, the TSDSLP not only integrates
possibility of drones deliveries into the TSP tours, but also assumes the presence
of multiple drone stations, which might be opened to be used for drone deliveries.
Suppose that a single truck located at a depot, a multitude of drone sta-
tions that can accommodate a fixed number of drones, and a set of customer
locations, each of them with a single demand, are given. In the TSDSLP, the
objective consists in minimizing the makespan (or operational cost) such that all
customers are served, either by the truck or by a drone. Furthermore, we accept
the following assumptions regarding the nature of drones [1,7,10–13,15]:

– When launched from a station, each drone can fulfill exactly one request and
then it needs to return to the same station from which it was launched to
be resupplied for future missions. Without loss of generality, we assume that
each customer is eligible to be served by a drone.
– We assume that a drone has a limited endurance of E distance units per
operation. After returning to the station, the battery of the drone is recharged
(or swapped) instantaneously.
– While the truck is subject to the limitations of the road network, the drones
might be able to use a different trajectory for traveling between locations.
Furthermore, based on their technical constraints, the speed of the truck and
drone might differ. Hence, without loss of generality, the average velocity of
each truck is assumed to be equal to 1 and the relative velocity of each drone
is assumed to be α ∈ R+ times the velocity of the truck.
– We do not consider explicit service times; however, such considerations might
be easily integrated into the model.
– We assume that at most C ∈ Z+ drone stations can be opened and they
are free of charge, except if we minimize the operational cost (see Sect. 2.2).
Furthermore, we require that the potential sites of these stations have been
determined already.

Let us present the notation that we are going to use throughout the paper.
Assume that a complete and symmetric graph G = (V, E) is given, where V is the
set of vertices and E is the set of edges. The set V contains n vertices associated
with the customers, named VN = {1, . . . , n}, a set of m possible drone stations

VS = {s1 , . . . , sm }, and two extra vertices 0 and n + 1 that (in order to simplify
the notation and the mathematical formulation) both represent the same depot
location at the start and end of the truck’s tour. Thus, V = {0}∪VN ∪VS ∪{n+1},
where 0 ≡ n+1. To simplify the notation, we introduce the sets VL = V \{n+1}
and VR = V \ {0}.
By the parameters dij and d′ij we define the distance required to travel from vertex i to vertex j by truck and drone, respectively. In principle, as the drones are not limited to the road network, the distances might differ. For the purpose of this work, the Euclidean distance is considered for both the truck and the drones. We use the parameters v = 1 and v′ = α · v to define the (constant) velocities of the truck and the drones. Hence, the time required to traverse edge (i, j) is defined as tij = dij and t′ij = d′ij/α for the truck and the drones, respectively.
We assume that each drone station accommodates a limited and identical
number of drones D := {1, . . . , Dn }, where Dn ∈ Z>0 . Each drone may travel
a maximum span of E distance units per operation, where a drone operation is
characterized by a triple (d, s, j) as follows: the drone d ∈ D is launched from
a drone station s ∈ VS , fulfills a request at j ∈ VN , and returns to the same
station from which it was launched.
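Under these assumptions, the feasibility and duration of a single drone operation (d, s, j) can be sketched as follows; the coordinates and parameter values are hypothetical.

```python
from math import hypot

def drone_operation(station, customer, E, alpha):
    """Return (feasible, duration) of a drone operation launched from `station`
    to serve `customer` and return, given endurance E and relative velocity alpha."""
    d = hypot(customer[0] - station[0], customer[1] - station[1])  # Euclidean drone distance
    feasible = 2 * d <= E           # the out-and-back flight must fit the endurance
    duration = 2 * d / alpha        # t'_sj = d'_sj / alpha, flown twice
    return feasible, duration

# Hypothetical data: station at (8, 8), customer at (12, 11), E = 16 km, alpha = 2.
print(drone_operation((8, 8), (12, 11), E=16, alpha=2))
```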
Figure 1 shows an example of a TSDSLP instance and potential TSP and
TSDSLP solutions.


Fig. 1. A TSDSLP with a depot D, four customers VN = {1 . . . 4}, two drone stations
VS = {s1 , s2 } that can accommodate two drones each, a TSP solution (middle figure)
and a TSDSLP solution (right figure) in which a station is utilized for two deliveries

2.1 Minimal Makespan TSDSLP

In order to formulate the TSDSLP, we introduce the following decision variables:

– τ ∈ R≥0: continuous variable that defines the makespan.
– xij ∈ {0, 1}, ∀i ∈ VL, j ∈ VR: is equal to 1, iff arc (i, j) is part of the truck's route.
– xsij ∈ {0, 1}, ∀i ∈ VL, j ∈ VR, s ∈ VS: is equal to 1, iff arc (i, j) is part of the truck's route to the drone station s.
– ydsj ∈ {0, 1}, ∀s ∈ VS, j ∈ VN, d ∈ D: is equal to 1, iff drone d serves j from station s.
– zs ∈ {0, 1}, ∀s ∈ VS: is equal to 1, iff drone station s is opened.

Using this notation, we have the following MILP formulation of the TSDSLP:
min τ, (1)

s.t.  Σ_{i∈VL} Σ_{j∈VR, j≠i} tij xij ≤ τ, (2)

Σ_{i∈VL} Σ_{j∈VR, j≠i} tij xsij + Σ_{j∈VN} 2 · t′sj · ydsj ≤ τ : ∀s ∈ VS, d ∈ D, (3)

Σ_{i∈VL, i≠j} xij + Σ_{s∈VS} Σ_{d∈D} ydsj = 1 : ∀j ∈ VN, (4)

Σ_{j∈VR} x0j = Σ_{i∈VL} xi,n+1 = 1, (5)

Σ_{i∈VL, i≠k} xik − Σ_{j∈VR, j≠k} xkj = 0 : ∀k ∈ VN ∪ VS, (6)

Σ_{i∈S} Σ_{j∈S, j≠i} xij ≤ |S| − 1 : ∀S ⊂ V with 0, n+1 ∉ S and |S| > 1, (7)

xsij ≤ xij : ∀s ∈ VS, i ∈ VL, j ∈ VR, i ≠ j, (8)

Σ_{j∈VR} xs0j = 1 : ∀s ∈ VS, (9)

Σ_{i∈VL, i≠k} xsik − Σ_{j∈VR, j≠k} xskj = 0 : ∀s ∈ VS, k ∈ VN ∪ VS, s ≠ k, (10)

Σ_{i∈VL, i≠s} xsis − Σ_{i∈VL, i≠s} xis = 0 : ∀s ∈ VS, (11)

Σ_{i∈VL, i≠s} xis ≥ zs : ∀s ∈ VS, (12)

Σ_{s∈VS} zs ≤ C, (13)

Σ_{d∈D} Σ_{j∈VN} ydsj ≤ n · zs : ∀s ∈ VS, (14)

2 · d′sj · ydsj ≤ E : ∀s ∈ VS, d ∈ D, j ∈ VN. (15)

In this model, the objective function (1) minimizes the makespan τ . Con-
straints (2) and (3) describe τ mathematically. More precisely, constraint (2)
sets the time spent traveling by the truck (to serve customers and supply sta-
tions) as a lower bound on the objective value. In constraints (3), for each station
s, we account for the time until the truck has reached the station and then, for
each drone d located at the station, the time spent fulfilling requests. These
values are summed up to define lower bounds on τ .
Constraints (4) guarantee that each request j is served exactly once by either
the truck or a drone. The flow of the truck is defined through constraints (5)–(6).

More precisely, constraints (5) ensure that the truck starts and concludes its tour
exactly once. For each customer or drone station, constraints (6) guarantee that
the flow is preserved, i.e., the number of incoming arcs must equal the number
of outgoing arcs. Moreover, constraints (7) serve as classical subtour elimination
constraints, i.e., for each proper non-empty subset of vertices S (that does not
contain the depot), no more than |S| − 1 arcs can be selected within this set.
Constraints (8)–(11) specify the route of the truck that leads to each drone
station s. To this end, constraints (8) ensure that this path must follow the path
of the truck. Moreover, constraints (9) guarantee that the departure from the
depot is always a part of each route to a station. Furthermore, for each vertex
k that might be located in between the depot and the station, constraints (10)
preserve the flow. In addition, for each station s that is visited by the truck,
constraints (11) guarantee that there is exactly one arc leading to the station.
As specified by constraints (12), a drone station is opened only if it is visited
by the truck. Moreover, constraint (13) guarantees that at most C drone stations
may be opened. Constraints (14) restrict the number of drone operations that
can only be performed at opened drone stations. Constraints (15) determine
the drone stations’ range of operation. Note that these constraints might be
effectively handled during preprocessing. Finally, according to the definition of
the decision variables, τ ∈ R≥0 and the other decision variables are binary.
In place of constraints (7), it is possible to adapt the family of Miller-Tucker-
Zemlin (MTZ) constraints, using auxiliary integer variables ui , as follows [5]:

u0 = 1, (16)
2 ≤ ui ≤ n + m + 2 : ∀i ∈ VR, (17)
ui − uj + 1 ≤ (n + m + 2)(1 − xij) : ∀i ∈ VL, j ∈ VR, i ≠ j, (18)
ui ∈ Z+ : ∀i ∈ V. (19)
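As a sketch of how this MTZ variant could be stated with a general-purpose solver, the snippet below uses the gurobipy package (the paper reports using Gurobi, but this is not the authors' code); the toy instance sizes and variable containers are hypothetical placeholders.

```python
import gurobipy as gp
from gurobipy import GRB

# Hypothetical toy instance: depot 0, customers {1, 2, 3}, one station {4}, end depot 5.
n, m = 3, 1
V = range(n + m + 2)
V_L = [i for i in V if i != n + m + 1]   # all vertices except the end depot
V_R = [i for i in V if i != 0]           # all vertices except the start depot

model = gp.Model("tsdslp_mtz_sketch")
x = model.addVars(V_L, V_R, vtype=GRB.BINARY, name="x")   # truck arc variables
u = model.addVars(V, vtype=GRB.INTEGER, name="u")         # MTZ ordering variables

# MTZ constraints (16)-(19).
model.addConstr(u[0] == 1)
for i in V_R:
    model.addConstr(u[i] >= 2)
    model.addConstr(u[i] <= n + m + 2)
for i in V_L:
    for j in V_R:
        if i != j:
            model.addConstr(u[i] - u[j] + 1 <= (n + m + 2) * (1 - x[i, j]))
```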

2.2 Minimal Operational Cost TSDSLP

As an alternative to the makespan minimization, we might be interested in cost


minimization instead. In this case, we consider only the variable cost-per-mile
that might be associated with the truck and drones and the fixed cost of using
a station. To this end, we might use the following objective function:

min  ct · Σ_{i∈VL} Σ_{j∈VR, j≠i} dij xij + cd · Σ_{s∈VS} Σ_{d∈D} Σ_{j∈VN} (d′sj + d′js) ydsj + Σ_{s∈VS} fs zs (20)

where fs is the fixed cost of opening and using station s, and the parameters
ct , cd ∈ R+ determine the relative cost for each mile that the truck and drones
are in operation. In this case, the model can remain unchanged with the excep-
tion that it is not necessary to consider the variables τ , xsij and the respective
constraints associated with these variables.

3 Computational Experiments
We implemented the model (1)–(15) and solved it by the MILP solver Gurobi
Optimizer 8.1.0. Throughout the solution process, the subtour elimination con-
straints (7) were treated as lazy constraints. More precisely, whenever the solver
determines a new candidate incumbent integer-feasible solution, we examine if
it contains any subtour. If no subtour is contained, we have a feasible solution.
Otherwise, we calculate the set of vertices S that is implied by the shortest sub-
tour contained in the current candidate solution. For this set S, constraint (7) is
added as a cutting plane and the solver continues with its inbuilt branch-and-cut
procedure. For comparative purposes, we solved also the alternative formulation
of the TSDSLP in which the MTZ constraints (16)–(19), in place of (7), are
used. We carried out all experiments on single compute nodes in an Intel Xeon
E5-2670 CPU cluster where each node was allocated 8 GB of RAM. A time limit
of 10 min was imposed on the solver.
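The lazy-constraint mechanism described above can be sketched with gurobipy as follows; the attribute names model._x, model._depot and model._end, as well as the simplified subtour-detection routine, are our own illustrative assumptions, not the authors' implementation.

```python
import gurobipy as gp
from gurobipy import GRB

def shortest_subtour(selected_arcs, depot, end_depot):
    """Return the vertex set of the shortest cycle avoiding the depot,
    or None if the candidate route contains no subtour."""
    succ = dict(selected_arcs)                 # each vertex has at most one outgoing arc
    unvisited = set(succ) - {depot, end_depot}
    cycles = []
    while unvisited:
        walk, i = [], next(iter(unvisited))
        while i in succ and i not in walk:
            walk.append(i)
            i = succ[i]
        if walk and i == walk[0]:              # the walk closed on itself: a subtour
            cycles.append(walk)
        unvisited -= set(walk)
    return min(cycles, key=len) if cycles else None

def subtour_callback(model, where):
    """Whenever a candidate incumbent is found, add constraint (7) as a lazy cut
    for the shortest subtour it contains."""
    if where == GRB.Callback.MIPSOL:
        vals = model.cbGetSolution(model._x)
        arcs = [(i, j) for i, j in model._x.keys() if vals[i, j] > 0.5]
        S = shortest_subtour(arcs, model._depot, model._end)
        if S:
            model.cbLazy(gp.quicksum(model._x[i, j] for i in S for j in S if i != j)
                         <= len(S) - 1)

# Usage sketch (x and the depot indices are defined when building the model):
# model._x, model._depot, model._end = x, 0, n + m + 1
# model.Params.LazyConstraints = 1
# model.optimize(subtour_callback)
```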
We generated the test instances according to the scheme shown in Fig. 2.
More precisely, we considered a 32 × 32 km2 square grid where the customer
locations VN = {1, . . . , n}, n ∈ {10, 30, 50} were generated under uniform dis-
tribution. Furthermore, we assumed that the drone stations are located at the
coordinates (x, y) ∈ (8, {8, 24}) ∪ (24, {8, 24}). Moreover, we considered a cen-
tral depot at (x, y) = (16, 16). We investigated different cases; more precisely,
the basic one follows the assumption of Murray and Chu [7], where drones have
a maximum range of operation of E = 16 km. Therefore, the radius of opera-
tion associated with each station is Er = E/2 = 8 km. In order to broaden our
experiments, we tested the model for two other values of Er .
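A small sketch of the described generation scheme (32 × 32 km2 grid, uniformly distributed customers, four fixed station sites and a central depot); the function name and seed handling are our own.

```python
import random

def generate_instance(n, seed=0):
    """Generate a TSDSLP instance on a 32 x 32 km^2 grid as described above."""
    rng = random.Random(seed)
    customers = [(rng.uniform(0, 32), rng.uniform(0, 32)) for _ in range(n)]
    stations = [(8, 8), (8, 24), (24, 8), (24, 24)]   # fixed drone station sites
    depot = (16, 16)                                   # central depot
    return depot, customers, stations

depot, customers, stations = generate_instance(n=10, seed=42)
print(depot, stations, customers[:3])
```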

Fig. 2. A visualization of the scheme according to which instances are generated

In order to study the influence of problem parameters on the solver and the
solutions, we considered their domains as follows. We tested for three different
possible values for C, i.e., C ∈ {1, 2, 3}, and also did experiments for three
distinct number of identical drones that a drone station can hold, i.e., |D| ∈
{1, 2, 3}. Moreover, we let the relative velocity α be one of the values from
{0.5, 1, 2, 3} and we assumed that the operational radius Er ∈ {8, 12, 16}.

For each value of n ∈ {10, 30, 50}, we generated 10 random instances, which
(along with the drone stations and the location of the depot) specify the graph G
(refer to [14] for the instances). Furthermore, based on our choice of parameters
C, |D|, α and E, we have 3 · 3 · 4 · 3 = 108 parameter vectors. Therefore, we have
a total of 30 · 108 = 3240 problems that are solved through both formulations.
Table 1 contains the numerical results of our computational experiments
using two different formulations of the TSDSLP. In particular, for each num-
ber of customers n, the permitted number of stations C, and the operational
radius Er , this table shows the average run-time t (in seconds) as well as the
average MIP gap. While comparing the results of two formulations, we observe
that the differences on instances with n = 10 are negligible; however, on larger
instances, the prohibition of subtours through lazy constraints improves the aver-
age run-times and MIP gaps significantly. Although medium-sized instances can
be solved within reasonable time, the run-time depends strongly on n and the
parameters.

Table 1. Influence of the instance size n, the number of permitted stations C, the
radius of operation Er , and the formulation on the run-time (s.) as well as MIP gap

              MILP (1)–(15) with lazy constraints        MILP (1)–(6), (8)–(19)
              Er=8         Er=12        Er=16            Er=8         Er=12        Er=16
 n  C         t      Gap   t      Gap   t      Gap       t      Gap   t      Gap   t      Gap
10  1         1.1    0.0%  1.3    0.0%  1.4    0.0%      1.9    0.0%  1.8    0.0%  2.0    0.0%
10  2         1.1    0.0%  1.7    0.0%  2.3    0.0%      1.6    0.0%  2.1    0.0%  2.9    0.0%
10  3         1.0    0.0%  1.7    0.0%  3.1    0.0%      1.6    0.0%  2.0    0.0%  3.0    0.0%
30  1         16.1   0.0%  28.0   0.0%  37.3   0.0%      83.2   0.0%  145.9  0.2%  198.8  0.3%
30  2         43.9   0.0%  80.0   0.0%  178.8  0.6%      187.6  0.9%  301.5  2.3%  423.8  5.4%
30  3         67.2   0.0%  227.6  0.9%  331.6  3.3%      228.9  1.6%  350.2  4.6%  442.0  8.7%
50  1         113.5  0.0%  292.4  0.2%  379.6  1.1%      430.8  2.1%  562.6  8.4%  591.0  13.2%
50  2         289.1  0.5%  534.8  5.3%  583.9  11.4%     505.4  4.9%  600.2  18.9% 600.3  24.0%
50  3         385.5  2.7%  549.8  13.7% 591.2  23.0%     526.6  7.2%  599.7  23.2% 597.0  28.4%

For the purpose of illustrating the benefits of utilizing the drone stations with regards to makespan reduction, we introduce the following metric, where τ is the objective value returned by the solver and τ*TSP is the optimal objective value of the TSP (that does not visit or use any drone station):

Δ = 100% − τ / τ*TSP (21)

Figure 3 highlights the numerical results. More precisely, this figure shows
the average savings over the TSP, i.e., Δ, based on the number of permitted
stations C, the number of drones located at each station |D|, as well as the
drones’ relative velocity α and radius of operation Er . Overall, we can distinguish
two cases. If the radius of operation is small (Er = 8, solid lines) and the number

of permitted stations C is fixed, the savings are nearly independent from the
number of drones at each station and their velocity and radius of operation. In
this case, the number of customers that can be served by the drones is limited
(see Fig. 2). However, even a slow-moving drone can effectively serve most (or all)
customers within its radius of operation. An increase in the number of drones (or
their relative velocity) will in this case not improve the overall makespan. On the
other hand, if the radius of operation is large (Er = 16, dashed lines), there is a
significant impact of these parameters on the savings. In this case, the makespan
can be reduced effectively by increasing the number of drones (or their relative
velocity). Furthermore, it is worth highlighting that, in many cases, significant
savings are already possible with few drones (per station) that have a relative
velocity of α ∈ {0.5, 1} but a large operational range. This contrasts problems
that follow the fundamental idea of the FSTSP, where drones with relatively
small endurance but fast relative velocity are often preferred [1,10–13].


Fig. 3. The savings Δ for different values of the problem parameters (averaged over
all instances). Solid and dashed lines correspond to Er = 8 and Er = 16, respectively

4 Conclusion
In this work, we introduced the Traveling Salesman Drone Station Location
Problem (TSDSLP), which combines Traveling Salesman Problem and Facility
Location Problem in which facilities are drone stations. After formulating the
problem as a MILP, we presented the results of our computational experiments.
According to the numerical results, using suitable drone stations can bring sig-
nificant reduction in the delivery time.
Since TSDSLP defines a new concept, the future research directions are
numerous, e.g., a research idea might consist in studying the case of using mul-
tiple trucks in place of a single one. Another research direction might focus on the
design of efficient solution methods. In fact, the standard solvers are able to solve

only small TSDSLP instances; hence, we might design effective heuristics, which
can address large-scale instances. The research in this direction is in progress
and the results will be reported in the future.

References
1. Agatz, N., Bouman, P., Schmidt, M.: Optimization approaches for the traveling
salesman problem with drone. Transp. Sci. 52(4), 965–981 (2018)
2. Chauhan, D., Unnikrishnan, A., Figliozzi, M.: Maximum coverage capacitated facil-
ity location problem with range constrained drones. Transp. Res. Part C: Emerg.
Technol. 1–18 (2019)
3. Dorling, K., Heinrichs, J., Messier, G.G., Magierowski, S.: Vehicle routing problems
for drone delivery. IEEE Trans. Syst. Man Cybern. Syst. 47(1), 70–85 (2017)
4. Kim, S., Moon, I.: Traveling salesman problem with a drone station. IEEE Trans.
Syst. Man Cybern. Syst. 49(1), 42–52 (2018)
5. Miller, C.E., Tucker, A.W., Zemlin, R.A.: Integer programming formulation of
traveling salesman problems. J. ACM 7(4), 326–329 (1960)
6. Min, H., Jayaraman, V., Srivastava, R.: Combined location-routing problems: a
synthesis and future research directions. Eur. J. Oper. Res. 108(1), 1–15 (1998)
7. Murray, C.C., Chu, A.G.: The flying sidekick traveling salesman problem: opti-
mization of drone-assisted parcel delivery. Transp. Res. Part C: Emerg. Technol.
54, 86–109 (2015)
8. Nagy, G., Salhi, S.: Location-routing: issues, models and methods. Eur. J. Oper.
Res. 177(2), 649–672 (2006)
9. Otto, A., Agatz, N., Campbell, J., Golden, B., Pesch, E.: Optimization approaches
for civil applications of unmanned aerial vehicles (UAVs) or aerial drones: a survey.
Networks 72(4), 411–458 (2018)
10. Schermer, D., Moeini, M., Wendt, O.: Algorithms for solving the vehicle routing
problem with drones. In: LNCS, vol. 10751, pp. 352–361 (2018)
11. Schermer, D., Moeini, M., Wendt, O.: A variable neighborhood search algorithm
for solving the vehicle routing problem with drones (Technical report), pp. 1–33.
BISOR, Technische Universität Kaiserslautern (2018)
12. Schermer, D., Moeini, M., Wendt, O.: A hybrid VNS/Tabu search algorithm for
solving the vehicle routing problem with drones and en route operations. Comput.
Oper. Res. 109, 134–158 (2019). https://doi.org/10.1016/j.cor.2019.04.021
13. Schermer, D., Moeini, M., Wendt, O.: A matheuristic for the vehicle routing prob-
lem with drones and its variants (Technical report), pp. 1–37. BISOR, Technische
Universität Kaiserslautern (2019)
14. Schermer, D., Moeini, M., Wendt, O.: Instances for the traveling salesman
drone station location problem (TSDSLP) (2019). https://doi.org/10.5281/zenodo.
2594795
15. Wang, X., Poikonen, S., Golden, B.: The vehicle routing problem with drones:
several worst-case results. Optim. Lett. 11(4), 679–697 (2016)
Two-Machine Flow Shop with a Dynamic
Storage Space and UET Operations

Joanna Berlińska1, Alexander Kononov2, and Yakov Zinder3

1 Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland
Joanna.Berlinska@amu.edu.pl
2 Sobolev Institute of Mathematics, Novosibirsk, Russia
alvenko@math.nsc.ru
3 School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, Australia
Yakov.Zinder@uts.edu.au

Abstract. The paper establishes the NP-hardness in the strong sense


of a two-machine flow shop scheduling problem with unit execution time
(UET) operations, dynamic storage availability, job dependent storage
requirements, and the objective to minimise the time required for the
completion of all jobs, i.e. to minimise the makespan. Each job seizes
the required storage space for the entire period from the start of its pro-
cessing on the first machine till the completion of its processing on the
second machine. The considered scheduling problem has several applica-
tions, including star data gathering networks and certain supply chains
and manufacturing systems. The NP-hardness result is complemented
by a polynomial-time approximation scheme (PTAS) and several heuris-
tics. The presented heuristics are compared by means of computational
experiments.

Keywords: Two-machine flow shop · Makespan · Dynamic storage ·


Computational complexity · Polynomial-time approximation scheme

1 Introduction
This paper presents a proof of the NP-hardness in the strong sense and a
polynomial-time approximation scheme (PTAS) for the two-machine flow shop,
where the duration of each operation is one unit of time and where, in order
to be processed, each job requires a certain amount of an additional resource,
which will be referred to as a storage space (buffer). The storage requirement
varies from job to job, and the availability of the storage space (buffer capacity)
varies in time. The goal is to minimise the time needed to complete all jobs.
The presented computational complexity results are complemented by several
heuristics which are compared by means of computational experiments.
The considered problem arises in star data gathering networks where datasets
from the worker nodes are to be transferred to the base station for processing.

Data transfer can commence only if the available memory of the base station is
not less than the size of the dataset that is to be transferred. The amount of
memory, occupied by a dataset, varies from worker node to worker node. Only
one node can transfer data to the base station at a time, although during this
process the base station can process one of the previously transferred datasets.
The memory, consumed by this dataset, is released only at the completion of pro-
cessing the dataset by the base station. The base station has a limited memory
whose availability varies in time due to other processes.
The existing publications on scheduling in the data gathering networks
assume that the exact time, needed for transferring each dataset, and the exact
time, required by the base station for its processing after this transfer, are known
in advance (see, for example, [1–3,10]). In reality, the exact duration of trans-
ferring a dataset and the duration of its processing by the base station may be
difficult to estimate, and only an upper bound may be known. In such a situa-
tion, the allocation of dataset independent time slots for transferring data and
for processing a dataset by the base station, may be a more adequate option.
This approach is analysed in this paper. The paper also relaxes the assumption,
which is normally made in the literature (see, for example, [3]), that, during the
planning horizon, the available amount of memory remains the same.
Another area that is relevant to this paper is transportation and manufactur-
ing systems where two consecutive operations use the same storage space that is
allocated to a job at the beginning of its first operation and is released only at
the completion of this job. For example, in supply chains, where goods are trans-
ported in containers or pallets with the consecutive use of two different types
of vehicles, the unloading of one vehicle and the loading onto another normally
require certain storage space. Although the storage requirements of different con-
tainers or pallets can vary significantly, the durations of loading and unloading
by a crane practically remain the same regardless of their sizes.
The considered scheduling problem can be stated as follows. The jobs, com-
prising the set N = {1, ..., n}, are to be processed on two machines, machine
M1 and machine M2 . Each job should be processed during one unit of time on
machine M1 (the first operation of the job), and then during one unit of time
on machine M2 (the second operation of the job). Each machine can process at
most one job at a time, and each job can be processed on at most one machine
at a time. If a machine starts processing a job, it continues its processing until
the completion of the corresponding operation, i.e. no preemptions are allowed.
A schedule σ specifies for each j ∈ N the point in time Sj (σ) when job j starts
processing on machine M1 and the point in time Cj (σ) when job j completes
processing on machine M2 . In order to be processed each job j requires ωj
units of an additional resource. These ωj units are seized by job j during the
time interval [Sj (σ), Cj (σ)). At any point in time t the available amount of the
resource is specified by the function Ω(t), i.e. any schedule σ, at any point in
time t, should satisfy the condition

$$\sum_{\{j:\ S_j(\sigma) \le t < C_j(\sigma)\}} \omega_j \le \Omega(t).$$

It is assumed that the buffer capacity Ω(t) changes only at integer t.


The processing of jobs commences at time t = 0, and the goal is to find a
schedule that minimises the makespan

$$C_{\max}(\sigma) = \max_{j \in N} C_j(\sigma).$$

In what follows, if it is clear what schedule is considered, the notation Sj (σ) and
Cj (σ) will be replaced by Sj and Cj .
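To make the feasibility condition concrete, a small Python sketch is given below. It is illustrative only (not part of the original formulation) and assumes that each job moves to machine M2 immediately after M1, i.e. that job j occupies its storage during [Sj, Sj + 2), which is without loss of generality for UET operations as discussed later in this section.

```python
def check_schedule(start, omega, Omega):
    """Return the makespan of a UET schedule with integer start times, or None
    if a constraint is violated.  Job j occupies omega[j] buffer units during
    [start[j], start[j] + 2); Omega(t) is the capacity during [t, t+1)."""
    n = len(omega)
    if len(set(start)) != n:
        return None                      # two jobs would start on M1 simultaneously
    horizon = max(start) + 2
    for t in range(horizon):             # Omega changes only at integer t
        used = sum(omega[j] for j in range(n) if start[j] <= t < start[j] + 2)
        if used > Omega(t):
            return None                  # buffer capacity exceeded in [t, t+1)
    return horizon
```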
A schedule is a permutation schedule if the order in which the jobs are pro-
cessed on machine M1 , and the order in which the jobs are processed on machine
M2 , are the same. In the case of arbitrary processing times, an optimal schedule
that is also a permutation schedule may not exist even if the availability of stor-
age space does not change [7]. Furthermore, in this case, the problem remains
NP-hard in the strong sense even if the order in which the jobs are processed
on one of the machines, is given [9]. In contrast to the case of arbitrary pro-
cessing times, for the problem with the unit execution time (UET) operations,
considered in this paper, there always exists an optimal schedule, where each
job starts processing on machine M2 at the moment when it completes its pro-
cessing on machine M1 . So, one can assume that the resource is allocated to
operations rather than to jobs. This makes this paper relevant to the publica-
tions on resource constrained scheduling with UET operations [4,5,11,12], which
consider a flow shop with an additional resource and assume that the resource
is allocated to operations. This paper contributes to the body of literature on
resource constrained scheduling by presenting the results for the case when the
availability of the resource changes in time.
The rest of the paper is organised as follows. Section 2 presents a proof that
the problem is strongly NP-hard. Section 3 describes a polynomial-time approxi-
mation scheme. An integer linear program approach can be found in Sect. 4. The
proposed heuristic algorithms are described in Sect. 5, and the results of their
comparison by computational experiments are given in Sect. 6. The last section
comprises conclusions.

2 NP-Hardness in the Strong Sense


The computational complexity result, presented in this section, is obtained by
a reduction from the following Numerical Matching with Target Sums decision
problem, which is NP-complete in the strong sense [8].
INPUT: three sets $\{x_1, \dots, x_r\}$, $\{y_1, \dots, y_r\}$ and $\{z_1, \dots, z_r\}$ of positive integers, where
$$\sum_{i=1}^{r} x_i + \sum_{i=1}^{r} y_i = \sum_{i=1}^{r} z_i \qquad (1)$$

QUESTION: do there exist permutations $(i_1, \dots, i_r)$ and $(j_1, \dots, j_r)$ of the indices $1, \dots, r$ such that $z_k = x_{i_k} + y_{j_k}$ for all $k \in \{1, \dots, r\}$?
Theorem 1. The considered scheduling problem is NP-hard in the strong sense.

Proof. Let
$$x = \max_{1 \le i \le r} x_i \quad \text{and} \quad Z = x + \max_{1 \le i \le r} y_i + \max_{1 \le i \le r} z_i$$

and let the storage availability (capacity) be determined by the function




$$\Omega(t) = \begin{cases} 2Z + x & \text{if } 0 \le t < 1;\\ 3Z + x + z_k & \text{if } 3k - 2 \le t < 3k - 1 \text{ for } k \in \{1, \dots, r\};\\ 2Z + x & \text{if } 3k - 1 \le t < 3k + 1 \text{ for } k \in \{1, \dots, r - 1\};\\ 2Z + x & \text{if } 3r - 1 \le t. \end{cases} \qquad (2)$$

It will be shown that the jobs {1, ..., 2r} with the storage requirements

$$\omega_i = \begin{cases} 2Z + x_i & \text{if } 1 \le i \le r;\\ Z + y_{i-r} + x & \text{if } r + 1 \le i \le 2r \end{cases} \qquad (3)$$

can be completed in the time interval [0, 3r] if and only if the corresponding
instance of the Numerical Matching with Target Sums has answer YES.
Suppose that there exists a schedule whose makespan is less than or equal
to 3r. Observe that for any point in time t such that 0 ≤ t ≤ 3r and any job j,
ωj ≤ Ω(t), but if at this point in time Ω(t) = 2Z + x, then at most one job can
use the buffer at t.
Since, after the completion of a job, the next job cannot be completed earlier
than after one unit of time and since all time intervals where Ω(t) = 3Z + x + zk ,
i.e. when several jobs can use the storage space simultaneously, are disjoint
unit time intervals, at any point in time at most two jobs can use the buffer
simultaneously.
The total processing time of all jobs is 4r, whereas the total length of all
time intervals before the point in time 3r where Ω(t) = 2Z + x and therefore
where at most one job can be processed is 2r. Hence, taking into account that
the makespan is 3r, the remaining 2r units of processing time are allocated to
the disjoint unit time intervals of total length r. In other words, in each of these
unit time intervals, the machines process concurrently two jobs and each of these
two jobs is processed during the entire unit time interval.
On the other hand, if the same job is processed in two disjoint time intervals
$[3k' - 2, 3k' - 1]$ and $[3k'' - 2, 3k'' - 1]$, where $1 \le k' < k'' \le r$, then no jobs
are processed in the time interval $(3k' - 1, 3k'' - 2)$ where the buffer capacity is
2Z + x. This contradicts the assumption that the makespan does not exceed 3r,
because only 2r units of processing time can be allocated to the disjoint time
intervals where jobs can be processed concurrently, and the remaining 2r units
must be allocated to the time intervals where the buffer capacity is 2Z + x.
If two jobs that are processed concurrently are both from the set {r+1, ..., 2r},
then it leaves only r − 2 jobs from {r + 1, ..., 2r} for processing in the remaining
r − 1 disjoint unit time intervals. Hence, in this case, in at least one such unit
time interval, two jobs from the set {1, ..., r} are processed concurrently, which
violates the buffer capacity. Consequently, for each 1 ≤ k ≤ r, in the time interval
$[3k - 2, 3k - 1]$, some job $i_k \in \{1, \dots, r\}$ is processed concurrently with some job $j_k \in \{r + 1, \dots, 2r\}$, and for any two different $k'$ and $k''$ from $\{1, \dots, r\}$ we have $i_{k'} \ne i_{k''}$ and $j_{k'} \ne j_{k''}$.
For each $1 \le k \le r$, since $i_k$ and $j_k$ are processed concurrently, they satisfy the inequality
$$\omega_{i_k} + \omega_{j_k} \le 3Z + x + z_k$$
which, by virtue of (3), implies $x_{i_k} + y_{j_k} \le z_k$ which, in turn, by virtue of (1) gives
$$x_{i_k} + y_{j_k} = z_k.$$
Suppose now that the instance of the Numerical Matching with Target Sums
has answer YES, i.e. there exist permutations (i1 , ..., ir ) and (j1 , ..., jr ) of the
indices $1, \dots, r$ such that $z_k = x_{i_k} + y_{j_k}$ for all $k \in \{1, \dots, r\}$. Then, the schedule where, for each $1 \le k \le r$, $S_{i_k} = 3(k - 1)$ and $S_{j_k + r} = 3k - 2$ has the required makespan of 3r. □
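As an illustration of the reduction, the instance defined by (2) and (3) can be generated as in the following sketch (explanatory only, not part of the original construction; here the variable X plays the role of x = max_i x_i in the proof). Given a YES matching, the corresponding schedule simply sets $S_{i_k} = 3(k-1)$ and $S_{j_k + r} = 3k - 2$.

```python
def reduction_instance(x, y, z):
    """Build the storage requirements (3) and capacity function (2) from an
    instance (x, y, z) of Numerical Matching with Target Sums."""
    r = len(x)
    X = max(x)
    Z = X + max(y) + max(z)
    # jobs 1..r have requirement 2Z + x_i, jobs r+1..2r have Z + y_{i-r} + X
    omega = [2 * Z + xi for xi in x] + [Z + yi + X for yi in y]

    def Omega(t):
        t = int(t)
        # capacity 3Z + X + z_k during [3k - 2, 3k - 1) for k = 1..r,
        # and 2Z + X in every other unit interval
        if t >= 1 and t % 3 == 1 and (t + 2) // 3 <= r:
            return 3 * Z + X + z[(t + 2) // 3 - 1]
        return 2 * Z + X

    return omega, Omega
```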

3 Polynomial-Time Approximation Scheme

For any $\varepsilon > 0$ denote $k = \lfloor n\varepsilon/2 \rfloor$ and $q = \lceil 2/\varepsilon \rceil$.
Theorem 2. For any given small $\varepsilon > 0$, a schedule $\sigma$ such that
$$C_{\max}(\sigma) \le (1 + \varepsilon) C_{\max}(\sigma^*), \qquad (4)$$
where $\sigma^*$ is an optimal schedule, can be constructed in $O(q^2 n^q)$ time.
Proof. The proof is based on the idea in [6]. Assume that there are sufficiently
many jobs and number them in a nondecreasing order of their storage require-
ments, i.e. $\omega_1 \le \dots \le \omega_n$. For each job j, replace its storage requirement $\omega_j$ by a new one (denoted $\alpha_j$) as follows. For each $1 \le e \le q - 1$ and each $(e - 1)k < j \le ke$, let $\alpha_j = \omega_{ke}$ and, for each $k(q - 1) < j \le n$, let $\alpha_j = \omega_n$.
Observe that any schedule for the problem with the new storage requirements
is feasible for the problem with the original storage requirements.
An optimal schedule for the new storage requirements can be constructed by
dynamic programming as follows. For 1 ≤ e ≤ q, let

$$\pi(e) = \begin{cases} ke & \text{if } 1 \le e < q,\\ n & \text{if } e = q, \end{cases}$$
and consider (q + 1)-tuples (n1 , ..., nq , i) such that (a) 0 ≤ ne ≤ k, for all 1 ≤
e ≤ q − 1, and 0 ≤ nq ≤ n − k(q − 1); and (b) 1 ≤ i ≤ q and ni > 0. Each
(q + 1)-tuple represents $n_1 + \dots + n_q$ jobs such that, for each $1 \le e \le q$, this set contains $n_e$ jobs j for which $\alpha_j = \omega_{\pi(e)}$.
For each (q + 1)-tuple (n1 , ..., nq , i), let F (n1 , ..., nq , i) be the minimal time
needed for completion of all jobs, corresponding to (n1 , ..., nq , i), under the con-
dition that the job with the largest completion time among these jobs is a job
with the new storage requirement ωπ(i) . Consequently, the optimal makespan is
$$C = \min_{1 \le i \le q} F(k, \dots, k, n - (q - 1)k, i).$$

The (q + 1)-tuples, satisfying the condition $n_1 + \dots + n_q = 1$, will be referred to as boundary (q + 1)-tuples. Then, $F(n_1, \dots, n_q, i) = 2$ for each boundary $(n_1, \dots, n_q, i)$. For any positive integer t, any $1 \le i \le q$ and any $1 \le e \le q$, let
$$W_{i,e}(t) = \begin{cases} 1 & \text{if } \omega_{\pi(i)} + \omega_{\pi(e)} \le \Omega(t)\\ 2 & \text{if } \omega_{\pi(i)} + \omega_{\pi(e)} > \Omega(t). \end{cases}$$

Then, the values of F for all (q + 1)-tuples that are not boundary are computed using the following recursive equations:
$$F(n_1, \dots, n_i + 1, \dots, n_q, i) = \min_{\{e:\ n_e > 0\}} \big[ F(n_1, \dots, n_q, e) + W_{i,e}\big(F(n_1, \dots, n_q, e) - 1\big) \big].$$

The dynamic programming algorithm above constructs an optimal schedule $\sigma$ in $O(q^2 n^q)$ time, and all that remains is to show that (4) holds.
Let σ ∗ be an optimal schedule for the problem with the original storage
requirements. This schedule can be converted into a schedule η for the problem
with the new storage requirements as follows. For each job j such that $1 \le j \le n - k$, let $C_j(\eta) = C_{j+k}(\sigma^*)$ and, for each job j such that $n - k < j$, let
$$C_j(\eta) = C_{\max}(\sigma^*) + 2(j - n + k).$$

Then, taking into account that $n < C_{\max}(\sigma^*)$,
$$C_{\max}(\sigma^*) \le C \le C_{\max}(\eta) \le C_{\max}(\sigma^*) + 2k \le C_{\max}(\sigma^*) + n\varepsilon \le (1 + \varepsilon)C_{\max}(\sigma^*),$$
which completes the proof. □
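A compact sketch of the rounding step and the dynamic program is given below. It is not taken from the paper: the rounding parameters follow the reconstruction $k = \lfloor n\varepsilon/2 \rfloor$, $q = \lceil 2/\varepsilon \rceil$ used above, and a small guard on q is added so that the groups never overshoot n jobs. The sketch returns only the optimal makespan for the rounded requirements; recovering the schedule itself would require standard backtracking over the recursion.

```python
import math
from functools import lru_cache

def ptas_makespan(omega, Omega, eps):
    """Rounding + dynamic programming sketch for the scheme of Sect. 3.
    omega -- storage requirements, Omega(t) -- buffer capacity in [t, t+1),
    eps -- accuracy parameter.  Returns the optimal makespan for the rounded
    storage requirements alpha_j."""
    n = len(omega)
    w = sorted(omega)                               # nondecreasing requirements
    k = max(1, int(n * eps / 2))                    # group size, roughly n*eps/2
    q = min(math.ceil(2 / eps), math.ceil(n / k))   # number of groups (guarded)
    pi = [k * e if e < q else n for e in range(1, q + 1)]   # group representatives
    target = tuple([k] * (q - 1) + [n - k * (q - 1)])       # jobs per group

    def W(i, e, t):
        # 1 if the representatives of groups i and e fit together in [t, t+1), else 2
        return 1 if w[pi[i] - 1] + w[pi[e] - 1] <= Omega(t) else 2

    @lru_cache(maxsize=None)
    def F(counts, i):
        # minimal completion time of the jobs described by `counts`, given that
        # the last-finishing job belongs to group i
        if sum(counts) == 1:
            return 2
        prev = list(counts)
        prev[i] -= 1
        prev = tuple(prev)
        return min(F(prev, e) + W(i, e, F(prev, e) - 1)
                   for e in range(q) if prev[e] > 0)

    return min(F(target, i) for i in range(q) if target[i] > 0)
```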

4 ILP Formulation
In this section, we formulate our problem as an integer linear program. Since we
assumed that ωj ≤ Ω(t) for all j and t, the schedule length never exceeds 2n.
Let T ≤ 2n be an upper bound on the optimum schedule length Cmax . As the
available buffer size may change only at integer moments, for any nonnegative
integer t the buffer size equals Ω(t) in the whole interval [t, t + 1). For each
j = 1, . . . , n and t = 0, . . . , T − 1 we define binary variables $x_{j,t}$ such that $x_{j,t} = 1$ if job j starts at time t, and $x_{j,t} = 0$ otherwise. The minimum
schedule length can be found by solving the following integer linear program.

$$\text{minimise } C_{\max} \qquad (5)$$
$$\sum_{j=1}^{n} \omega_j (x_{j,t} + x_{j,t-1}) \le \Omega(t) \quad \text{for } t = 1, \dots, T - 1 \qquad (6)$$
$$\sum_{j=1}^{n} x_{j,t} \le 1 \quad \text{for } t = 0, \dots, T - 1 \qquad (7)$$
$$\sum_{t=0}^{T-1} x_{j,t} = 1 \quad \text{for } j = 1, \dots, n \qquad (8)$$
$$\sum_{t=0}^{T-1} t\, x_{j,t} + 2 \le C_{\max} \quad \text{for } j = 1, \dots, n \qquad (9)$$
$$x_{j,t} \in \{0, 1\} \quad \text{for } j = 1, \dots, n,\ t = 0, \dots, T - 1 \qquad (10)$$

Constraints (6) guarantee that the jobs executed in interval [t, t + 1), where
1 ≤ t ≤ T − 1, fit in the buffer. Note that for t = 0 at most one job runs in interval [t, t + 1) and, since ωj ≤ Ω(t) for all j and t, the buffer limit is also observed. At most one job
starts at time t by (7), and each job starts exactly once by (8). Inequalities (9)
ensure that all jobs are completed by time Cmax .
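The formulation (5)–(10) translates directly into a solver model. The sketch below uses the gurobipy API (consistent with the Gurobi solver mentioned in Sect. 6); the function name and the representation of Ω as a Python callable are illustrative assumptions, not the authors' implementation.

```python
import gurobipy as gp
from gurobipy import GRB

def solve_ilp(omega, Omega, T):
    """Sketch of the ILP (5)-(10).  omega[j] is the storage requirement of job j,
    Omega(t) the buffer capacity during [t, t+1), T an upper bound (<= 2n) on
    the schedule length."""
    n = len(omega)
    m = gp.Model("F2-UET-buffer")
    x = m.addVars(n, T, vtype=GRB.BINARY, name="x")   # x[j, t] = 1 iff job j starts at t
    cmax = m.addVar(lb=0.0, name="Cmax")
    m.setObjective(cmax, GRB.MINIMIZE)                # (5)
    for t in range(1, T):                             # (6): jobs running in [t, t+1) fit in the buffer
        m.addConstr(gp.quicksum(omega[j] * (x[j, t] + x[j, t - 1])
                                for j in range(n)) <= Omega(t))
    m.addConstrs(gp.quicksum(x[j, t] for j in range(n)) <= 1 for t in range(T))            # (7)
    m.addConstrs(gp.quicksum(x[j, t] for t in range(T)) == 1 for j in range(n))            # (8)
    m.addConstrs(gp.quicksum(t * x[j, t] for t in range(T)) + 2 <= cmax for j in range(n)) # (9)
    m.optimize()
    starts = {j: t for j in range(n) for t in range(T) if x[j, t].X > 0.5}
    return m.ObjVal, starts
```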

5 Heuristic Algorithms

Although the ILP formulation proposed in the previous section delivers opti-
mum solutions for our problem, it is impractical for larger instances because of
its high computational complexity. Therefore, in this section we propose heuris-
tic algorithms. Each of them constructs a sequence in which the jobs should be
processed. The jobs are started without unnecessary delay, as soon as the previ-
ous job completes on the first machine and sufficient amount of buffer space is
available.
Algorithm LF constructs the job sequence using a greedy largest-fit rule. At each time unit, we start on the first machine the largest job that fits in the currently available buffer. If no such job can be found, the procedure continues after one time unit, when some buffer space has been released. This algorithm can be implemented to run in $O(n \log n)$ time using a self-balancing binary search tree. However, if n is not very large, a simple $O(n^2)$ implementation that does not use advanced data structures may be more practical, as sketched below.
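The following quadratic-time sketch illustrates the LF rule (it is not the authors' implementation). It assumes, as in the instances generated in Sect. 6, that every job fits in the buffer on its own, so the loop always terminates.

```python
def largest_fit(omega, Omega):
    """Greedy largest-fit rule: at every integer time unit, start on M1 the
    largest not-yet-started job that fits into the currently free buffer.
    Returns the start times and the makespan.  Assumes omega[j] <= Omega(t)
    for all j and t, so some job always fits once the buffer empties."""
    n = len(omega)
    unscheduled = sorted(range(n), key=lambda j: omega[j], reverse=True)
    start = [None] * n
    active = []                        # (release_time, job) of jobs holding buffer space
    t = 0
    while unscheduled:
        active = [(c, j) for (c, j) in active if c > t]    # drop finished jobs
        used = sum(omega[j] for (_, j) in active)
        for pos, j in enumerate(unscheduled):              # largest job first
            if used + omega[j] <= Omega(t):
                start[j] = t
                active.append((t + 2, j))                  # buffer held during [t, t+2)
                del unscheduled[pos]
                break
        t += 1
    return start, (max(s + 2 for s in start) if start else 0)
```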
Local search with neighborhoods based on job swaps proved to be a very
good method for obtaining high quality solutions for flow shop scheduling with
constant buffer space and non-unit operation execution times [3]. Therefore, we
also analyse algorithm LFLocal that starts with a schedule generated by LF,
and then applies the following local search procedure. For each pair of jobs, we
check if swapping their positions in the current sequence leads to decreasing the
schedule length. The swap that results in the shortest schedule is executed, and
the search is continued until no further improvement is possible.
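The swap-based local search can be sketched as follows. This is illustrative only; `sequence_length` stands for any routine (hypothetical here) that turns a job sequence into a schedule by the greedy timing rule described at the beginning of this section and returns its makespan.

```python
def swap_local_search(sequence, sequence_length):
    """Repeatedly apply the best improving swap of two positions until no swap
    shortens the schedule.  `sequence_length(seq)` evaluates a job sequence."""
    best = sequence_length(sequence)
    improved = True
    while improved:
        improved = False
        best_swap = None
        for a in range(len(sequence) - 1):
            for b in range(a + 1, len(sequence)):
                sequence[a], sequence[b] = sequence[b], sequence[a]
                length = sequence_length(sequence)
                sequence[a], sequence[b] = sequence[b], sequence[a]   # undo the trial swap
                if length < best:
                    best, best_swap, improved = length, (a, b), True
        if best_swap is not None:
            a, b = best_swap
            sequence[a], sequence[b] = sequence[b], sequence[a]       # keep the best swap
    return sequence, best
```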
Algorithm Rnd constructs a random job sequence in O(n) time. This algo-
rithm is used mainly to verify if the remaining heuristics perform well in com-
parison to what can be achieved without effort.
Algorithm RndLocal starts with a random job sequence, and then improves
it using the local search procedure described above. Analysing this heuristic shows what local search can achieve when started from a presumably low-quality solution.

6 Experimental Results
In this section, we compare the quality of the obtained solutions and the com-
putational costs of the proposed heuristics. The algorithms were implemented in
C++ and run on an Intel Core i7-7820HK CPU @ 2.90 GHz with 32 GB RAM.
Integer linear programs were solved using Gurobi. Due to limited space, we report
here only on a small subset of the obtained results. The generated instances and solutions can be found at http://berlinska.home.amu.edu.pl/datasets/F2-UET-buffer.zip.

Fig. 1. Results for n = 100 vs. δ (algorithms ILP, LF, LFLocal, Rnd, RndLocal). (a) Average quality, (b) average execution time.

Not all analysed instances could be solved to optimality using the ILP in
reasonable time. Therefore, we measure the schedule quality by the relative percentage error with respect to the lower bound computed by Gurobi when solving the ILP with a 1 h time limit. In most cases, this limit was enough to reach an optimal solution.
To illustrate this, in addition to the heuristic results, we also report on the qual-
ity of solutions returned by ILP after at most 1 h. For each analysed parameter
combination, we present the average results over 30 instances.
The test instances were generated as follows. In tests with n jobs, the buffer
requirements ωj were chosen randomly from the range [n, 2n]. Due to such a
choice of ωj range, the buffer requirements are diversified, but not very unbal-
anced. For a given δ, the available buffer space Ω(t) was chosen randomly from
the range $[\max_{j=1}^{n}\{\omega_j\},\ \delta \max_{j=1}^{n}\{\omega_j\}]$, for each t = 0, . . . , 2n − 1 independently.
On the one hand, when δ is very small, the instances may be easy to solve
because there are not many possibilities of job overlapping, and the optimum
schedules are long. On the other hand, if δ is very big, there exist many pairs
of jobs that fit in the buffer together, which can also make instances easier.
Therefore, we tested δ ∈ {1.0, 1.1, . . . , 2.0} for n = 100. The obtained results are
presented in Fig. 1. All instances with δ ∈ {1.0, 1.1, 2.0} were solved by ILP to
optimality within the imposed time limit. For each of the remaining values of δ,
there were some tests for which only suboptimal solutions could be found within
an hour. The instances with δ ∈ [1.3, 1.6] seem the most difficult, as the average
running time of ILP for these values is above 1000 s. In this set of instances, the
heuristic algorithms have the worst solution quality for δ = 1.6. Hence, in the
next experiment we use δ = 1.6, in order to construct demanding instances.
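For completeness, the instance generator described above can be sketched as follows. The text does not state whether the draws are integer or uniform real; integer draws are assumed here purely for illustration.

```python
import random

def generate_instance(n, delta, seed=None):
    """Random instance as described above: omega_j drawn from [n, 2n] and, for
    each t = 0, ..., 2n - 1 independently, Omega(t) drawn from
    [max omega_j, delta * max omega_j]."""
    rng = random.Random(seed)
    omega = [rng.randint(n, 2 * n) for _ in range(n)]
    w_max = max(omega)
    capacity = [rng.randint(w_max, int(delta * w_max)) for _ in range(2 * n)]
    # clamp queries beyond the horizon to the last generated value
    return omega, (lambda t: capacity[min(int(t), 2 * n - 1)])
```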
We analysed the performance of our heuristics for n = 10, 20, . . . , 100. The
quality of solutions delivered by the respective algorithms is presented in Fig. 2a.
All instances with n ≤ 40 were solved to optimality by the ILP algorithm in the
1 h time limit. In each remaining test group, there were several instances for
which the optimum solution was not found within this time. Still, the largest
average error of one-hour ILP, obtained for n = 90, is only 0.25%. As expected,
the worst results are obtained by algorithm Rnd. Algorithm RndLocal delivers
much better schedules, which shows that our local search procedure can improve
a poor initial solution. On the contrary, the differences between the results deliv-
ered by LF and LFLocal are very small. For most instances, the schedules deliv-
ered by LF and LFLocal are identical, because they are local optimums. The
quality of results returned by all algorithms gets worse with growing n. The
number of jobs has the strongest influence on algorithms Rnd and RndLocal. Its
impact on LF and LFLocal is much smaller, and the changes in the quality of
ILP results are barely visible. The quality of results produced by all algorithms
seems to level off for n ≥ 50.

Fig. 2. Results for δ = 1.6 vs. n (algorithms ILP, LF, LFLocal, Rnd, RndLocal). (a) Average quality, (b) average execution time.

The average execution times of the algorithms are shown in Fig. 2b. Naturally,
algorithms Rnd and LF, each of which generates only one job sequence, are the
fastest. The impact of n on their running times is very small, because they have
low computational complexity. Local search algorithms need more time, and are
affected by the growth of n. RndLocal is significantly slower than LFLocal. This is because, when starting from a random sequence, the local search has much more improving work to do, whereas in the case of LFLocal we usually need only 1 or 2 iterations of the search procedure. The ILP algorithm is the slowest, and its running time grows quickly with n.
All in all, the one-hour limited ILP returns optimum or near-optimum solutions, but at a relatively high computational cost. In our experiments with changing n, algorithm LF delivers schedules within 12% of the optimum on average.
The worst results were obtained by LF for the tests with δ = 2.0 (see Fig. 1a),
but the average error was still below 14%. The running time of LF is several
orders of magnitude lower than that of ILP even for small instances, and the dif-
ference between them increases with the growth of n. Therefore, ILP should only
be used when getting as close as possible to the optimum is more important than
the algorithm’s running time. For the cases when 10–15% error is acceptable, we
recommend using algorithm LF.

7 Conclusions
To the authors' knowledge, this is the first paper to study the two-machine flow shop with a dynamic storage space and job-dependent storage
requirements. For the case of UET operations, the paper presents a proof of the
NP-hardness in the strong sense and a polynomial-time approximation scheme,
together with an integer linear program and several heuristics, characterised
by the results of computational experiments. Future research should include a
worst-case analysis of approximation algorithms.

References
1. Berlińska, J.: Scheduling for data gathering networks with data compression. Eur. J. Oper. Res. 246, 744–749 (2015)
2. Berlińska, J.: Scheduling data gathering with maximum lateness objective. In:
Wyrzykowski, R. et al. (eds.) PPAM 2017, Part II. LNCS, vol. 10778, pp. 135–
144. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78054-2 13
3. Berlińska, J.: Heuristics for scheduling data gathering with limited base station
memory. Ann. Oper. Res. (2019). https://doi.org/10.1007/s10479-019-03185-3. In
press
4. Blażewicz, J., Kubiak, W., Szwarcfiter, J.: Scheduling unit-time tasks on flow-shops
under resource constraints. Ann. Oper. Res. 16, 255–266 (1988)
5. Blażewicz J., Lenstra, J.K., Rinnooy Kan, A.H.G.: Scheduling subject to resource
constraints: classification and complexity. Discret. Appl. Math. 5, 11–24 (1983)
6. Fernandez de la Vega, W., Lueker, G.S.: Bin packing can be solved within 1 + ε in
linear time. Combinatorica 1(4), 349–355 (1981)
7. Fung, J., Zinder, Y.: Permutation schedules for a two-machine flow shop with
storage. Oper. Res. Lett. 44(2), 153–157 (2016)
8. Garey, M.R., Johnson, D.S.: Computers and intractability: a guide to the theory
of NP-completeness. Freeman, San Francisco (1979)
9. Gu, H., Memar, J., Kononov, A., Zinder, Y.: Efficient Lagrangian heuristics for the
two-stage flow shop with job dependent buffer requirements. J. Discret. Algorithms
52–53, 143–155 (2018)
10. Luo, W., Xu, Y., Gu, B., Tong, W., Goebel, R., Lin, G.: Algorithms for communi-
cation scheduling in data gathering network with data compression. Algorithmica
80(11), 3158–3176 (2018)
11. Röck, H.: Some new results in no-wait flow shop scheduling. Z. Oper. Res. 28(1),
1–16 (1984)
12. Süral, H., Kondakci, S., Erkip, N.: Scheduling unit-time tasks in renewable resource
constrained flowshops. Z. Oper. Res. 36(6), 497–516 (1992)
Author Index

A Buisson, Martin, 981


Abdallah, Lina, 228 Bujok, Petr, 202
Abu, Kuandykov, 861, 1119
Aigerim, Bolshibayeva, 861, 882 C
Aiman, Moldagulova, 761, 842 Caillard, Simon, 1033
Akhtar, Taimoor, 672, 681 Candelieri, A., 751
Alharbi, Mafawez, 949 Cao, Hung-Phi, 740, 769
Alsyouf, Imad, 1078 Caristi, Giuseppe, 702
Altherr, Lena C., 916 Cen, Xiaoli, 135
Anton-Sanchez, Laura, 1013 Cexus, Jean-Christophe, 1097
Aoues, Younes, 991, 1001 Cheaitou, Ali, 1078
Arana-Jiménez, Manuel, 509 Chen, Boyu, 1067
Archetti, Francesco, 751 Climent, Laura, 617
Ashimov, Abdykappar, 850 Costa, M. Fernanda P., 16
Avraamidou, Styliani, 579
Aygerim, Aitim, 842 D
Da Silva, Gabriel, 893
B Dambreville, Frédéric, 1097
Bai, Hao, 991 Dan, Pranab K, 906
Ballo, Federico, 68 de Cursi, Eduardo Souza, 3, 238, 547, 557,
Barilla, David, 702 567, 991
Barkalov, Konstantin, 48 de Freitas Gomes, José Henrique, 600
Bassi, Mohamed, 547 de Oliveira, Welington, 957
Bednarczuk, Ewa M., 175 de Paiva, Anderson Paulo, 600
Bellahcene, Fatima, 279 de Paula, Taynara Incerti, 600
Benammour, Faouzi Mohamed, 341 Degla, Guy, 831
Bentobache, Mohand, 26 Delfino, Adriano, 477
Berenguer, Maria Isabel, 518 Demassey, Sophie, 957
Berlińska, Joanna, 1139 Deussen, Jens, 78
Bertozzi, Andrea L., 730 Devendeville, Laure Brisoux, 1033
Bettayeb, Maamar, 1078 Diaz, David Alejandro Baez, 1109
Bonvin, Gratien, 957 Dinara, Kozhamzharova, 1119
Borovskiy, Yuriy, 850 Ding, Wei, 468
Borowska, Bożena, 537 Doan, Xuan Vinh, 310
Buchheim, Christoph, 267 Dupin, Nicolas, 790


E I
Eddy, Foo Y. S., 247 Imanzadeh, S., 971
Einšpiglová, Daniela, 202 Imasheva, Baktagul, 820
Ellaia, Rachid, 3, 547
J
F Jarno, Armelle, 971
Fajemisin, Adejuyigbe, 617 Jemai, Zied, 1078
Fakhari, Farnoosh, 926 Jemmali, Mahdi, 949
Fampa, Marcia, 89, 267, 428 Ji, Sai, 488
Fenner, Trevor, 779 Jiang, Rujun, 145, 213
Fernandes, Edite M. G. P., 16 Jin, Zhong-Xiao, 1067
Fernández, José, 1013 Jouini, Oualid, 1078
Ferrand, Pascal, 981
Foglino, Francesco, 720 K
Frolov, Dmitry, 779 Kanzi, Nader, 702
Fuentes, Victor K., 89 Karpenko, Anatoly, 191
Fukuba, Tomoki, 937 Kasri, Ramzi, 279
Kassa, Semu Mitiku, 589
G Kaźmierczak, Anna, 128
G.-Tóth, Boglárka, 1013 Khalij, Leila, 567
Galán, M. Ruiz, 518 Khenchaf, Ali, 1097
Galuzzi, Bruno Giovanni, 751 Koliechkina, Liudmyla, 355
Gamez, Domingo, 518 Kononov, Alexander, 1139
Gao, Runxuan, 135 Korolev, Alexei, 398
Garmashov, Ilia, 398 Koudi, Jean, 831
Garralda–Guillem, A. I., 518 Kozinov, Evgeniy, 638
Gautrelet, Christophe, 567 Krishnan, Ashok, 247
Gawali, Deepak D., 58 Kronqvist, Jan, 448
Gergel, Victor, 638 Kulanda, Duisebekova, 1119
Ghaderi, Seyed Farid, 926 Kulitškov, Aleksei, 365
Ghosh, Tamal, 906 Kumar, Deepak, 257
Giordani, Ilaria, 751 Kumlander, Deniss, 365, 458
Gobbi, Massimiliano, 68
Goldberg, Noam, 871 L
Gomes, Guilherme Ferreira, 600 Le Thi, Hoai An, 289, 299, 320, 893, 1054
Granvilliers, Laurent, 99 Le, Hoai Minh, 893
Le, Thi-Hoang-Yen, 740
H Lebedev, Ilya, 48
Haddou, Mounir, 228 Lee, Jon, 89, 387, 438
Hamdan, Sadeque, 1078 Lefieux, Vincent, 893
Hartman, David, 119 Leise, Philipp, 916
Hennequin, Sophie, 1054, 1109 Lemosse, Didier, 567, 991, 1001
Henner, Manuel, 981 Leonetti, Matteo, 720
Hladík, Milan, 119 Li, Duan, 145, 213
Ho, Vinh Thanh, 1054 Li, Min, 488
Holdorf Lopez, Rafael, 238, 557 Li, Yaohui, 627
Homolya, Viktor, 109 Li, Zhijian, 730
Hu, Xi-Wei, 341 Liu, Wen-Zhuo, 330
Huang, Yaoting, 1067 Liu, Zhengliang, 691

Lu, Kuan, 611 Phan, Thuong-Cang, 740, 769


Lu, Wenlian, 1067 Pichugina, Oksana, 355
Lucet, Corinne, 1033 Pistikopoulos, Efstratios N., 579
Lucet, Yves, 257 Porošin, Aleksandr, 458
Lundell, Andreas, 448 Prestwich, Steven D., 417, 617
Luo, Xiyang, 730 Previati, Giorgio, 68

M Q
Ma, Ran, 1089 Qiu, Ke, 468
Martinsen, Kristian, 906
Melhim, Loai Kayed B., 949 R
Melo, Wendel, 428 Raissa, Uskenbayeva, 761, 810, 842, 861, 882
Migot, Tangi, 228 Rakhmetulayeva, Sabina, 861, 882, 1119
Mikitiuk, Artur, 407 Raupp, Fernanda, 428
Mirkin, Boris, 779 Redondo, Juana L., 1013
Mishra, Priyanka, 660 Regis, Rommel G., 37
Mishra, Shashi Kant, 182, 660 Rocha, Ana Maria A. C., 16
Mizuno, Shinji, 611 Rossi, Roberto, 417
Moeini, Mahdi, 1023, 1129 Roy, Daniel, 1054, 1109
Mohapatra, Ram N., 660 Ryoo, Hong Seo, 376
Mokhtari, Abdelkader, 26 Ryskhan, Satybaldiyeva, 842
Mukazhanov, Nurzhan K., 761
Mukhanov, S.B., 810 S
Muts, Pavlo, 498 Sadeghieh, Ali, 702
Myradov, Bayrammyrat, 526 Sagratella, Simone, 720
Sakharov, Maxim, 191
N Salewski, Hagen, 1023
Nakispekov, Azamat, 820 Samir, Sara, 299
Nascentes, Fábio, 238 Sampaio, Rubens, 238
Nascimento, Susana, 779 Sarmiento, Orlando, 267
Nataraj, Paluri S.V., 58 Sato, Tetsuya, 937
Naumann, Uwe, 78 Schermer, Daniel, 1129
Ndiaye, Babacar Mbaye, 831 Seccia, Ruggiero, 720
Nguyen, Duc Manh, 1097 Sergeev, Sergeĭ, 691
Nguyen, Huu-Quang, 221 Shahi, Avanish, 182
Nguyen, Phuong Anh, 289 Sheu, Ruey-Lin, 221
Nguyen, Viet Anh, 320 Shi, Jianming, 611
Nielsen, Frank, 790 Shiina, Takayuki, 937
Niu, Yi-Shuai, 330, 341 Shoemaker, Christine A., 672, 681
Nouinou, Hajar, 1054 Sidelkovskaya, Andrey, 820
Nowak, Ivo, 498 Sidelkovskiy, Ainur, 820
Nowakowski, Andrzej, 128 Simon, Nicolai, 916
Singh, Sanjeev Kumar, 182
O Singh, Vinay, 649
Onalbekov, Mukhit, 850 Skipper, Daphne, 387
Ortigosa, Pilar M., 1013 Speakman, Emily, 387
Subba, Mohan Bir, 649
P Sun, Jian, 1089
Pagnacco, Emmanuel, 547 Syga, Monika, 175
Parra, Wilson Javier Veloz, 1001
Patil, Bhagyesh V., 58, 247 T
Pelz, Peter F., 916 Taibi, S., 971
Perego, Riccardo, 751 Talbi, El-Ghazali, 790
Phan, Anh-Cang, 740, 769 Tarim, Armagan, 417

Tavakkoli-Moghaddam, R., 926 Xin, Jack, 730, 800


Telli, Mohamed, 26 Xu, Dachuan, 488, 713, 1089
Tohidifard, M., 926 Xu, Luze, 438
Tokoro, Ken-ichi, 937 Xu, Yicheng, 713
Toumi, Abdelmalek, 1097 Xue, Fanghui, 800
Tran, Bach, 893
Tran, Ho-Dat, 769 Y
Treanţă, Savin, 164 Yagouni, Mohammed, 299
Troian, Renata, 567 Yan, Kedong, 376
Trojanowski, Krzysztof, 407 Yang, Tianzhi, 135
You, Yu, 330, 341
U
Upadhyay, Balendu Bhooshan, 660 Z
Zagdoun, Ishy, 871
V Zălinescu, Constantin, 155
Vavasis, Stephen, 310 Zámečníková, Hana, 202
Vinkó, Tamás, 109 Zewde, Addis Belete, 589
Visentin, Andrea, 417 Zhang, Dongmei, 488, 713
Zhang, Hongbo, 991
W Zhang, Hu, 341
Wang, Bao, 730 Zhang, Xiaoyan, 1089
Wang, Shuting, 627 Zhang, Yong, 713
Wang, Wenyu, 681 Zhang, Yuanmin, 627
Wang, Yishui, 488 Zhang, Zebin, 981
Wang, Yong, 1043 Zheng, Ren, 1067
Wendt, Oliver, 1129 Zidani, Hafid, 3
Wu, Yizhong, 627 Zidna, Ahmed, 58, 247
Zinder, Yakov, 1139
X Zuldyz, Kalpeyeva, 842
Xia, Yong, 135, 221
