
Honglei Xu · Song Wang · Soon-Yi Wu
Editors

Optimization Methods, Theory and Applications
Editors
Honglei Xu
Department of Mathematics and Statistics
Curtin University
Perth, WA, Australia

Song Wang
Department of Mathematics and Statistics
Curtin University
Perth, WA, Australia

Soon-Yi Wu
Department of Mathematics
National Cheng Kung University
Tainan, Taiwan

ISBN 978-3-662-47043-5 ISBN 978-3-662-47044-2 (eBook)


DOI 10.1007/978-3-662-47044-2

Library of Congress Control Number: 2015942905

Springer Heidelberg New York Dordrecht London


© Springer-Verlag Berlin Heidelberg 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

Springer-Verlag GmbH Berlin Heidelberg is part of Springer Science+Business Media (www.springer.com)
Preface

The 9th International Conference on Optimization: Techniques and Applications
(ICOTA9) was held at the National Taiwan University of Science and Technology,
Taipei, during December 12–16, 2013. As a continuation of the ICOTA series, the
goal of the 9th ICOTA is to provide a forum for scientists, researchers, software
developers, and practitioners to exchange ideas and approaches, to present research
findings and state-of-the-art solution techniques, to share experiences on potentials
and limits, and to open new avenues of research and developments on all issues and
topics related to optimization and its applications.
This conference consisted of 2 keynote addresses, 13 plenary lectures, and
56 technical sessions, and this book contains 10 chapters on recent advances in
optimization and optimal control presented at the conference. Each of the chapters
was accepted after a stringent peer review process by at least two independent
reviewers to ensure that the works are of high quality.
Chapter 1 establishes a mathematical technique to analyze human walking behav-
ior using dynamic optimization. The method works well for a complex movement
that involves a change in the dynamics from single support phase to double support
phase. Numerical results can replicate human walking motions and calculate the
optimal joint torques to produce the resulting motions. Chapter 2 studies two opti-
mization problems related to a class of elliptic boundary value problems on smooth
bounded domains of $\mathbb{R}^N$. These optimization problems are formulated as solvable
minimum and maximum problems related to the rearrangements of given functions.
Chapter 3 proposes a multiobjective optimization method that supports agile and
flexible decision-making to handle complex and diverse decision environments.
Chapter 4 investigates the existence of solutions in connection with variational-like
hemivariational inequalities in reflexive Banach spaces. Conditions for the existence
of solutions of the variational-like hemivariational inequalities involving lower
semicontinuous set-valued maps are established. Chapter 5 develops an inertial
algorithm and proves its weak convergence for solving the split common fixed-point
problem for demicontractive mappings in Hilbert space. It provides an efficient way
to study the split common fixed-point problem. Chapter 6 investigates a class of
multiobjective optimization problems with inequality, equality, and vanishing
constraints. It shows that under mild assumptions, some constraint qualifications, such
as Cottle constraint qualification, Slater constraint qualification, and Mangasarian-
Fromovitz constraint qualification, are not satisfied. New Karush-Kuhn-Tucker-type
necessary optimality conditions are developed accordingly. Chapter 7 proposes a
new hybrid global optimization technique, where a gradient-based method with
BFGS update is combined with an Artificial Bee Colony, to solve an Archie
parameter estimation problem. This global optimization technique has both the
fast convergence of gradient descent algorithm and the global convergence of
swarm algorithm. Chapter 8 presents a mathematical methodology that optimally solves
an inverse mixing problem when both the composition of the source components
and the amount of each source component are unknown. The model is used for
analyzing longitudinal proton magnetic resonance spectroscopy ($^1$H MRS) data
gathered from the brains of newborn infants. It shows that the method can provide
more specific and accurate assessments of the brain cell types during early brain
development in neonates. It is also beneficial to study a wide range of physical
systems that involve mixing of unknown source components. Chapter 9 considers
the regularization problem of a nonlinear program. It examines inner connections
among exact regularization, normal cone identity, and the existence of a weak sharp
minimum for certain associated nonlinear programs. Finally, Chap. 10
considers an optimal design problem of a DFT filter bank subject to subchannel
variation constraints. The design problem is transformed to a minimax optimization
problem, which is equivalent to a semi-infinite optimization problem. Moreover,
a computational procedure is proposed to solve such a semi-infinite optimization
problem. Simulations and comparisons also show the effectiveness of the results.
We would like to thank the organizing institutions and sponsors of the conference
and this book, including the National Natural Science Foundation of China
(11171079, 11410301010) and Natural Science Foundation of Hubei Province
of China (2014CFB141). In editing this book, we have been assisted by many
voluntary colleagues, particularly the anonymous referees. Thus, we take this
opportunity to thank all the referees for their efforts and valuable comments. We
would also like to thank the authors of these chapters for their contributions and
patience. Last but not least, we would like to express our gratitude to the Springer
staff including Grace Guo, Emmie Yang, and Toby Chai for their professionalism
and help and to all who have, in one way or another, contributed to the publication
of this book.

Perth, WA, Australia    Honglei Xu
Perth, WA, Australia    Song Wang
Tainan, Taiwan    Soon-Yi Wu
Contents

1 Analysing Human Walking Using Dynamic Optimisation
Meiyi Tan, Leslie S. Jennings, and Song Wang
2 Rearrangement Optimization Problems Related to a Class of Elliptic Boundary Value Problems
Chong Qiu, Yisheng Huang, and Yuying Zhou
3 An Extension of the MOON2/MOON2R Approach to Many-Objective Optimization Problems
Yoshiaki Shimizu
4 Existence of Solutions for Variational-Like Hemivariational Inequalities Involving Lower Semicontinuous Maps
Guo-ji Tang, Zhong-bao Wang, and Nan-jing Huang
5 An Iterative Algorithm for Split Common Fixed-Point Problem for Demicontractive Mappings
Yazheng Dang, Fanwen Meng, and Jie Sun
6 On Constraint Qualifications for Multiobjective Optimization Problems with Vanishing Constraints
S.K. Mishra, Vinay Singh, Vivek Laha, and R.N. Mohapatra
7 A New Hybrid Optimization Algorithm for the Estimation of Archie Parameters
Jianjun Liu, Honglei Xu, Guoning Wu, and Kok Lay Teo
8 Optimization of Multivariate Inverse Mixing Problems with Application to Neural Metabolite Analysis
A. Tamura-Sato, M. Chyba, L. Chang, and T. Ernst
9 Exact Regularization, and Its Connections to Normal Cone Identity and Weak Sharp Minima in Nonlinear Programming
S. Deng
10 The Worst-Case DFT Filter Bank Design with Subchannel Variations
Lin Jiang, Changzhi Wu, Xiangyu Wang, and Kok Lay Teo
Chapter 1
Analysing Human Walking Using Dynamic
Optimisation

Meiyi Tan, Leslie S. Jennings, and Song Wang

Abstract A mathematical model is developed to simulate human walking motions and
to study the dynamics behind walking. The model is adjustable to accommodate
different cases such as the single and double support phases of walking. We first
propose a technique for estimating joint moments and position coordinates of body
segments using the method of inverse dynamics. The estimates are then used as
initial joint torques for solving the model as an optimal control problem with the
setup of appropriate objective functions and constraints. Numerical experiments on
the developed model and solution technique have been performed and the numerical
results show that the model is able to replicate human walking motions and the
optimal joint torques can be calculated to produce the resulting motions.

1.1 Introduction

The study of human motion has been of considerable interest in the field of
biomechanics. It provides detailed information to understand the human movements
that enable certain motions to be improved or made safer. The desire to understand
the mechanics behind walking has spurred the study of human locomotion (McGeer
1988) but due to its complex nature, modelling and understanding human walking
continues to be a challenging research problem in multibody systems (Hardt et al.
1999). Mathematical models can predict movement patterns, but they become very
complicated when they try to represent the human body and its movements as closely
as possible, because of the body’s complexity (Alexander 1996, 2003).
In order to replicate human walking motion realistically, a complete gait cycle,
comprising two consecutive steps, should be considered. Each step is made up
of two phases, namely the single support phase and the double support phase. The
single support phase occurs when one foot contacts the ground while the other leg is
swinging from rear to front, starting from the rear foot toe-off and ending when
the swinging foot lands on the ground with a heel strike. The double support phase
begins with the heel strike of the forward swing foot and ends with the toe-off of the
rear foot. As such, Bessonnet et al. (2004) described the single support phase as moving
like an open tree-like kinematic chain, while the double support phase is kinematically
closed and overactuated.

M. Tan • L.S. Jennings
School of Mathematics & Statistics, The University of Western Australia, 35 Stirling Highway,
Crawley, WA 6009, Australia
S. Wang (✉)
Department of Mathematics and Statistics, Curtin University, GPO Box U1987, Perth,
WA 6845, Australia
e-mail: Song.Wang@curtin.edu.au

© Springer-Verlag Berlin Heidelberg 2015
H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_1

The kinematic configuration of the model biped may change when passing from
one phase to the other during the collision of the foot with the ground, which
results in jump conditions on the velocities (Hardt et al. 1999). This may be one
reason why most early research considered only the single support phase or
assumed an instantaneous double support phase (Ren et al. 2007). Later research
considered an instantaneous double support phase for walking simulations,
followed by studies which considered a complete step consisting of
both the single support and double support phases (Xiang et al. 2010). In addition,
the foot segment was often neglected or assumed to be flat on the floor during
stance.
In order to achieve realistic human walking motions, both phases are necessary
and should be incorporated in the model. Modelling a biped with feet will allow
the modelling of the double support phase, from swing heel strike to stance toe off.
Hardt et al. (1999) suggested that feet introduce ankle torques and the
lift-off force that is produced as the heel comes off the ground. They also suggested
that, at higher speeds, the biped cannot walk effortlessly without
a foot. The addition of ankle actuation generated a smoother walking motion and
allowed torque inputs at hip to be distributed to the knees and ankles. Another
advantage of having a foot as a segment was the ability to distribute center of
pressure from the rear to the front of the foot during ground contact (Bullimore
and Burn 2006).
The double support phase of the gait cycle is deemed a common example of a
closed-loop problem. In a closed-loop model such as the double support phase, the
number of actuating torques is usually more than the number of degrees of freedom,
presenting a redundancy problem for inverse dynamics and control applications to
resolve (Ünver et al. 2000).
The inverse dynamics problem of biomechanics has been the most common
method used to estimate muscle forces during locomotion (Anderson and Pandy
2001) since it is computationally inexpensive and solutions can be obtained
relatively quickly on single-processor computers. However, Marshall (1985) and
Selles et al. (2001) both acknowledged that a major problem in the inverse dynamics
approach is the need for numerical differentiation of potentially noisy position data.
Even though inverse dynamics has been commonly used to estimate joint torques
during locomotion (Anderson and Pandy 2001), it produces poor results in the
presence of noisy measurements (Kuo 1998). In addition, to model a complete gait
cycle, both single and double support phases have to be considered. However, the
redundancy problem of the double support phase leads to a state of indeterminacy
unless ground reaction forces are known. The inverse dynamics method is not able
to directly solve the equations of motion for that phase, as seen in the model of
Ren et al. (2007).
The dynamic optimisation method integrates the equation of motion during an
optimisation process to simulate motions and solve for the optimal joint forces of
the model (Chow and Jacobson 1971; Anderson and Pandy 2001; Pandy 2001). Esti-
mating the trajectories of joint torques using this method is practical as the method
applies a forward simulation to reproduce a best observed motion (Chao and Rim
1973). A major disadvantage of dynamic optimisation is that it is more expensive
computationally (Yamaguchi and Zajac 1990), and hence has led to solutions for
walking being greatly simplified. A large amount of computation is required to
compute the trajectory of joint torques when using dynamic optimisation due to
the choice of initial guesses of torque value and the mathematical sophistication
required to understand the technique (Koh and Jennings 2003). As a general rule,
dynamic optimisation requires an initial estimate of the torque trajectories to
start the optimisation. Koh (2001) used the conventional method of inverse
dynamics to find the torque trajectories as initial estimates, to speed up
the convergence of the numerical process (Chao and Rim 1973). Koh thereby
combined the two methods, applying the inverse dynamics method
to determine the initial torque estimates to be used in the dynamic optimisation
method.
Most research tends to avoid modelling the double support phase because, being a
closed-loop problem, it produces a redundancy problem and complicates the modelling.
Studies that consider modelling both single and double support phases have chosen
to employ the method of inverse dynamics and optimisation to solve the system of
both phases, and had two different algorithms to deal with each phase. The purpose
of this study is to develop a generalised model applying optimal control theory, to
simulate normal walking motion through both the single support and double support
phases, and to gain insight into the mechanics that are involved in the overall motion.
Since normal walking can be assumed to have symmetric and cyclic characteristics,
only one step of the gait cycle needs to be modelled (Hardt et al. 1999; Anderson and
Pandy 2001; Xiang et al. 2009) with appropriate semi-periodic boundary conditions.
A combination of both inverse dynamics method and dynamic optimisation method
will be used to solve the equations of motion to model the human motion. Joint
torques and forces obtained will be studied to understand the mechanics behind
certain movements.
One major contribution is to have one model able to solve for one step cycle in
walking. The model formulated in this study can be adjusted according to cases,
hence the single support and double support phases can be solved for individually
or together. As the aim of the study is to simulate walking for one step cycle, both
phases are considered and solved.
An advantage the study brings is the application of the conventional inverse
dynamics method to obtain initial estimates of joint torques to be used in dynamic
optimisation, which is the approach taken in this study to produce the observed
motion and improve joint torque estimates. An initial joint torque “guess” from
applying the inverse dynamics method can reduce the computation time needed to compute
joint torque trajectories. In addition, dynamic optimisation solves the problem by
integrating the equation of motion forward in time, so simulation is performed
in a manner consistent with the development of motion in humans, allowing the
researcher to evaluate the effects of changes in ‘muscle activity’ on the outcome of
the movement.
The rest of this paper is organised as follows. In the next section, we set
up the geometry and the mathematical formulation of the walking model. In Sect. 1.3, we
establish the objective function and its corresponding constraints, and discuss the
inverse analysis used to obtain the initial joint torque estimates. In Sect. 1.4, we present
the analysis of numerical experiments to confirm that the formulated mathematical
model replicates human walking motions.

1.2 Model Development

1.2.1 Geometry

We formulate a mathematical model for an efficient and versatile computation of
the forward dynamics for a two-dimensional (2D) system of planar linked rigid
bodies. The formulation of the model was adopted from Koh (2001), whose study
was to optimise performance on a Yurchenko vault. It is similar to Kuo (1998) except
with more formalised and simplified notations with respect to the topology of the
model. A link segment model was used to represent the human body (Fig. 1.2). The
seven link segments ($n = 7$) representing the human body were assumed to move
in the 2D sagittal plane. The head, arms and trunk (HAT) were
represented by one segment, assuming that the arms did not swing so excessively (Cavagna
et al. 1976; Dean 1965) as to change the position of the centre of mass (CoM) of the HAT.
Each of the feet was represented by a fixed triangle of appropriate shape and the
forward point of the triangle was placed about 0.04 m in front of the metatarsal joint
to partially compensate for the toe action during the later stages of pushoff. Joints
between segments represent the ankles, knees and hip. The link segment model is
similar to that used in Onyshko and Winter (1980), which is deemed to provide a
good compromise between complexity and the accurate representation of the real
situation. An increase in the number of segments rapidly increases the complexity of
the resulting equations of motion, whereas using fewer than seven segments greatly
reduces the accuracy of the model.
Anthropometric data for the model, including segment masses and centre of
mass (CoM) positions were determined by using the anthropometric proportions
and regression equations given by Winter (2009). As we did not use any subjects,
we assumed an adult height of 1.7 m and weight of 65 kg. We assumed each segment
to be similar to a rod and calculated moments of inertia of each segment based on
the equation used to calculate moments of inertia of a rod, with the axis of rotation
at the CoM of the segment.
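The rod approximation can be sketched as follows. This is a minimal illustration, assuming a uniform thin rod rotating about its midpoint, for which the moment of inertia is $m l^2 / 12$. The mass and length fractions below are rounded placeholder values chosen only for illustration; they are not the proportions or regression equations of Winter (2009), and the names `SEGMENT_FRACTIONS` and `segment_properties` are hypothetical.

```python
BODY_MASS = 65.0    # kg, the adult weight assumed in the text
BODY_HEIGHT = 1.7   # m, the adult height assumed in the text

# Hypothetical mass and length fractions, for illustration only.
SEGMENT_FRACTIONS = {
    "foot":  {"mass": 0.015, "length": 0.15},
    "shank": {"mass": 0.045, "length": 0.25},
    "thigh": {"mass": 0.100, "length": 0.25},
}


def rod_inertia_about_com(mass, length):
    """Moment of inertia of a uniform thin rod about an axis through its midpoint."""
    return mass * length ** 2 / 12.0


def segment_properties(name):
    """Return (mass, length, moment of inertia) for one named segment."""
    frac = SEGMENT_FRACTIONS[name]
    mass = frac["mass"] * BODY_MASS
    length = frac["length"] * BODY_HEIGHT
    return mass, length, rod_inertia_about_com(mass, length)


if __name__ == "__main__":
    for name in SEGMENT_FRACTIONS:
        m, l, inertia = segment_properties(name)
        print(f"{name}: m = {m:.3f} kg, l = {l:.3f} m, I = {inertia:.5f} kg m^2")
```

Note that a uniform rod has its CoM at the midpoint, whereas the model allows a general CoM distance $r_i$ from the proximal end, so this formula is only the simplest reading of the rod assumption.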
The $i$th segment has length $l_i$, mass $m_i$ and moment of inertia $I_i$ about its centre
of mass (CoM) (within the segment), which is a distance $r_i$ from the proximal end $(x_i^p, y_i^p)$
and a distance $l_i - r_i$ from the distal end $(x_i^d, y_i^d)$ (Fig. 1.1). All segments are labelled
to have a proximal and distal end. The proximal end of a segment is situated closest
to point of contact while the distal end of a segment is situated furthest from point
of contact. In this case, we have considered the point of contact to be the ball of the
stance foot. There is a global coordinate system, XOY, which has an origin fixed in
an infinite mass “ground”. Each segment’s position is known from its CoM $(x_i, y_i)$
and the angle $\theta_i$ that the segment makes with the positive x-axis. The CoM of the
whole body, (X, Y), is given by

\[
\begin{pmatrix} X \\ Y \end{pmatrix}
= \frac{1}{M}
\begin{pmatrix} \sum_{i=1}^{n} m_i x_i \\ \sum_{i=1}^{n} m_i y_i \end{pmatrix},
\qquad \text{where } M = \sum_{i=1}^{n} m_i .
\]

Fig. 1.1 An ith segment diagram


Fig. 1.2 Seven-segment model, with the walk direction indicated. Segment 1: stance foot, with proximal end $(x_1^p, y_1^p)$; Segment 2: stance shank; Segment 3: stance thigh; Segment 4: swing thigh; Segment 5: swing shank; Segment 6: swing foot, with distal end $(x_6^d, y_6^d)$; Segment 7: head, arms and trunk

For a 2D connected body, knowing all $\theta_i$, $i = 1, \ldots, n$, and either one segment’s
positional coordinate or the CoM (X, Y) gives the position of every segment.
Defining $(x_1^p, y_1^p)$ as the coordinates of the proximal end of the first segment
(Segment 1) allows a convenient and concise way to describe the positions of the
CoM of the segments, as this point is fixed throughout the movement. The positional
equations for a chain of segments (with a multiple branch on joint 4 where Segment
3, Segment 4 and Segment 7 join, see Fig. 1.2) are

\[
x = x_1^p e + L D_c e, \qquad y = y_1^p e + L D_s e,
\]

where $x = (x_1, \ldots, x_n)^t$, $y = (y_1, \ldots, y_n)^t$, $e = (1, 1, \ldots, 1)^t$. Here, $(x_i, y_i)$ denotes
the position of CoM for segment $i$, where $i = 1, \ldots, n$, $n$ denoting the number of
segments,

\[
D_c = \mathrm{diag}(\cos\theta_1, \ldots, \cos\theta_n), \qquad
D_s = \mathrm{diag}(\sin\theta_1, \ldots, \sin\theta_n)
\]

and

\[
L = \begin{bmatrix}
r_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
l_1 & r_2 & 0 & 0 & 0 & 0 & 0 \\
l_1 & l_2 & r_3 & 0 & 0 & 0 & 0 \\
l_1 & l_2 & l_3 & r_4 & 0 & 0 & 0 \\
l_1 & l_2 & l_3 & l_4 & r_5 & 0 & 0 \\
l_1 & l_2 & l_3 & l_4 & l_5 & r_6 & 0 \\
l_1 & l_2 & l_3 & 0 & 0 & 0 & r_7
\end{bmatrix}.
\]

The position of CoM of the whole system (MX, MY) can likewise be written in
matrix-vector form as $MX = m^t x$ and $MY = m^t y$, where $m^t = (m_1, \ldots, m_n)$. Hence
the relations, using $m^t e = M$,

\[
MX = m^t x = M x_1^p + m^t L D_c e \quad \text{and} \quad MY = m^t y = M y_1^p + m^t L D_s e.
\]

The distal end of Segment 6 of the chain of segments has coordinates

\[
x_6^d = x_1^p + \sum_{i=1}^{6} l_i \cos\theta_i \quad \text{and} \quad
y_6^d = y_1^p + \sum_{i=1}^{6} l_i \sin\theta_i .
\]

The segments are ordered so that $L$ is lower triangular despite having multiple
branches.
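The positional equations can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, lists are indexed from 0 (so `theta[0]` is $\theta_1$), and the matrix $L$ is hard-wired to the seven-segment layout with the HAT branching off at the hip.

```python
import math


def build_L(l, r):
    """Lower-triangular matrix L of the seven-segment model, as a list of rows."""
    n = 7
    L = [[0.0] * n for _ in range(n)]
    for i in range(6):              # Segments 1-6 form a serial chain
        for k in range(i):
            L[i][k] = l[k]
        L[i][i] = r[i]
    for k in range(3):              # Segment 7 (HAT) starts at the hip, after Segments 1-3
        L[6][k] = l[k]
    L[6][6] = r[6]
    return L


def com_positions(theta, l, r, x1p=0.0, y1p=0.0):
    """Segment CoM coordinates: x = x1p*e + L*Dc*e, y = y1p*e + L*Ds*e."""
    L = build_L(l, r)
    x = [x1p + sum(L[i][k] * math.cos(theta[k]) for k in range(7)) for i in range(7)]
    y = [y1p + sum(L[i][k] * math.sin(theta[k]) for k in range(7)) for i in range(7)]
    return x, y


def whole_body_com(m, x, y):
    """Mass-weighted mean of the segment CoMs, (X, Y)."""
    M = sum(m)
    X = sum(mi * xi for mi, xi in zip(m, x)) / M
    Y = sum(mi * yi for mi, yi in zip(m, y)) / M
    return X, Y


def swing_toe(theta, l, x1p=0.0, y1p=0.0):
    """Distal end of Segment 6: the sum of l_i*cos(theta_i), l_i*sin(theta_i), i = 1..6."""
    xd = x1p + sum(l[i] * math.cos(theta[i]) for i in range(6))
    yd = y1p + sum(l[i] * math.sin(theta[i]) for i in range(6))
    return xd, yd
```

With every segment vertical ($\theta_i = \pi/2$), unit lengths and $r_i = 0.5$, the CoM heights of the serial chain step up by one unit per segment and the HAT CoM sits half a unit above the hip, which is a quick sanity check on the branch row of $L$.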

1.2.2 The Topology

To describe the topology of the body, each joint is labelled with a number, $k$, $k = 1, \ldots, j$
where $j = n + 1$ for a body with no loops, in other words, a tree-structured
body. Some of these joints will be in contact with the ground while some are free
or constrained on a curve. This allows for externally applied forces on any joint, in
particular, from the infinite mass ground. The proximal incidence matrix is a $j \times n$
matrix $A^p$ where

\[
A^p_{ki} = \begin{cases}
1, & \text{if segment } i \text{ has proximal end at joint } k, \\
0, & \text{otherwise.}
\end{cases}
\]

The distal incidence matrix $A^d$ is similarly defined,

\[
A^d_{ki} = \begin{cases}
1, & \text{if segment } i \text{ has distal end at joint } k, \\
0, & \text{otherwise.}
\end{cases}
\]

These two matrices define the topology of the body and $A^p + A^d$ defines the vertex-
arc, or joint-directed segment incidence matrix of a digraph (directed graph). The
joint-external contact incidence matrix $B$ ($j \times e$) is defined as

\[
B_{ki} = \begin{cases}
1, & \text{if joint } k \text{ contacts the ground at external contact } i, \\
0, & \text{otherwise,}
\end{cases}
\]

where e is the number of external contacts. The possible contacts considered are at
joints numbered 1, 2, 6 and 7 which are the toes and heels of the body, where for
model simplicity, the heels are a rigid extension of the ankles.
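As a sketch of how the incidence matrices might be assembled for the seven-segment model, the snippet below assumes a joint numbering inferred from the text: joint 1 is the stance toe, joint 2 the stance ankle (heel), joint 3 the stance knee, joint 4 the hip, joint 5 the swing knee, joint 6 the swing ankle (heel) and joint 7 the swing toe. Joint 8, the free end of the HAT segment, is an assumption made here so that $j = n + 1 = 8$; the dictionary and function names are hypothetical.

```python
N_SEGMENTS = 7
N_JOINTS = 8   # j = n + 1 for a tree-structured body

# segment number -> joint number (1-based, assumed numbering described above)
PROXIMAL_JOINT = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 4}
DISTAL_JOINT = {1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8}


def incidence_matrix(end_joint):
    """Build a j x n 0/1 matrix with a 1 where segment i has the given end at joint k."""
    A = [[0] * N_SEGMENTS for _ in range(N_JOINTS)]
    for segment, joint in end_joint.items():
        A[joint - 1][segment - 1] = 1
    return A


A_p = incidence_matrix(PROXIMAL_JOINT)
A_d = incidence_matrix(DISTAL_JOINT)

if __name__ == "__main__":
    # The hip (joint 4) row shows the multiple branch: Segments 4 and 7 start there,
    # while Segment 3 ends there.
    print("A_p, joint 4:", A_p[3])
    print("A_d, joint 4:", A_d[3])
```

Each column of `A_p` and of `A_d` contains exactly one 1, since every segment has one proximal and one distal end.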
Suppose we have a joint $k$, which has two proximal ends, namely $i$ and $b$, incident
on it and one distal end $a$ incident on it. An external force $f_k^e$ acts on the joint. The
reaction forces on segment $i$ come from joints $k$ proximal and $(k + 1)$ distal and are
denoted $f_i^p$ and $f_i^d$ for the proximal and distal forces respectively. It is these forces
which supply the rotational and translational motions to segment $i$ and are given by

\[
f_k^e = \begin{pmatrix} f_k^{ex} \\ f_k^{ey} \end{pmatrix}, \qquad
f_i^p = \begin{pmatrix} f_i^{px} \\ f_i^{py} \end{pmatrix}, \qquad
f_i^d = \begin{pmatrix} f_i^{dx} \\ f_i^{dy} \end{pmatrix}.
\]

The balance of forces at any joint $k$ takes the form, “the sum of all reaction forces
at joint $k$ equals the external force at joint $k$”. From Fig. 1.3,

\[
\begin{pmatrix} f_i^{px} \\ f_i^{py} \end{pmatrix}
+ \begin{pmatrix} f_b^{px} \\ f_b^{py} \end{pmatrix}
+ \begin{pmatrix} f_a^{dx} \\ f_a^{dy} \end{pmatrix}
+ \begin{pmatrix} f_k^{ex} \\ f_k^{ey} \end{pmatrix} = 0.
\]

Fig. 1.3 Forces on joint k

Using the incidence matrices, these equations for all joints can be combined into
matrix equations:

\[
A^p f^{px} + A^d f^{dx} + B(f^{ex}) = 0,
\]
\[
A^p f^{py} + A^d f^{dy} + B(f^{ey}) = 0,
\]

where $f^{px}$ is a vector of proximal x-component reaction forces for each segment
and $f^{dx}$ is the corresponding distal x-component reaction forces, similarly the y-
components. The vectors $f^{ex}$ and $f^{ey}$ are the joint external forces. With
$A = [A^p \mid B \mid A^d]$, the two equations above can be simplified to

\[
A f^x = 0, \qquad A f^y = 0,
\]

where the proximal, distal and external forces are ordered such that

\[
f^x = \begin{bmatrix} f^{px} \\ f^{ex} \\ f^{dx} \end{bmatrix}, \qquad
f^y = \begin{bmatrix} f^{py} \\ f^{ey} \\ f^{dy} \end{bmatrix}.
\]

When there are no external contacts at any joint, the external forces are zero.

1.2.3 Translational Equations of Motion

The translational equations of motion for an n-segment model can be derived by
differentiating the equations of positions of the CoM of each segment twice to obtain
the acceleration, multiplying by the mass and equating to the forces on each segment.
Let $\dot{x}_1^p = u_1$, so $\ddot{x}_1^p = \dot{u}_1$, and $\dot{y}_1^p = v_1$, $\ddot{y}_1^p = \dot{v}_1$. The translational equations can
be written as

\[
m \dot{u}_1 + J^y \dot{\omega} + S f^x = J^x \omega^2,
\]
\[
m \dot{v}_1 - J^x \dot{\omega} + S f^y = g m - J^y \omega^2 .
\]

To cater for both contact and free flight dynamics, where free flight dynamics is
described when there is no external contact to the ground, during contact at $(x_1^p, y_1^p)$,
$(\dot{u}_1, \dot{v}_1)$ is zero and non-zero when not in contact. Hence the translational equations
during contact are:

\[
J^y \dot{\omega} + S f^x = J^x \omega^2,
\]
\[
-J^x \dot{\omega} + S f^y = g m - J^y \omega^2 .
\]

In the case of non-contact, we would require the dynamic equations, $\ddot{x}_1^p = \dot{u}_1$ and
$\ddot{y}_1^p = \dot{v}_1$, to be included.
We have defined: $\theta = (\theta_1, \ldots, \theta_n)^t$, $\omega = \dot{\theta}$, $\dot{\omega} = \ddot{\theta}$, $\omega^2 = (\omega_1^2, \ldots, \omega_n^2)^t$,
$S = [I_n, 0, I_n]$.
Note that $J^x = D_m L D_c$ and $J^y = D_m L D_s$, where $D_m = \mathrm{diag}(m)$.
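The kinematic step behind these equations can be checked numerically. Differentiating the positional equations once gives $\dot{x} = u_1 e - L D_s \omega$ and $\dot{y} = v_1 e + L D_c \omega$; this is our own differentiation of the stated position formulas, not a formula quoted from the chapter. The sketch below compares it with a central finite difference and forms $J^x$ and $J^y$ entrywise. All function names are hypothetical and the numerical values in the usage section are arbitrary.

```python
import math


def build_L(l, r):
    """Lower-triangular matrix L of the seven-segment model, as a list of rows."""
    L = [[0.0] * 7 for _ in range(7)]
    for i in range(6):
        L[i][:i] = l[:i]
        L[i][i] = r[i]
    L[6][:3] = l[:3]        # HAT branches off at the hip
    L[6][6] = r[6]
    return L


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def com_positions(theta, L, x1p, y1p):
    x = [x1p + s for s in matvec(L, [math.cos(t) for t in theta])]
    y = [y1p + s for s in matvec(L, [math.sin(t) for t in theta])]
    return x, y


def com_velocities(theta, omega, L, u1, v1):
    """x_dot = u1*e - L*Ds*omega and y_dot = v1*e + L*Dc*omega."""
    xd = [u1 - s for s in matvec(L, [math.sin(t) * w for t, w in zip(theta, omega)])]
    yd = [v1 + s for s in matvec(L, [math.cos(t) * w for t, w in zip(theta, omega)])]
    return xd, yd


def mass_weighted_matrices(m, L, theta):
    """J^x = Dm*L*Dc and J^y = Dm*L*Ds, formed entry by entry."""
    n = len(m)
    Jx = [[m[i] * L[i][k] * math.cos(theta[k]) for k in range(n)] for i in range(n)]
    Jy = [[m[i] * L[i][k] * math.sin(theta[k]) for k in range(n)] for i in range(n)]
    return Jx, Jy


if __name__ == "__main__":
    l = [0.2, 0.4, 0.45, 0.45, 0.4, 0.2, 0.6]        # arbitrary illustrative lengths
    r = [0.1, 0.2, 0.2, 0.25, 0.2, 0.1, 0.3]
    theta = [0.3, 1.2, 1.5, -1.4, -1.6, -0.2, 1.6]
    omega = [0.5, -0.3, 0.8, 1.1, -0.7, 0.2, 0.1]
    L, h = build_L(l, r), 1e-6
    plus, _ = com_positions([t + h * w for t, w in zip(theta, omega)], L, 0.0, 0.0)
    minus, _ = com_positions([t - h * w for t, w in zip(theta, omega)], L, 0.0, 0.0)
    xd, _ = com_velocities(theta, omega, L, 0.0, 0.0)
    print(max(abs((p - q) / (2 * h) - v) for p, q, v in zip(plus, minus, xd)))
```

Multiplying the velocities by the segment masses gives $D_m \dot{x} = m u_1 - J^y \omega$, which is where the matrices $J^x$ and $J^y$ enter once a second differentiation is carried out.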

1.2.4 Rotational Equations of Motion

For the general case, the moment equations of each segment i about the segment
CoM are given by
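\[
I_i \dot{\omega}_i = \tau_i^p + \tau_i^d + f_i^{px} r_i \sin\theta_i - f_i^{dx} (l_i - r_i) \sin\theta_i
- f_i^{py} r_i \cos\theta_i + f_i^{dy} (l_i - r_i) \cos\theta_i ,
\]

where $i = 1$ to $n$. The moment equation can similarly be expressed in matrix form as

\[
J \dot{\omega} + M_x f^x + M_y f^y = T \tau,
\]

where $J = \mathrm{diag}(I_1, I_2, \ldots, I_n)$,

\[
M_x = D_s \left[\, -D_r \mid 0 \mid D_l - D_r \,\right],
\]
\[
M_y = D_c \left[\, D_r \mid 0 \mid -(D_l - D_r) \,\right],
\]

$D_r = \mathrm{diag}(r_1, r_2, \ldots, r_n)$, $D_l = \mathrm{diag}(l_1, l_2, \ldots, l_n)$.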
The external forces do not appear explicitly in these equations as they act on the
joints and transfer the forces to the segments attached to the joint.
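A small sketch may help in reading the moment equation for a single segment. It evaluates the right-hand side for given end forces and torques and compares it with the planar cross product $r_x F_y - r_y F_x$ taken about the CoM, with the proximal end at $-r_i(\cos\theta_i, \sin\theta_i)$ and the distal end at $(l_i - r_i)(\cos\theta_i, \sin\theta_i)$ relative to the CoM. The function names are hypothetical and the cross-product check is our own illustration.

```python
import math


def segment_moment(theta, l, r, f_p, f_d, tau_p=0.0, tau_d=0.0):
    """Right-hand side of the moment equation for one segment about its CoM."""
    fpx, fpy = f_p
    fdx, fdy = f_d
    s, c = math.sin(theta), math.cos(theta)
    return (tau_p + tau_d
            + fpx * r * s - fdx * (l - r) * s
            - fpy * r * c + fdy * (l - r) * c)


def cross_product_moment(theta, l, r, f_p, f_d, tau_p=0.0, tau_d=0.0):
    """The same moment from lever arms: sum of (rx*Fy - ry*Fx) plus the joint torques."""
    s, c = math.sin(theta), math.cos(theta)
    prox = (-r * c, -r * s)                  # proximal end relative to the CoM
    dist = ((l - r) * c, (l - r) * s)        # distal end relative to the CoM
    moment = tau_p + tau_d
    for (rx, ry), (fx, fy) in ((prox, f_p), (dist, f_d)):
        moment += rx * fy - ry * fx
    return moment


def angular_acceleration(inertia, moment):
    """omega_dot_i = moment / I_i."""
    return moment / inertia


if __name__ == "__main__":
    args = (0.7, 0.4, 0.17, (3.0, -2.0), (-1.0, 5.0))
    print(segment_moment(*args), cross_product_moment(*args))
```

For a horizontal segment ($\theta_i = 0$), an upward force at the proximal end gives a negative (clockwise) moment, as expected from the $-f_i^{py} r_i \cos\theta_i$ term.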
The vector $\tau$ is a vector of the proximal torques appropriately ordered. These
torques can be considered as making the angle between the segments larger or
smaller. It is possible for a torque to act on a segment separated by other segments.
In this case, we assume that only one torque acts between two segments about a
common joint and that the torques are given functions of time.
A consistent notation to specify the torques needs to be established and is given
by the matrix T, which is a matrix of zeros, ones and negative ones that describes
which segments have torques acting on them. A positive torque on the distal end of a
segment contributes a negative angular acceleration to the segment, while a positive
torque on the proximal end of a segment contributes a positive angular acceleration.
This holds up to Segment 6, because Segment 7 is part of a multiple branch at joint
4, together with Segment 3 and Segment 4 (Fig. 1.2). At joint 4 (hip joint), we have
a torque between Segment 3 and Segment 7, and another between Segment 4 and
Segment 7, which will be opposite to each other.
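The matrix T with a torque acting between Segment 6 and the external world is
given by

\[
T = \begin{array}{c|rrrrrrr}
 & \tau_1 & \tau_2 & \tau_3 & \tau_4 & \tau_5 & \tau_6 & \tau_7 \\ \hline
\text{seg1} & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
\text{seg2} & 1 & -1 & 0 & 0 & 0 & 0 & 0 \\
\text{seg3} & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
\text{seg4} & 0 & 0 & -1 & 0 & 0 & 0 & 1 \\
\text{seg5} & 0 & 0 & 1 & -1 & 0 & 0 & 0 \\
\text{seg6} & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
\text{seg7} & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{array}
\]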

Note that the sum of each column and each row has to be zero, with the exception of
$\tau_5$ and Segment 6 in the case of one contact at Segment 1. $\tau_1$ acts between Segment
1 and the external world and there is no torque between Segment 6 and the external
world.

1.2.5 Dynamics of the Model

The equations for the second order variables and forces are looked at, taking account
of variables which are zero for a time interval.
The complete equations where eight rows of the equations are included or not,
to specify the cases, are presented. Hence all dynamic variables and all forces are
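included, whether zero or not. But first, the two row vectors $l_s^t$ and $l_c^t$ are defined so
as to relate distal Segment 6 (toe, or joint 7) to proximal Segment 1 (toe or joint 1),
or proximal Segment 2 (ankle (heel) or joint 2),

\[
{}_1l^t = [l_1, l_2, l_3, l_4, l_5, l_6, 0], \quad \text{or} \quad {}_2l^t = [0, l_2, l_3, l_4, l_5, l_6, 0],
\]
\[
{}_1l_s^t = {}_1l^t D_s, \quad \text{or} \quad {}_2l_s^t = {}_2l^t D_s,
\]
\[
{}_1l_c^t = {}_1l^t D_c, \quad \text{or} \quad {}_2l_c^t = {}_2l^t D_c,
\]

where for proximal Segment 1

\[
\begin{pmatrix} x_6^d \\ y_6^d \end{pmatrix}
= \begin{pmatrix} x_1^p \\ y_1^p \end{pmatrix}
+ \begin{pmatrix} {}_1l_c^t e \\ {}_1l_s^t e \end{pmatrix},
\qquad
\begin{pmatrix} \dot{x}_6^d \\ \dot{y}_6^d \end{pmatrix}
= \begin{pmatrix} \dot{x}_1^p \\ \dot{y}_1^p \end{pmatrix}
+ \begin{pmatrix} -{}_1l_s^t \omega \\ {}_1l_c^t \omega \end{pmatrix},
\]

and renaming the velocities as $u$ and $v$, scripted appropriately,

\[
\begin{pmatrix} \dot{u}_6^d \\ \dot{v}_6^d \end{pmatrix}
= \begin{pmatrix} \dot{u}_1^p \\ \dot{v}_1^p \end{pmatrix}
+ \begin{pmatrix} -{}_1l_s^t \dot{\omega} \\ {}_1l_c^t \dot{\omega} \end{pmatrix}
+ \begin{pmatrix} -{}_1l_c^t \omega^2 \\ -{}_1l_s^t \omega^2 \end{pmatrix}.
\]

Similarly for distal Segment 6 measured from proximal Segment 2.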
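The relation between the swing toe and the stance contact point can be sketched as below. The acceleration formula is obtained by differentiating the position of the distal end of Segment 6 twice, which gives $\dot{u}_6^d = \dot{u}_1^p - l_s^t \dot{\omega} - l_c^t \omega^2$ and $\dot{v}_6^d = \dot{v}_1^p + l_c^t \dot{\omega} - l_s^t \omega^2$. The function names are hypothetical; `start=1` or `start=2` selects the row vector measured from proximal Segment 1 or proximal Segment 2.

```python
import math


def chain_lengths(l, start=1):
    """Row vector [l1, ..., l6, 0] (start=1) or [0, l2, ..., l6, 0] (start=2)."""
    row = list(l[:6]) + [0.0]
    if start == 2:
        row[0] = 0.0
    return row


def toe_position(theta, row, xp, yp):
    """Distal end of Segment 6 measured from the chosen proximal point (xp, yp)."""
    x = xp + sum(li * math.cos(t) for li, t in zip(row, theta))
    y = yp + sum(li * math.sin(t) for li, t in zip(row, theta))
    return x, y


def toe_acceleration(theta, omega, omega_dot, row, du, dv):
    """Acceleration of the distal end of Segment 6, given that of the proximal point."""
    ls_wd = sum(li * math.sin(t) * a for li, t, a in zip(row, theta, omega_dot))
    lc_wd = sum(li * math.cos(t) * a for li, t, a in zip(row, theta, omega_dot))
    ls_w2 = sum(li * math.sin(t) * w * w for li, t, w in zip(row, theta, omega))
    lc_w2 = sum(li * math.cos(t) * w * w for li, t, w in zip(row, theta, omega))
    return du - ls_wd - lc_w2, dv + lc_wd - ls_w2


if __name__ == "__main__":
    l = [0.2, 0.4, 0.45, 0.45, 0.4, 0.2, 0.6]        # arbitrary illustrative lengths
    theta = [0.3, 1.2, 1.5, -1.4, -1.6, -0.2, 1.6]
    omega = [0.5, -0.3, 0.8, 1.1, -0.7, 0.2, 0.1]
    omega_dot = [0.1, 0.4, -0.6, 0.9, 0.2, -0.3, 0.0]
    print(toe_acceleration(theta, omega, omega_dot, chain_lengths(l), 0.0, 0.0))
```

In the double support phase both the stance contact point and the swing toe are fixed, so setting both accelerations to zero turns this relation into two scalar constraints on $\dot{\omega}$ and $\omega^2$.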


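The complete equations for non-heel contact are:

\[
\dot{\theta} = \omega, \qquad
\begin{pmatrix} \dot{x}_1^p \\ \dot{y}_1^p \end{pmatrix} = \begin{pmatrix} u_1 \\ v_1 \end{pmatrix}, \qquad
\begin{pmatrix} \dot{x}_6^d \\ \dot{y}_6^d \end{pmatrix} = \begin{pmatrix} u_6 \\ v_6 \end{pmatrix},
\]

\[
\begin{bmatrix}
J & 0 & 0 & 0 & 0 & M_{px} & M_{dx} & M_{py} & M_{dy} \\
-l_s^t & 1 & 0 & -1 & 0 & 0^t & 0^t & 0^t & 0^t \\
l_c^t & 0 & 1 & 0 & -1 & 0^t & 0^t & 0^t & 0^t \\
0^t & 1 & 0 & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
0^t & 0 & 1 & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
0^t & 0 & 0 & 1 & 0 & 0^t & 0^t & 0^t & 0^t \\
0^t & 0 & 0 & 0 & 1 & 0^t & 0^t & 0^t & 0^t \\
J^y & m & 0 & 0 & 0 & I & I & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & A^p & A^d & 0 & 0 \\
-J^x & 0 & m & 0 & 0 & 0 & 0 & I & I \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & A^p & A^d
\end{bmatrix}
\begin{bmatrix}
\dot{\omega} \\ \dot{u}_1 \\ \dot{v}_1 \\ \dot{u}_6 \\ \dot{v}_6 \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy}
\end{bmatrix}
=
\begin{bmatrix}
T\tau \\ l_c^t \omega^2 \\ l_s^t \omega^2 \\ 0 \\ 0 \\ 0 \\ 0 \\ J^x \omega^2 \\ 0 \\ -J^y \omega^2 + mg \\ 0
\end{bmatrix}.
\]

The block columns have widths $n, 1, 1, 1, 1, n, n-1, n, n-1$ and the block rows have
heights $n, 1, 1, 1, 1, 1, 1, n, n-2, n, n-2$. The fourth and fifth block rows are labelled (a)
and the sixth and seventh block rows are labelled (b).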

The above can be adjusted further to accommodate different cases such as the
single and double support phase.
The cases are broken down into:
• Case 1 (double-support phase) where Case 1A considers a scenario of two
external contacts at joint 2 and 7, and Case 1B considers a scenario of two
external contacts at joint 1 and 6
• Case 2 (free flight) considers a scenario where there is no contact with the ground
• Case 3 (single support phase with one contact at swing heel) where Case 3A
considers a scenario of one contact point at proximal Segment 2 and Case 3B
considers a scenario of one contact point at proximal Segment 6
• Case 4 (single support phase with one contact at stance toe) where Case 4A
considers a scenario of one contact point at proximal Segment 1 and Case 4B
considers a scenario of one contact point at distal Segment 6.
These cases, which show how the matrices are modified depending on whether the
weight is on the heel or toe of the foot, can be found in the Appendix.

1.3 Methods and Procedures

This study applied conventional inverse dynamics (Winter 2009) to obtain initial
estimates of the joint torques, which were then used in dynamic optimisation. The dynamic
optimisation approach was adopted to compute the joint torque trajectories that
produce the observed motion. Since we were concerned only with the single support
and double support phases of walking, and Segment 1 was taken to begin at the
stance foot, our focus is on Case 4A for the single support phase and Case 1B for
the double support phase.
An alternative formulation of the conventional method of inverse dynamics, described
in detail later in this section, was used to determine the initial estimates of the joint
torques, so that a more realistic set of joint torque histories is generated. Using torque
histories specific to the movement pattern of the subject improves the convergence of
the optimisation (Chao and Rim 1973). The bounds for each control are determined using
the maximum and minimum estimates obtained from the inverse dynamics approach. Since
these estimates provide approximate torque histories that are specific to the movement
pattern, this ensures that the optimised torque trajectories are realistic.
For the present study, 18 states ($x = [x_1, x_2, \ldots, x_{18}]^\top$), 15 system parameters
($z = [z_1, z_2, \ldots, z_{15}]^\top$), and 7 controls ($\tau = [\tau_1, \tau_2, \ldots, \tau_7]^\top$), namely the joint torques,
were set up in MISER3.3 (Jennings et al. 2000). The 18 states consist of the angular
displacements of Segment 1 to Segment 7 ($x_i = \theta_i$, $i = 1, \ldots, 7$), the angular
velocities ($x_i = \omega_{i-7}$, $i = 8, \ldots, 14$), and the coordinates and velocities of the proximal end of
Segment 1 ($(x_{15}, x_{16}, x_{17}, x_{18}) = (x_1^p, y_1^p, \dot{x}_1^p, \dot{y}_1^p)$). The system parameters consist
of the initial segment angular orientations ($\theta_i(0) = z_i$, $i = 1, \ldots, 7$), the initial angular
velocities at the start of the single support phase ($\omega_i(0) = z_{7+i}$), and $z_{15}$, the step length,
which is twice the initial distance between $x_1^p(0)$ and $x_6^d(0)$ and hence dependent
on ($z_1, \ldots, z_6$). Placing tight bounds on the system parameters allows the initial
conditions to vary by a small amount about the initial values of the data.
The variables $\theta_1, \theta_2, \theta_3$ describe the angular displacements of the stance leg,
beginning with the foot, shank and thigh segments respectively; $\theta_4, \theta_5, \theta_6$ describe
the angular displacements of the swing leg for the thigh, shank and foot segments
respectively; and $\theta_7$ describes the angular displacement of the trunk segment.
$\omega_1, \ldots, \omega_7$ are the segments' corresponding velocities, and ($x_1^p, y_1^p$) are the
coordinates of the proximal end of Segment 1 (toe of stance foot), which remains
stationary on the ground during one step of the walk cycle.
The experiment to simulate normal walking and obtain more precise joint torque
estimates involved two parts. In the first part, the forward dynamics
of the seven-segment model during the single support phase (Case 4A) was optimised
for the joint torques and the initial values of $\theta$ and $\omega$ such that the computed $\theta(\tau, z, t)$
trajectories produced motion similar to normal walking. In doing so, the body had
to be kept upright and prevented from falling under gravity. This was done by keeping the
y-coordinate of the CoM close to its initial value (at $t = 0$). Our initial
objective function is thus given by

\[
G_0(\tau, z) = \int_0^{T_1} \left( \mathrm{CoMy}_{\mathrm{pos}} - \mathrm{CoMy}_{\mathrm{init}} \right)^2 dt,
\]

where $T_1$ (= 0.386 s) is the duration of the single support phase, $\mathrm{CoMy}_{\mathrm{pos}}$ is the
y-coordinate of the center of mass, a function of $(\tau, z, t)$, and $\mathrm{CoMy}_{\mathrm{init}}$ is the initial
y-coordinate of the center of mass, a function of $z$, as calculated by MISER3.3.
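
To make the objective concrete, the sketch below approximates the integral of the squared deviation of the CoM height with the trapezoidal rule on sampled data. This is only an illustration of the quantity being minimised: MISER3.3 evaluates the objective internally, and the function name, the sampling and the made-up linear drift of the CoM height are assumptions.

```python
def com_height_objective(times, com_y, com_y_init):
    """Trapezoidal approximation of the integral of (CoMy - CoMy_init)^2 dt."""
    total = 0.0
    for k in range(len(times) - 1):
        e0 = (com_y[k] - com_y_init) ** 2
        e1 = (com_y[k + 1] - com_y_init) ** 2
        total += 0.5 * (e0 + e1) * (times[k + 1] - times[k])
    return total

T1 = 0.386                       # duration of the single support phase (s)
n = 200                          # number of sampling intervals (assumed)
times = [T1 * k / n for k in range(n + 1)]

# Hypothetical CoM height drifting linearly by 2 cm over the phase.
com_y = [1.14 + 0.02 * t / T1 for t in times]

g0 = com_height_objective(times, com_y, com_y[0])
exact = 0.02 ** 2 * T1 / 3.0     # closed form for the linear drift above
```

A trajectory that keeps the CoM at its initial height gives a value of zero, so smaller values correspond to a more upright posture throughout the phase.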
The objective function is subject to constraints in the canonical form:

\[
G_k(u, z) = \Phi_k(x(t_k), z) + \int_0^{t_k} g_k(t, x(t), u(t), z)\, dt
\; \left\{ \begin{array}{c} = \\ \ge \end{array} \right\} \; 0, \qquad k = 1, \ldots, n_{gc},
\]

where $n_{gc}$ is the total number of canonical constraints, and $t_k \in (0, t_f]$ is a
known constant referred to as the 'characteristic time' associated with the
constraint $G_k$. All-time constraints $h(t, x, u, z) \ge 0$, as well as constraints involving the
system parameters, $g_k(z)$, are converted by MISER3.3 to canonical constraints. See
Jennings et al. (2000) for more details.
The objective function is subject to the following constraints:
• Two terminal state equality constraints at the end of the single support phase ($T_1 =
0.386$ s), which position the swing foot such that the swing toe is pivoting upwards and
the heel strikes the ground, marking the end of the single support phase. These are
given by

\[
g_1 \equiv 0, \quad \Phi_1(\theta(T_1), z) = a_{11}^2 + a_{12}^2 = 0,
\]
\[
\text{where} \quad a_{11} = 0.4794 - \sum_{i=1}^{5} l_i \cos(\theta_i(T_1)) + 0.1 \cos(\theta_6(T_1) - 1.57),
\]
\[
a_{12} = -\sum_{i=1}^{5} l_i \sin(\theta_i(T_1)) + 0.1 \sin(\theta_6(T_1) - 1.57);
\]
\[
g_2 \equiv 0, \quad \Phi_2(\theta(T_1), z) = a_{21}^2 + a_{22}^2 = 0,
\]
\[
\text{where} \quad a_{21} = 0.7039 - \sum_{i=1}^{6} l_i \cos(\theta_i(T_1)),
\]
\[
a_{22} = 0.0751 - \sum_{i=1}^{6} l_i \sin(\theta_i(T_1)).
\]

Remark. The sum of squares of two constraints equal to zero was used to
reduce the total number of constraints and to allow more leeway in reducing
the individual constraints to zero.

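• Two all-time constraints on the swing toe and swing heel, such that the swing
foot does not penetrate the ground, are given by

\[
\Phi_3 \equiv 0, \quad h_3(\theta(t), z, t) = \sum_{i=1}^{6} l_i \sin(\theta_i(t)) \ge 0, \quad \text{and}
\]
\[
\Phi_4 \equiv 0, \quad h_4(\theta(t), z, t) = \sum_{i=1}^{5} l_i \sin(\theta_i(t)) + 0.1 \sin(\theta_6(T_1) - 1.57) \ge 0.
\]

• Two all-time constraints on the knees, to prevent the knees from hyperextending, are
given by

\[
\Phi_5 \equiv 0, \quad h_5(\theta(t), z, t) = \theta_3(t) - \theta_2(t) \ge 0, \quad \text{and}
\]
\[
\Phi_6 \equiv 0, \quad h_6(\theta(t), z, t) = \theta_4(t) - \theta_5(t) \ge 0.
\]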

Remark. Segmented rigid body models moving under the effect of gravity can spin
past a natural limit during the line search of the optimisation computation.
Said et al. (2006) found that hyperextension modelling done
“automatically” in the dynamics introduced a large force to restore
the joint when it came close to hyperextension. However, this was not used in this
research as it creates “stiff” differential equations.
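• An all-time constraint on the trunk, to prevent the trunk from falling forward or
backward, is given by

\[
\Phi_7 \equiv 0, \quad h_7(\theta(t), z, t) = 0.3211^2 - (\theta_7(t) - 1.5376)^2 \ge 0.
\]

• Four all-time constraints on the ankles, to prevent the ankles from hyperextending,
are given by

\[
\Phi_8 \equiv 0, \quad h_8(\theta(t), z, t) = \theta_1(t) - \theta_2(t) \ge 0,
\]
\[
\Phi_9 \equiv 0, \quad h_9(\theta(t), z, t) = \theta_2(t) - \theta_1(t) + 1.57 \ge 0,
\]
\[
\Phi_{10} \equiv 0, \quad h_{10}(\theta(t), z, t) = \theta_6(t) - \theta_5(t) \ge 0, \quad \text{and}
\]
\[
\Phi_{11} \equiv 0, \quad h_{11}(\theta(t), z, t) = \theta_5(t) - \theta_6(t) + 1.57 \ge 0.
\]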

• Two constraints involving system parameters, to ensure that the stance foot is flat on the
ground and to position the swing foot at time $t = 0$, are given by

\[
g_{z1}(z) = l_1 \sin(z_1) + 0.1 \sin(z_1 + 1.57) = 0, \quad \text{and}
\]
\[
g_{z2}(z) = 0.7647 - \sum_{i=1}^{6} l_i \cos(z_i) = 0.
\]
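
The knee, trunk and ankle constraints listed above are simple functions of the segment angles, so their feasibility at a given configuration is easy to check. The following sketch evaluates $h_5$ to $h_{11}$ as read from the list above and reports which are violated. The function names and the two test configurations are hypothetical, and the check is a plain pointwise test rather than the constraint handling performed inside MISER3.3.

```python
def all_time_constraints(theta):
    """Return h5..h11 for segment angles theta = [theta_1, ..., theta_7] (rad)."""
    t1, t2, t3, t4, t5, t6, t7 = theta
    return {
        "h5": t3 - t2,                              # stance knee
        "h6": t4 - t5,                              # swing knee
        "h7": 0.3211 ** 2 - (t7 - 1.5376) ** 2,     # trunk stays near upright
        "h8": t1 - t2,                              # stance ankle, lower limit
        "h9": t2 - t1 + 1.57,                       # stance ankle, upper limit
        "h10": t6 - t5,                             # swing ankle, lower limit
        "h11": t5 - t6 + 1.57,                      # swing ankle, upper limit
    }

def violated(theta):
    """Names of the constraints with h < 0 at this configuration."""
    return [name for name, h in all_time_constraints(theta).items() if h < 0.0]

# Hypothetical configurations (illustrative angles only).
feasible = [2.0, 1.5, 1.7, 4.9, 4.5, 5.0, 1.55]
hyperextended_knee = [2.0, 1.5, 1.2, 4.9, 4.5, 5.0, 1.55]

ok = violated(feasible)
bad = violated(hyperextended_knee)
```

In the second configuration only the stance knee constraint fails, because the thigh angle has dropped below the shank angle.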
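The optimisation process was usually CPU-time-consuming, with numerous
failures along the way. At each failure, the constraints were checked to determine
whether they were satisfied before the optimisation was restarted. It was noted that the
large number of ankle constraints made it difficult to obtain satisfactory results;
hence, following a method proposed by Rehbock et al. (1996), the constraints were added to
the objective function as a penalty on being negative. The new objective function
was thus given by

\[
\mathrm{ankle}_1 = \begin{cases}
\theta_1(\tau, z, t) - \theta_2(\tau, z, t), & \text{if } \theta_1 - \theta_2 < 0, \\
0, & \text{if } \theta_1 - \theta_2 \ge 0,
\end{cases}
\]
\[
\mathrm{ankle}_2 = \begin{cases}
\theta_2(\tau, z, t) - \theta_1(\tau, z, t) + 1.57, & \text{if } \theta_2 - \theta_1 + 1.57 < 0, \\
0, & \text{if } \theta_2 - \theta_1 + 1.57 \ge 0,
\end{cases}
\]
\[
\mathrm{ankle}_3 = \begin{cases}
\theta_6(\tau, z, t) - \theta_5(\tau, z, t), & \text{if } \theta_6 - \theta_5 < 0, \\
0, & \text{if } \theta_6 - \theta_5 \ge 0,
\end{cases}
\]
\[
\mathrm{ankle}_4 = \begin{cases}
\theta_5(\tau, z, t) - \theta_6(\tau, z, t) + 1.57, & \text{if } \theta_5 - \theta_6 + 1.57 < 0, \\
0, & \text{if } \theta_5 - \theta_6 + 1.57 \ge 0,
\end{cases}
\]
\[
G_0(\tau, z) = \int_0^{T_1} \left[ \left( \mathrm{CoMy}_{\mathrm{pos}} - \mathrm{CoMy}_{\mathrm{init}} \right)^2
+ 1{,}000 \left( |\mathrm{ankle}_1| + |\mathrm{ankle}_2| + |\mathrm{ankle}_3| + |\mathrm{ankle}_4| \right) \right] dt
\]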

where (ankle1 ; ankle2 ; ankle3 ; ankle4 ) are the ankle constraints and only the con-
straints that are not satisfied are taken in the objective function. The final set of
controls from this optimisation was then used as the initial joint torques estimates
for the optimisation studies in the second part of the experiment.
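
The penalised integrand described above can be sketched as follows. Each ankle term contributes only when its constraint is negative, and the weight of 1,000 is the one quoted in the objective. The function names and the sample angles are hypothetical; the code illustrates the form of the penalty and is not the MISER3.3 implementation.

```python
def negative_part(h):
    """Return h when the constraint is violated (h < 0), otherwise 0."""
    return h if h < 0.0 else 0.0

def ankle_penalty(theta, weight=1000.0):
    """Weighted sum of |ankle_k| over the four ankle constraints."""
    t1, t2, t5, t6 = theta[0], theta[1], theta[4], theta[5]
    terms = [
        t1 - t2,            # ankle_1
        t2 - t1 + 1.57,     # ankle_2
        t6 - t5,            # ankle_3
        t5 - t6 + 1.57,     # ankle_4
    ]
    return weight * sum(abs(negative_part(h)) for h in terms)

def penalised_integrand(com_y, com_y_init, theta):
    """Integrand of the modified objective at one time instant."""
    return (com_y - com_y_init) ** 2 + ankle_penalty(theta)

# Hypothetical angles: the first set satisfies all ankle constraints,
# the second violates ankle_1 by 0.1 rad.
theta_ok = [2.0, 1.5, 1.7, 4.9, 4.5, 5.0, 1.55]
theta_bad = [1.4, 1.5, 1.7, 4.9, 4.5, 5.0, 1.55]

p_ok = ankle_penalty(theta_ok)
p_bad = ankle_penalty(theta_bad)
```

Because satisfied constraints contribute nothing, the modified objective coincides with the original one whenever all four ankle constraints hold.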
In the second part of the experiment, the second phase of the walk cycle, the double
support phase (Case 1B), was incorporated with the first part for the rest of the time
interval ($T = 0.486$ s). The forward dynamics of the seven-segment model was now optimised
to simulate normal walking for both the single support and double support phases
of walking. However, as the forward dynamics changed from the single support phase
to the double support phase, the collision of the foot with the ground resulted in jump
conditions on the model velocities (Hardt et al. 1999). The dynamics in the model
computed in MISER3.3 allow the state to jump at particular times $\tau_j$; hence the
state equations have the form

\[
\dot{x}(t) = \begin{cases}
f^1(t, x, u, z), & t \in [t_s, \tau_1), \quad x(t_s) = x^0(z), \\
f^2(t, x, u, z), & t \in [\tau_1, \tau_2), \quad x(\tau_1) = h^1(x(\tau_1^-), z), \\
\quad \vdots \\
f^p(t, x, u, z), & t \in [\tau_{p-1}, \tau_p), \quad x(\tau_{p-1}) = h^{p-1}(x(\tau_{p-1}^-), z),
\end{cases}
\]

where $x(\tau_j)$ are the new states at time $\tau_j$. As the ankle constraints were satisfied
in the first part of the experiment, they were removed from the objective function.
Thus, the objective function is now similar to the original function, given by

\[
G_0(\tau, z) = \int_0^{T_f} \left( \mathrm{CoMy}_{\mathrm{pos}} - \mathrm{CoMy}_{\mathrm{init}} \right)^2 dt
\]

with the following state equations:

\[
\dot{x}(t) = \begin{cases}
f^1(t, x, u, z), & t \in [0, T_1), \quad x(0) = x^0(z), \\
f^2(t, x, u, z), & t \in [T_1, T_f), \quad x(T_1) = h^1(x(T_1^-), z),
\end{cases}
\]

where $T_1$ (= 0.386 s) is the duration of the single support phase, $T_f$ (= 0.486 s) is
the duration of a step (single support and double support phase) and $h^1(x(T_1^-), z)$
defines the new states governing the start of the double support phase. However, as
only the angular velocities experience jumps, the angular displacements and the position
and velocity of proximal Segment 1 are defined as $x_i(T_1) = x_i(T_1^-)$, $i = 1, \ldots, 7$
and $i = 15, \ldots, 18$, while $x_i(T_1)$, $i = 8, \ldots, 14$, are newly defined. The objective
function is now subject to the same all-time constraints as before, except that they
now apply over the full time interval from $t = 0$ to $t = T_f$ (0.486 s). It is, however, subject
to different terminal constraints:
• A terminal constraint at the end of the single support phase ($T_1 = 0.386$ s) to
determine heel-strike of the swing foot, such that $x_6^d(T_f)$ is twice the step length of
$x_1^p(0)$ and $x_6^d(0)$ at initial time 0, is given by

\[
g_1 \equiv 0, \quad \Phi_1(\theta(T_1), z) = a_{11}^2 + a_{12}^2 = 0,
\]
\[
\text{where} \quad a_{11} = (z_{15} - 0.2368) - \sum_{i=1}^{5} l_i \cos(\theta_i(T_1)) + 0.1 \cos(\theta_6(T_1) - 1.57),
\]
\[
a_{12} = -\sum_{i=1}^{5} l_i \sin(\theta_i(T_1)) + 0.1 \sin(\theta_6(T_1) - 1.57).
\]

Remark. Foot length was defined to be from heel to toe (= 0.2368 m) and
was estimated from segment parameters, Segment 1/6 and length of heel (from
ankle), since the foot segment was assumed to take on the shape of a right-angled
triangle.
• Two terminal constraints at the end of the double support phase ($T_f = 0.486$ s), such
that the swing foot is on the ground and does not slide, are given by

\[
g_2 \equiv 0, \quad \Phi_2(\theta(T_f), z) = b_{11}^2 + b_{12}^2 = 0,
\]
\[
\text{where} \quad b_{11} = (z_{15} - 0.2368) - \sum_{i=1}^{5} l_i \cos(\theta_i(T_f)) + 0.1 \cos(\theta_6(T_f) - 1.57),
\]
\[
b_{12} = -\sum_{i=1}^{5} l_i \sin(\theta_i(T_f)) + 0.1 \sin(\theta_6(T_f) - 1.57);
\]
\[
g_3 \equiv 0, \quad \Phi_3(\theta(T_f), z) = b_{21}^2 + b_{22}^2 = 0,
\]
\[
\text{where} \quad b_{21} = z_{15} - \sum_{i=1}^{6} l_i \cos(\theta_i(T_f)),
\]
\[
b_{22} = -\sum_{i=1}^{6} l_i \sin(\theta_i(T_f)).
\]

• A terminal constraint at the end of the double support phase ($T_f = 0.486$ s) on
the trunk angular displacement. In order to obtain periodicity of the walking
motion, the trunk itself must have a periodic motion; thus the constraint is given
by

\[
g_4 \equiv 0, \quad \Phi_4(\theta(T_f), z) = z_7 - \theta_7(T_f) = 0.
\]

• Constraints on system parameters are similar to those in the single support phase, with a
slight change to the constraint that positions the swing foot, which is given by

\[
g_{z2}(z) = z_{15} - \sum_{i=1}^{6} l_i \cos(z_i) = 0.
\]
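
The two-phase state equations with a jump at $T_1$, as used in this second part of the experiment, can be illustrated on a toy system. The sketch below integrates a first phase up to $T_1$, applies a jump map that leaves the angle continuous and resets the angular velocity, and then integrates a second phase up to $T_f$. Everything apart from the two durations is an assumption made for illustration: the single-angle state, the explicit Euler scheme, the trivial dynamics and the halving of the velocity at impact are not the chapter's model.

```python
def euler(f, x, t0, t1, steps):
    """Explicit Euler integration of dx/dt = f(t, x) from t0 to t1."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        x = [xi + dt * di for xi, di in zip(x, f(t, x))]
        t += dt
    return x

def simulate_step(x0, f1, f2, jump, T1, Tf, steps=500):
    """Phase 1 on [0, T1), state jump at T1, then phase 2 on [T1, Tf)."""
    x_minus = euler(f1, list(x0), 0.0, T1, steps)   # state just before T1
    x_plus = jump(x_minus)                          # x(T1) = h(x(T1^-))
    return euler(f2, x_plus, T1, Tf, steps)

# Toy state [theta, omega] with constant angular velocity in each phase.
def free_rotation(t, x):
    return [x[1], 0.0]

# Hypothetical impact map: angle continuous, angular velocity halved.
def impact(x):
    return [x[0], 0.5 * x[1]]

T1, Tf = 0.386, 0.486
final_state = simulate_step([0.0, 2.0], free_rotation, free_rotation, impact, T1, Tf)
```

Only the velocity component changes across the jump, mirroring the statement that the angular displacements are continuous at $T_1$ while the angular velocities are newly defined.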

1.3.1 Inverse Analysis

In order to solve for initial joint torques for dynamic optimisation, the conventional
method of inverse dynamics (Winter 1990) was adopted. The dynamic equations
in Case 4A and Case 1B were rearranged to solve for joint moments and reaction
forces from kinematic data of segments.
Rearranging Case 4A:

\[
\begin{bmatrix}
-T & M^{px} & M^{dx} & M^{py} & M^{dy} \\
0 & I & I & 0 & 0 \\
0 & A^p & A^d & 0 & 0 \\
0 & 0 & 0 & I & I \\
0 & 0 & 0 & A^p & A^d
\end{bmatrix}
\begin{bmatrix} \tau \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} -J\dot{\omega} \\ J^x \omega^2 - J^y \dot{\omega} \\ 0 \\ J^y \omega^2 + mg + J^x \dot{\omega} \\ 0 \end{bmatrix}.
\]
The coefficient matrix is now block upper triangular, and $\tau$ was obtained easily
at each discrete data time $t_i$ (static inverse analysis) after solving for the reaction
force vectors. As the matrix is square and invertible, a unique set of torques was
computed. This meant that for $n$ segments there should be $n$ torques between
segments. However, this was not the case for Case 1B.
Even though the coefficient matrix of the dynamic equations in Case 1B is square, when
rearranged it becomes non-invertible, with two rows of zeros.
Rearranging Case 1B:

\[
\begin{bmatrix}
-T & 0 & 0 & M^{px} & M^{dx} & M^{py} & M^{dy} \\
0^t & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
0^t & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
0 & 0 & 0 & I & I & 0 & 0 \\
0 & e_5 & 0 & A^p & A^d & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & I & I \\
0 & 0 & e_5 & 0 & 0 & A^p & A^d
\end{bmatrix}
\begin{bmatrix} \tau \\ f_6^{ex} \\ f_6^{ey} \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix}
-J\dot{\omega} \\
-{}_4 l_c^t \omega^2 - {}_4 l_s^t \dot{\omega} \\
-{}_4 l_s^t \omega^2 + {}_4 l_c^t \dot{\omega} \\
J^x \omega^2 - J^y \dot{\omega} \\
0 \\
J^y \omega^2 + mg + J^x \dot{\omega} \\
0
\end{bmatrix}.
\]

As there were more unknowns than independent equations, the system is underdetermined,
and a generalised inverse method was adopted to find a set of suitable
torques. This was done by considering the equations to be of the form $Ax = y$,
where we simply ignored the two zero-row equations (the two corresponding entries in the RHS
should be zero), and computing

\[
x = A^t (A A^t)^{-1} y =: A^+ y,
\]

where $A^+$ is called the pseudo-inverse of $A$. The solution obtained is unique,
being the smallest-norm solution among the infinite number of solutions
to the underdetermined system. However, we only required an initial estimate of the
joint torques so that a more realistic set of joint torques could be generated from
dynamic optimisation. If the external forces were known from force plate data, the
correct torques and forces could be uniquely computed.
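
As a small worked illustration of the minimum-norm formula $x = A^t (A A^t)^{-1} y$, the following sketch solves a hypothetical system of two equations in three unknowns using only the Python standard library. The matrix and right-hand side are invented for the example and are unrelated to the walking model; the helper names are likewise assumptions, and $A$ is assumed to have full row rank so that $A A^t$ is invertible.

```python
def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def min_norm_solution(A, y):
    """Return x = A^t (A A^t)^{-1} y for a full-row-rank matrix A."""
    m, n = len(A), len(A[0])
    AAt = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
           for i in range(m)]
    w = solve(AAt, y)                       # w = (A A^t)^{-1} y
    return [sum(A[i][k] * w[i] for i in range(m)) for k in range(n)]

# Hypothetical underdetermined system: 2 equations, 3 unknowns.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
y = [2.0, 3.0]

x = min_norm_solution(A, y)
residual = [sum(A[i][k] * x[k] for k in range(3)) - y[i] for i in range(2)]
```

For this system the result is $x = (1/3, 5/3, 4/3)$, which satisfies both equations exactly and has the smallest Euclidean norm of all solutions.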

1.4 Numerical Results

Experiment Main solved the problem by minimising the objective, $G_0$, that is, the
y-CoM displacement through the entire walk cycle, subject to constraints. These
constraints include four terminal state equality constraints: one at the end of the single
support phase on the swing foot heel ($\Phi_1$), and three at the end of the double support phase
($\Phi_2, \Phi_3, \Phi_4$), two on the swing foot and one on the trunk angular displacement. There are
also nine all-time constraints ($h_i$, where $i = 5, \ldots, 13$) on the swing toe and heel, knees,
trunk and ankles, and two system parameter constraints ($g_{z1}, g_{z2}$) to position the stance
foot and swing foot at $t = 0$.
Figure 1.4 shows the resulting stick diagram of a 2D human walking on level
ground, including the motion in the single support and double support phases.
The algorithm produced 137 time points for the optimised simulation but, for clarity,
only 21 were chosen at equal time intervals ($\Delta t = 0.025$ s) to depict the movement for
comparative purposes. A linear interpolation was done on the $(x, y)$ data for plotting
purposes to evenly distribute the intervals, as the single support phase held a larger
percentage of the walk cycle time than the double support phase. Figure 1.5 plots
the first and last positions of the walk cycle, and shows that the periodicity conditions
might have to be worked on for the last position to more accurately mirror the start
position.

Fig. 1.4 Simulation of optimised walking motion – experiment main (both axes in metres)

Fig. 1.5 Start and end of walk cycle – experiment main. Dotted figure is the initial configuration
with final swing toe and initial stationary toe at the same position (both axes in metres; red is swing;
final time 0.486 s)

Table 1.1 Constraint values and corresponding $\varepsilon$ value

Constraints         G1       G2       G3       G4
Values              4.56e-8  6.60e-7  2.92e-7  4.37e-7

Constraints         G5    G6       G7    G8    G9    G10   G11   G12   G13
$g_\varepsilon(h)$  0     1.15e-8  0     0     0     0     0     0     0
$\varepsilon$ value 1e-6  9.86e-7  1e-6  1e-6  1e-6  1e-6  1e-6  1e-6  1e-6
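
The resampling step described above, in which unevenly spaced simulation output is linearly interpolated onto equally spaced times for plotting, can be sketched as follows. The function name and the sample data are hypothetical; only the idea of interpolating onto a uniform grid is taken from the text.

```python
def resample_linear(times, values, n_points):
    """Linearly interpolate (times, values) onto n_points equally spaced times."""
    t0, t1 = times[0], times[-1]
    out_t = [t0 + (t1 - t0) * k / (n_points - 1) for k in range(n_points)]
    out_v = []
    j = 0
    for t in out_t:
        # Advance to the interval [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out_v.append((1.0 - w) * values[j] + w * values[j + 1])
    return out_t, out_v

# Hypothetical unevenly spaced samples of a coordinate that varies linearly in time.
times = [0.0, 0.1, 0.4, 0.5]
values = [2.0 * t for t in times]

grid_t, grid_v = resample_linear(times, values, 6)
```

Applied to each coordinate of the stick figure in turn, the same routine would give frames at a fixed time spacing regardless of how the integrator distributed its output points.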
As the initial joint moment estimates derived from the alternative inverse dynamics
formulation were not very close to the real optimum, a large number of iterations
were involved. The computation was eventually able to reach a satisfactory
solution that satisfied the constraint conditions. The final values of the canonical
constraints are given in Table 1.1. $G_1$ to $G_4$ are terminal time constraints,
which were required to converge to 0. Their values were small enough to be
approximated to 0 and were considered satisfied by MISER3.3. $G_5$ to $G_{13}$ are all-time
constraints, which had to satisfy the algorithm. A constraint value of 0 occurs when
$h \ge \varepsilon\ (= 10^{-6})$, so that $g_\varepsilon(h) = 0$ and the constraint is satisfied. The $G_6$ constraint
had $h < \varepsilon$; however, it was still considered satisfied. The nonzero $g_\varepsilon$ value of the $G_6$
constraint can be explained by the fact that, from the moment the swing heel struck the
ground at $t = 0.386$ s until $t = 0.486$ s, the y-position of the swing heel had to be 0, or in
this case approximately 0.
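
The behaviour of $g_\varepsilon(h)$ described above, which is zero once $h$ reaches $\varepsilon$, matches a smoothing of $\min(h, 0)$ that is common in the constraint transcription literature. The sketch below implements that commonly cited form; whether MISER3.3 uses exactly this expression is an assumption here, and the function name is hypothetical.

```python
def smoothed_constraint(h, eps=1e-6):
    """Smooth approximation of min(h, 0) for an all-time constraint h >= 0.

    Assumed form: h for h < -eps, -(h - eps)^2 / (4 eps) for |h| <= eps,
    and 0 for h > eps.
    """
    if h > eps:
        return 0.0
    if h < -eps:
        return h
    return -((h - eps) ** 2) / (4.0 * eps)

# Values at a few representative points.
at_eps = smoothed_constraint(1e-6)        # boundary of the smoothing band
at_zero = smoothed_constraint(0.0)        # constraint active, h = 0
far_violated = smoothed_constraint(-1.0)  # clearly violated
```

Under this form, a constraint that sits exactly on its boundary ($h = 0$) over part of the interval yields a small negative value rather than zero, which is consistent with a contact condition such as a heel resting on the ground.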
A closer look at the forces acting on the hip is given in Fig. 1.6. It was
observed that the forces from the two thighs (Segment 3 and Segment 4) had to
balance out the force from Segment 7. A larger force is required in the horizontal
direction from $t = 0.386$ s onwards, as it prevents the body from continually
moving forward during the double support phase.
External forces on the swing ankle occurred at the start of the double support phase,
when the swing heel struck the ground. During the double support phase, the swing
ankle experienced three forces, namely the distal and proximal forces
from Segment 5 and Segment 6 respectively, and the external force from the ground.
Figure 1.7 presents the forces acting on the ankle during double support. The external
forces acting on the swing ankle were observed to be equal and opposite to the
proximal and distal forces in both the x and y directions. It was also observed that most of
the force was contributed by Segment 5 rather than Segment 6, mainly because
Segment 5 was carrying the main weight of the body while Segment 6 carried only the
foot.
Figure 1.8 depicts the components of the vertical and horizontal forces on the
stance toe for a one-step cycle. As no ground force plate was used in this experiment,
the ground reaction force cannot be accurately predicted. However, a vertical reaction
force was observed, and a 9th order polynomial fitted to it showed the familiar
double-peak, also known as the “M-shaped”, pattern. Distribution
of the model's weight from the stance toe to the swing ankle could be seen at
heel-strike from the end of the single support phase through to the double support phase. A
high negative horizontal force kept the stance toe in position and prevented it from
moving, especially during the double support phase.

Fig. 1.6 Forces on hip at distal and proximal end of x and y at segments 3, 4 and 7. ($x_3^d, y_3^d$)
denotes (x, y)-coordinate of distal segment 3, ($x_4^p, y_4^p$) denotes (x, y)-coordinate of proximal segment
4, ($x_7^p, y_7^p$) denotes (x, y)-coordinate of proximal segment 7 (panels: horizontal force (N) and
vertical force (N) against time (s))

Fig. 1.7 External forces from ankle to ground during double support phase. (fe6x, fe6y) denotes external
forces acting on (x, y)-coordinate of proximal segment 6 (ankle) (external force (N) against time (s))

Fig. 1.8 Forces on stance toe ($x_1^p, y_1^p$). ($x_1^p, y_1^p$) denotes (x, y)-coordinate of proximal segment 1
(toe) (panels: force (N) during single support, with 9th order polyfit, and force (N) during double
support, against time (s))

Fig. 1.9 Torques and forces acting on segments (panels: torque (Nm) and forces (N) on segments 1 to 7
for model 4 and model 1, against time (s))
Figure 1.9 presents the torques and forces acting on segments to produce angular
acceleration. The torques and forces acting on each segment, including external
forces, were computed from the dynamics and plotted. It was observed that the
torques and forces on each segment for each phase were similar in pattern but were
opposite in direction to each other.
The final results of MISER3.3 and the data were plotted and compared.
Figure 1.10 presents the $(x, y)$ center of mass (CoM) trajectories and velocities,
while Fig. 1.11 presents the segment angular displacements from MISER3.3 plotted
against those from the original data. Slight differences were observed between the
original angular displacement data and the optimised results; however, the pattern
remains consistent. This was also observed for the CoM trajectories and velocities,
except for the sudden jump in velocities seen in the MISER3.3 results, which is due to the
jump condition implemented when the heel struck the ground to account for the abrupt
velocity change.

Fig. 1.10 Optimised and data CoM displacement and velocity (panels: CoM X and CoM Y displacement (m),
CoM X and CoM Y velocity (m s$^{-1}$), against time (s); curves: MISER3 and data)
Comparisons between the results obtained by MISER3.3 and inverse analysis
were made as well. The optimised set of joint torques obtained from MISER3.3
is illustrated in Figs. 1.12 and 1.13, plotted against the initial joint torque
estimates. It was noted that the joint torques in MISER3.3 had to be made piecewise
constant for computation due to the jump condition occurring at $t = 0.386$ s.
A piecewise linear approximation was not possible, as the simulation could not converge
to $\tau(0.386^-) = \tau(0.386^+)$. Inverse analysis estimated the joint torques for the single
support and double support phases separately, while MISER3.3 considered the one-step
walk cycle when solving for the joint torques, thus requiring the jump condition.
As observed, although the joint torques required to reproduce the walking motion
in Fig. 1.4, as estimated by MISER3.3, differed from those obtained by the
conventional method of inverse dynamics, they still followed a similar
pattern. Figure 1.14 illustrates the vertical reaction force on the stance toe during the
single support phase, obtained using the results of MISER3.3 and inverse analysis.
A double peak pattern that was noticed in Fig. 1.13 was also observed when a 9th
order polynomial was fitted to data obtained by inverse analysis.
Fig. 1.11 Optimised (MISER3.3) and data segment angular displacement (rad) of a one step cycle
(panels: stance foot, stance shank, stance thigh, swing thigh, swing shank, swing foot and HAT,
against time (s); curves: MISER3 and data)

1.5 Conclusions

In this paper, we have proposed a mathematical method to analyse human walking
behaviour using dynamic optimisation. A main advantage of the method developed
in the paper is that it works well even for a complex movement that involves a
change in the dynamics from single support phase to double support phase. The
overall research was able to simulate normal walking motion for a full walk cycle,
based on the model developed using MISER3.3.

Fig. 1.12 Joint torque trajectories of optimised (MISER3.3) and inverse dynamics (data) for single
support phase (panels: stance toe, stance ankle, stance knee, swing knee, swing ankle, hip and hip,
against time (s))

Fig. 1.13 Joint torque trajectories of optimised (MISER3.3) and inverse dynamics (data) for
double support phase (panels: stance toe, stance ankle, stance knee, swing knee, swing ankle, hip and
hip, against time (s))

Fig. 1.14 Vertical reaction force of optimised (MISER3.3) and inverse dynamics (data) during single
support phase (force (N) on $y_1^p$ against time (s); each curve shown with a 9th order polyfit)

Appendix A: Model Cases

1 Case 1A: Two External Contacts, at Joint 2 and Joint 7

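The dynamic equations are:

\[
\begin{bmatrix} \dot{\theta} \\ \dot{x}_1^p \\ \dot{y}_1^p \end{bmatrix} = \begin{bmatrix} \omega \\ u_1 \\ v_1 \end{bmatrix},
\]

\[
\begin{array}{r|ccccccccc}
 & n & 1 & 1 & 1 & 1 & n-1 & n-1 & n-1 & n-1 \\ \hline
n & J & 0 & 0 & 0 & 0 & M^{px} & M^{dx} & M^{py} & M^{dy} \\
1 & {}_3 l_s^t & 0 & 0 & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
1 & {}_1 l_s^t & -1 & 0 & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
1 & -{}_3 l_c^t & 0 & 0 & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
1 & -{}_1 l_c^t & 0 & 0 & -1 & 0 & 0^t & 0^t & 0^t & 0^t \\
n & J^y & m & 0 & 0 & 0 & I & I & 0 & 0 \\
n-2 & 0 & 0 & e_1 & 0 & 0 & A^p & A^d & 0 & 0 \\
n & -J^x & 0 & 0 & m & 0 & 0 & 0 & I & I \\
n-2 & 0 & 0 & 0 & 0 & e_1 & 0 & 0 & A^p & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ \dot{u}_1 \\ f_2^{ex} \\ \dot{v}_1 \\ f_2^{ey} \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ -{}_3 l_c^t \omega^2 \\ -{}_1 l_c^t \omega^2 \\ -{}_3 l_s^t \omega^2 \\ -{}_1 l_s^t \omega^2 \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

Note that this is a square matrix of order $5n$. There are $n + 2$ second derivatives
and $4n - 2$ force components. Unfortunately the order $2n - 1$ zero-one blocks
on the diagonal are not invertible, so the usual block inversion cannot be
followed.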

2 Case 1B: Two External Contacts, at Joint 1 and Joint 6

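The dynamic equations are:

\[
\dot{\theta} = \omega, \qquad
\begin{bmatrix} \dot{x}_1^p \\ \dot{y}_1^p \end{bmatrix} = \begin{bmatrix} u_1 \\ v_1 \end{bmatrix} = 0,
\]

\[
\begin{array}{r|ccccccc}
 & n & 1 & 1 & n & n-2 & n & n-2 \\ \hline
n & J & 0 & 0 & M^{px} & M^{dx} & M^{py} & M^{dy} \\
1 & {}_4 l_s^t & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
1 & -{}_4 l_c^t & 0 & 0 & 0^t & 0^t & 0^t & 0^t \\
n & J^y & 0 & 0 & I & I & 0 & 0 \\
n-2 & 0 & e_5 & 0 & A^p & A^d & 0 & 0 \\
n & -J^x & 0 & 0 & 0 & 0 & I & I \\
n-2 & 0 & 0 & e_5 & 0 & 0 & A^p & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ f_6^{ex} \\ f_6^{ey} \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ -{}_4 l_c^t \omega^2 \\ -{}_4 l_s^t \omega^2 \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

Note that this is a square matrix of order $5n - 2$. There are $n$ second derivatives
and $4n - 2$ force components.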

3 Case 2: No Contacts
Now the forces at the joints 1 and 7, $f_1^{px}$, $f_1^{py}$, $f_6^{dx}$, $f_6^{dy}$, are zero. The dynamic
equations are:

\[
\dot{\theta} = \omega, \qquad
\begin{bmatrix} \dot{x}_1^p \\ \dot{y}_1^p \end{bmatrix} = \begin{bmatrix} u_1 \\ v_1 \end{bmatrix},
\]

\[
\begin{array}{r|ccccccc}
 & n & 1 & n-1 & n-2 & 1 & n-1 & n-2 \\ \hline
n & J & 0 & M^{px} & M^{dx} & 0 & M^{py} & M^{dy} \\
n & J^y & m & I & I & 0 & 0 & 0 \\
n-2 & 0 & 0 & A^p & A^d & 0 & 0 & 0 \\
n & -J^x & 0 & 0 & 0 & m & I & I \\
n-2 & 0 & 0 & 0 & 0 & 0 & A^p & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ \dot{u}_1 \\ f^{px} \\ f^{dx} \\ \dot{v}_1 \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

This latter system has $5n - 4$ equations in $n + 2$ second derivatives and $4n - 6$
force components that are not necessarily zero. There are $2n + 4$ differential equations
in total. If the position, velocity and acceleration of the distal point of Segment 6 are
needed, an extra four differential equations can be added with the four position and
velocity variables, but these can be computed from the variables
already in the system; that is, they are dependent on the variables already in the
equations above.

4 Case 3A: One Contact at Swing Heel, Proximal Segment 2

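The dynamic equations are:

\[
\begin{bmatrix} \dot{\theta} \\ \dot{x}_1^p \\ \dot{y}_1^p \end{bmatrix} = \begin{bmatrix} \omega \\ u_1 \\ v_1 \end{bmatrix},
\]

\[
\begin{array}{r|ccccccccc}
 & n & 1 & 1 & n-1 & 1 & n-2 & n-1 & 1 & n-2 \\ \hline
n & J & 0 & 0 & M^{px} & 0 & M^{dx} & M^{py} & 0 & M^{dy} \\
1 & {}_5 l_s^t & -1 & 0 & 0^t & 0 & 0^t & 0^t & 0 & 0^t \\
1 & -{}_5 l_c^t & 0 & -1 & 0^t & 0 & 0^t & 0^t & 0 & 0^t \\
n & J^y & m & 0 & I & 0 & I & 0 & 0 & 0 \\
n-2 & 0 & 0 & 0 & A^p & e_1 & A^d & 0 & 0 & 0 \\
n & -J^x & 0 & m & 0 & 0 & 0 & I & 0 & I \\
n-2 & 0 & 0 & 0 & 0 & 0 & 0 & A^p & e_1 & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ \dot{u}_1 \\ \dot{v}_1 \\ f^{px} \\ f_2^{ex} \\ f^{dx} \\ f^{py} \\ f_2^{ey} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ -{}_5 l_c^t \omega^2 \\ -{}_5 l_s^t \omega^2 \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

Note that this is a square matrix of order $5n - 2$. There are $n + 2$ second order
derivatives and $4n - 4$ force components.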

5 Case 3B: One Contact at Swing Heel, Proximal Segment 6

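The case presented here is similar to the one before, except that here the swing heel at
proximal Segment 6 is considered. The dynamic equations are:

\[
\begin{bmatrix} \dot{\theta} \\ \dot{x}_1^p \\ \dot{y}_1^p \end{bmatrix} = \begin{bmatrix} \omega \\ u_1 \\ v_1 \end{bmatrix},
\]

\[
\begin{array}{r|ccccccccc}
 & n & 1 & 1 & n-1 & 1 & n-2 & n-1 & 1 & n-2 \\ \hline
n & J & 0 & 0 & M^{px} & 0 & M^{dx} & M^{py} & 0 & M^{dy} \\
1 & {}_4 l_s^t & -1 & 0 & 0^t & 0 & 0^t & 0^t & 0 & 0^t \\
1 & -{}_4 l_c^t & 0 & -1 & 0^t & 0 & 0^t & 0^t & 0 & 0^t \\
n & J^y & m & 0 & I & 0 & I & 0 & 0 & 0 \\
n-2 & 0 & 0 & 0 & A^p & e_1 & A^d & 0 & 0 & 0 \\
n & -J^x & 0 & m & 0 & 0 & 0 & I & 0 & I \\
n-2 & 0 & 0 & 0 & 0 & 0 & 0 & A^p & e_1 & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ \dot{u}_1 \\ \dot{v}_1 \\ f^{px} \\ f_6^{ex} \\ f^{dx} \\ f^{py} \\ f_6^{ey} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ -{}_4 l_c^t \omega^2 \\ -{}_4 l_s^t \omega^2 \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

Note that this case and the previous one are similar; the only difference lies in the
numbering of segments. This system, as before, has a square coefficient matrix of order $5n - 2$.
There are $n + 2$ second derivatives and $4n - 4$ force components.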

6 Case 4A: One Contact at Proximal Segment 1

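The dynamic equations are:

\[
\dot{\theta} = \omega,
\]

\[
\begin{array}{r|ccccc}
 & n & n & n-2 & n & n-2 \\ \hline
n & J & M^{px} & M^{dx} & M^{py} & M^{dy} \\
n & J^y & I & I & 0 & 0 \\
n-2 & 0 & A^p & A^d & 0 & 0 \\
n & -J^x & 0 & 0 & I & I \\
n-2 & 0 & 0 & 0 & A^p & A^d
\end{array}
\begin{bmatrix} \dot{\omega} \\ f^{px} \\ f^{dx} \\ f^{py} \\ f^{dy} \end{bmatrix}
=
\begin{bmatrix} T\tau \\ J^x \omega^2 \\ 0 \\ J^y \omega^2 + mg \\ 0 \end{bmatrix}.
\]

This gives $n$ second order derivative variables and $4n - 4$ force components in
the latter equation.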

7 Case 4B: One Contact at Distal Segment 6

The dynamic equations are:


2 3 2 3
P !
4 xP p 5 D 4 u1 5 ;
1
p
yP 1 v1
1 Analysing Human Walking Using Dynamic Optimisation 33

n 1 1 n1 n1 n1 n1 2 3 2 3


n ΠJ 0 0 Mpx Mdx Mpy Mdy  !P T
1 Π1 lts 1 0 0t 0t 0t 0t  6 uP 7 6  l t !2 7
6 17 6 1 c 7
6 vP 7 6 7
1 Π1 ltc 0 1 0 0 0 0t  6 1 7 D 6 1 ls !2 7 :
t t t t
6 px 7 6 x 2 7
n ΠJy m 0 I I 0 0  6f 7 6 J ! 7
6 dx 7 6 7
n2 Π0 0 0 A Ap d
0 0  6f 7 6 0 7
6 py 7 6 y 2 7
n ΠJ x 0 m 0 0 I I  4f 5 4 J ! C mg 5
n2 Π0 0 0 0 0 Ap Ad  f dy 0

The system has $5n - 2$ equations in $n + 2$ second derivatives and $4n - 4$ force
components. Note that Case 4A and Case 4B are similar, as they should be, the only
difference being in the numbering of segments.

References

Alexander RM (1996) Walking and running. Math Gaz 80(488):262–266


Alexander RM (2003) Modelling approaches in biomechanics. Philos Trans Biol Sci
358(1437):1429–1435
Anderson FC, Pandy MG (2001) Dynamic optimization of human walking. J Biomech Eng
123:381–390
Bessonnet G, Chessé S, Sardain P (2004) Optimal gait synthesis of a seven-link planar biped. Int J
Robot Res 23:1059–1073
Bullimore SR, Burn JF (2006) Consequences of forward translation of the point of force
application for the mechanics of running. J Theor Biol 238(1):211–219
Cavagna GA, Thys H, Zamboni A (1976) The sources of external work in level walking and
running. J Physiol 262(3):639–657
Chao EY, Rim K (1973) Application of optimization principles in determining the applied moments
in human leg joints during gait. J Biomech 6:479–510
Chow CK, Jacobson DH (1971) Studies of human locomotion via optimal programming. Math
Biosci 10:239–306
Dean GA (1965) An analysis of the energy expenditure in level and grade walking. Ergonomics
8:31–47
Hardt M, Kreutz-Delgado K, Helton JW (1999) Optimal biped walking with a complete dynamical
model. In: Proceedings of the 38th conference on decision and control, Phoenix, pp 2999–3004
Jennings LS, Fisher ME, Teo KL, Goh CJ (2000) MISER3 optimal control software (version 3):
theory and user manual. Centre of Applied Dynamics and Optimization, The University of
Western Australia
Koh MTH (2001) Optimal performance of the Yurchenko layout vault. PhD thesis, University of
Western Australia
Koh MTH, Jennings LS (2003) Dynamic optimization: inverse analysis for the Yurchenko layout
vault in women’s artistic gymnastics. J Biomech 36(8):1177–1183
Kuo A (1998) A least-squares approach to improving the precision of inverse dynamics computa-
tions. J Biomech Eng 120(1):148–159
Marshall RN (1985) Biomechanical performance criteria in normal and pathological walking. PhD
thesis, University of Western Australia
McGeer T (1988) Stability and control of two-dimensional biped walking. Technical report CSS-IS
TR 99-01, Centre for Systems Science, Simon Fraser University, Burnaby

Onyshko S, Winter DA (1980) A mathematical model for the dynamics of human locomotion. J
Biomech 13(4):361–368
Pandy MG (2001) Computer modeling and simulation of human movement. Annu Rev Biomed
Eng 3:245–273
Rehbock V, Teo KL, Jennings LS (1996) Optimal and suboptimal feedback controls for a class of
nonlinear systems. Comput Math Appl 31(6):71–86
Ren L, Jones RK, Howard D (2007) Predictive modelling of human walking over a complete gait
cycle. J Biomech 40(7):1567–1574
Said M, Jennings LS, Koh MT (2006) Computational models satisfying relative angle constraints
for 2-dimensional segmented bodies. Anziam J 47:541–554
Selles RW, Bussmann JBJ, Wagenaar RC, Stam HJ (2001) Comparing predictive validity of four
ballistic swing phase models of human walking. J Biomech 34(9):1171–1177
Ünver NF, Tümer ST, Özgören MK (2000) Simulation of human gait using computed torque
control. Technol Health Care 8(1):53–66
Winter DA (1990) Biomechanics and motor control of human movement, 2nd edn. Wiley, Hoboken
Winter DA (2009) Biomechanics and motor control of human movement, 4th edn. Wiley, Hoboken
Xiang Y, Arora JS, Rahmatalla S, Abdel-Malek K (2009) Optimization-based dynamic human
walking prediction: one step formulation. Int J Numer Methods Eng 79(6):667–695
Xiang Y, Arora JS, Abdel-Malek K (2010) Physics-based modeling and simulation of human
walking: a review of optimization-based and other approaches. Struct Multidiscip Optim
42(1):1–23
Yamaguchi GT, Zajac FE (1990) Restoring unassisted natural gait to paraplegics via functional
neuromuscular stimulation: a computer simulation study. IEEE Trans Biomed Eng 37(9):886–
902
Chapter 2
Rearrangement Optimization Problems Related
to a Class of Elliptic Boundary Value Problems

Chong Qiu, Yisheng Huang, and Yuying Zhou

Abstract In this paper, we investigate two optimization problems related to a class
of elliptic boundary value problems on smooth bounded domains of $\mathbb{R}^N$. These
optimization problems are formulated as minimum and maximum problems related
to the rearrangements of given functions. Under some suitable assumptions, we
show that both problems are solvable. Moreover, we obtain a representation result
of the optimal solution for the minimization problem and show that this solution is
unique and symmetric if the domain is a ball centered at the origin.

Keywords Existence and uniqueness • Optimization • Eigenvalue • Rearrangements

2.1 Introduction

Following Burton's fundamental work (Burton 1987, 1989) on the theory of rearrangements,
rearrangement optimization problems, which address questions such as
existence, uniqueness, symmetry and some qualitative properties of optimal
solutions, have been investigated by a number of authors; see for example (Burton
1989; Kurata et al. 2004; Del Pezzo and Bonder 2009; Zivari-Rezapour 2013; Cuccu
et al. 2006a,b, 2009; Marras 2010; Marras et al. 2013; Emamizadeh and
Zivari-Rezapour 2007; Emamizadeh and Fernandes 2008; Emamizadeh and Prajapat 2009;
Chanillo et al. 2000; Chanillo and Kenig 2008; Nycander and Emamizadeh 2003;
Anedda 2011; Qiu et al. 2015) and the references therein.

This work was supported by Natural Science Foundation of China (11471235, 11171247,
11371273) and GIP of Jiangsu Province (CXZZ13_0792).
C. Qiu • Y. Huang • Y. Zhou (✉)
Department of Mathematics, Soochow University, Suzhou 215006, People’s Republic of China
e-mail: yuyingz@suda.edu.cn

© Springer-Verlag Berlin Heidelberg 2015 35


H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_2

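Let $\Omega$ be a smooth bounded domain of $\mathbb{R}^N$ $(N \ge 3)$. We say that two measurable
functions $f(x)$ and $g(x)$ defined in $\Omega$ are rearrangements of each other if

$$\operatorname{meas}(\{x \in \Omega : g(x) \ge a\}) = \operatorname{meas}(\{x \in \Omega : f(x) \ge a\}), \quad \forall a \in \mathbb{R}.$$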
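
On a grid of equal-measure cells, the definition above reduces to a statement about sorted values, which makes it easy to experiment with. The following is a minimal sketch under that discretization assumption; the helper names `is_rearrangement` and `superlevel_measure` are illustrative and do not come from the paper.

```python
import numpy as np

def is_rearrangement(f_vals, g_vals, tol=1e-12):
    """Step functions on equal-measure cells are rearrangements of each
    other exactly when their sorted value lists coincide."""
    f_sorted = np.sort(np.asarray(f_vals, dtype=float))
    g_sorted = np.sort(np.asarray(g_vals, dtype=float))
    if f_sorted.shape != g_sorted.shape:
        return False
    return bool(np.allclose(f_sorted, g_sorted, atol=tol))

def superlevel_measure(vals, a, cell_size):
    """Discrete analogue of meas({x : f(x) >= a}) for cells of size cell_size."""
    return cell_size * int(np.count_nonzero(np.asarray(vals) >= a))

# Five cells of measure 0.2 each (hypothetical data).
f = np.array([0.0, 1.0, 3.0, 1.0, 2.0])
g = np.array([3.0, 2.0, 1.0, 1.0, 0.0])   # same values, different places
k = np.array([3.0, 2.0, 2.0, 1.0, 0.0])   # one value changed

print(is_rearrangement(f, g), is_rearrangement(f, k))
```

Comparing `superlevel_measure(f, a, 0.2)` with `superlevel_measure(g, a, 0.2)` for several levels `a` checks the defining identity directly.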

The rearrangement optimization problems related to the following eigenvalue
problem

$$(L_h)\qquad \begin{cases} -\Delta u = \lambda h(x) u & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases}$$

or boundary value problem

$$(P_f)\qquad \begin{cases} -\Delta u = f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases}$$

including their analogues involving the $p$-Laplacian, have been studied by several
authors, see for example (Burton 1987, 1989; Cuccu et al. 2006a, 2009; Marras
2010; Marras et al. 2013; Emamizadeh and Zivari-Rezapour 2007; Emamizadeh
and Fernandes 2008), where $0 < h \in L^\infty(\Omega)$ and $f \in L^q(\Omega)$ with $q > 2N/(N+2)$.
Recently, a rearrangement optimization problem related to the following
quasilinear elliptic boundary value problem has been considered in Qiu et al. (2015):

$$(P)\qquad \begin{cases} -\Delta_p u + h(x,u) = f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases}$$

where $1 < p < \infty$, $h(x,t) : \Omega \times \mathbb{R} \to \mathbb{R}$ is a Carathéodory function satisfying
suitable growth conditions, and $f \in L^q(\Omega)$ for some $1 \le q < \infty$. In Qiu et al.
(2015), we showed that the minimum and maximum optimization problems related
to $(P)$ are solvable in both cases $1 < p \le N$ and $p > N$, which extended the
corresponding results in Burton (1987, 1989) with $p = 2$ and Cuccu et al. (2006a)
and Marras (2010) with $1 < p < \infty$.
In this paper, we will investigate two rearrangement optimization problems
related to the following elliptic boundary value problem:

$$(P_{h,f})\qquad \begin{cases} -\Delta u - \lambda h(x) u = f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases}$$

$(P_{h,f})$ is a model of the deformation problem for an elastic membrane made
of materials with prescribed quantities $h$, subject to a fixed vertical force
$f$. The usual goal is to identify a force function selected from $R(f)$ in such a way
that the total displacement of the membrane is as small as possible. More precisely,
let $I : H_0^1(\Omega) \to \mathbb{R}$ be the energy functional corresponding to the problem $(P_{h,f})$,
which is given by

$$I(u) = \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \frac{\lambda}{2}\int_\Omega h u^2\,dx - \int_\Omega f u\,dx, \tag{2.1}$$

and let $0 < h_0 \in L^\infty(\Omega)$ and $f_0 \in L^q(\Omega)$ with $q > 2N/(N+2)$ be two given
functions. We will study the following minimum and maximum optimization
problems:

$(\mathrm{Opt}_m)$ Find $\hat h \in R(h_0)$, $\hat f \in R(f_0)$ such that $I(u_{\hat h,\hat f}) = \inf_{h \in R(h_0),\, f \in R(f_0)} I(u_{h,f})$

and

$(\mathrm{Opt}_M)$ Find $\bar f \in R(f_0)$ such that $I(u_{h_0,\bar f}) = \sup_{f \in R(f_0)} I(u_{h_0,f})$,

where $R(h_0)$ and $R(f_0)$ denote the sets of all rearrangements of $h_0$ and
$f_0$, respectively, and $u_{h,f}$ ($u_{h_0,f}$) is the unique solution of the problem $(P_{h,f})$ ($(P_{h_0,f})$); the existence
and uniqueness of $u_{h,f}$ ($u_{h_0,f}$) are obtained in Proposition 2.1 of Sect. 2.3. We
will show that there exists $\bar\lambda > 0$ such that, for all $\lambda \in (0, \bar\lambda)$, both problems
$(\mathrm{Opt}_m)$ and $(\mathrm{Opt}_M)$ are solvable.
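
To make the state equation and its energy concrete, here is a minimal numerical sketch. It is only an illustration under stated assumptions: it uses a one-dimensional analogue on the interval (0, 1) with finite differences (the paper works in dimension N ≥ 3), and the function names `stiffness_matrix`, `solve_state` and `energy`, as well as the data `h`, `f` and `lam`, are hypothetical.

```python
import numpy as np

def stiffness_matrix(n, length=1.0):
    """Finite-difference matrix for -u'' on (0, length) with u = 0 at both ends."""
    dx = length / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / dx**2
    return K, dx

def solve_state(h, f, lam, length=1.0):
    """Solve the discrete analogue of -u'' - lam*h*u = f with zero boundary values."""
    K, dx = stiffness_matrix(len(h), length)
    u = np.linalg.solve(K - lam * np.diag(h), f)
    return u, K, dx

def energy(u, h, f, lam, K, dx):
    """Discrete version of I(u) = 1/2 int|u'|^2 - lam/2 int h u^2 - int f u."""
    grad_term = 0.5 * dx * (u @ K @ u)
    h_term = 0.5 * lam * dx * np.sum(h * u**2)
    f_term = dx * np.sum(f * u)
    return grad_term - h_term - f_term

n = 50
x = np.linspace(0.0, 1.0, n + 2)[1:-1]      # interior grid points
h = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)     # positive weight (assumed data)
f = np.ones(n)                              # positive force (assumed data)
lam = 5.0                                   # chosen below the first eigenvalue for this h
u_opt, K, dx = solve_state(h, f, lam)
I_opt = energy(u_opt, h, f, lam, K, dx)
```

Because the discrete solution satisfies `(K - lam*diag(h)) u = f`, its energy equals `-0.5 * dx * f @ u`, which gives a quick consistency check; perturbing `u_opt` should only increase the energy as long as `lam` stays below the first eigenvalue.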
We note that the optimization problems considered in all the papers mentioned
above are constrained by a rearrangement set generated by just one fixed
function. The minimum optimization problem considered here, however, is
constrained by two rearrangement sets generated by two fixed independent functions.
Moreover, Problem $(P_{h,f})$ contains $(L_h)$ and $(P_f)$ as special cases, and the cost
functional used in our problem is more complicated than those used in the above two
problems; our case therefore needs special handling. We point out that an essential
assumption in Qiu et al. (2015) is that $h(x,u)$ is non-decreasing with respect to
the second variable $u$ for almost all $x \in \Omega$. In the present paper, since we assume
that $0 < \lambda$ and $0 < h(x)$ for a.e. $x \in \Omega$, the term $-\lambda h(x)u$ in the problem $(P_{h,f})$
is decreasing with respect to $u$ for almost all $x \in \Omega$, which violates
the essential assumption of Qiu et al. (2015). So the conditions and
results for the maximization problem $(\mathrm{Opt}_M)$ here are different from those obtained
in Qiu et al. (2015). To the best of our knowledge, the results obtained in this paper
are new.
This paper is organized as follows. In Sect. 2.2, we give some preliminaries. In
Sect. 2.3, we show that the problem $(P_{h,f})$ has a unique solution. Section 2.4 is
devoted to a detailed discussion of the minimization problem $(\mathrm{Opt}_m)$. First, we prove that
the minimization problem $(\mathrm{Opt}_m)$ is solvable in the case $0 < \lambda < \bar\lambda$; then we
obtain a representation result of the optimal solution for the minimization problem
and show that the problem $(\mathrm{Opt}_m)$ has a unique solution with some symmetry
properties if $\Omega$ is a ball centered at the origin. In Sect. 2.5, we show that the
maximization problem $(\mathrm{Opt}_M)$ is solvable.

2.2 Preliminaries

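We denote by $L^r(\Omega)$ $(1 \le r \le \infty)$ and $H_0^1(\Omega)$ the usual Lebesgue and Sobolev spaces endowed
with the norms $\|u\|_{L^r} = \left(\int_\Omega |u|^r\,dx\right)^{1/r}$ if $1 \le r < \infty$, $\|u\|_\infty = \operatorname{ess\,sup}_{x \in \Omega} |u(x)|$ and
$\|u\| = \left(\int_\Omega |\nabla u|^2\,dx\right)^{1/2}$, respectively. Throughout the paper $C$ will denote a positive
(possibly different) constant.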
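Definition 2.1. By a solution $u$ of the problem $(P_{h,f})$ we mean a function $u \in H_0^1(\Omega)$
satisfying

$$\int_\Omega \left(\nabla u \cdot \nabla v - \lambda h u v - f v\right) dx = 0, \quad \forall v \in H_0^1(\Omega).$$

Let $I$ be given in (2.1). If $I \in C^1(H_0^1(\Omega), \mathbb{R})$, then we have

$$I'(u)v = \int_\Omega \left(\nabla u \cdot \nabla v - \lambda h u v - f v\right) dx, \quad \forall v \in H_0^1(\Omega).$$

In this case, $u \in H_0^1(\Omega)$ is a weak solution if and only if $I'(u)v = 0$, $\forall v \in H_0^1(\Omega)$.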
The following lemmas will be used throughout the proofs of our main results.
Lemma 2.1 (Burton 1989, Lemma 2.1). Assume that $1 \le r < \infty$ and let $f \in L^r(\Omega)$
be given; then for any $g \in R(f)$ we have $g \in L^r(\Omega)$ and $\|g\|_{L^r} = \|f\|_{L^r}$.
Lemma 2.2 (Burton 1989, Lemma 2.2). Assume that $1 \le r < \infty$ and let $f \in L^r(\Omega)$
be given. Denote by $\overline{R(f)}^w$ the weak closure of $R(f)$ in $L^r(\Omega)$; then $\overline{R(f)}^w$ is convex
and weakly compact in $L^r(\Omega)$.
Lemma 2.3 (Burton 1989, Lemma 2.9 or Cuccu et al. 2009, Lemma 2.1). Let
$f, g : \Omega \to \mathbb{R}$ be measurable functions and suppose that for each $t \in \mathbb{R}$, the level
set of $g$ at $t$, i.e., $\{x \in \Omega : g(x) = t\}$, has zero measure. Then there exists an
increasing (decreasing) function $\varphi$ such that $\varphi \circ g$ is a rearrangement of $f$, where
$\varphi \circ g$ denotes the composite function defined by

$$(\varphi \circ g)(x) = \varphi(g(x)), \quad \forall x \in \Omega.$$

Lemma 2.4 (Burton 1989, Lemma 2.4 or Cuccu et al. 2009, Lemma 2.2). For
any $1 \le r < \infty$ define $r' = \frac{r}{r-1}$ if $r > 1$ and $r' = \infty$ if $r = 1$. Let $f \in L^r(\Omega)$ and
$g \in L^{r'}(\Omega)$. Suppose that there exists an increasing (decreasing) function $\varphi : \mathbb{R} \to \mathbb{R}$
such that $\varphi \circ g \in R(f)$. Then $\varphi \circ g$ is the unique maximizer (minimizer) of the
linear functional $\int_\Omega h g\,dx$, relative to $h \in \overline{R(f)}^w$.
Lemma 2.5 (Burton 1987, Theorem 5). For any $1 \le r < \infty$ define $r' = \frac{r}{r-1}$ if
$r > 1$ and $r' = \infty$ if $r = 1$. Let $f \in L^r(\Omega)$ and $g \in L^{r'}(\Omega)$. Suppose that the linear
functional $L(l) = \int_\Omega l g\,dx$ has a unique maximizer (minimizer) $\hat f$ relative to $R(f)$;
then there exists an increasing (decreasing) function $\varphi : \mathbb{R} \to \mathbb{R}$ such that $\varphi \circ g = \hat f$.

Lemma 2.6 (Emamizadeh and Prajapat 2009, Lemma 2.3). Suppose that $f \in L^r(\Omega)$
and $g \in L^{r'}(\Omega)$; then there exists $\hat f \in R(f)$ which maximizes (minimizes) the
linear functional $\int_\Omega h g\,dx$, relative to $h \in \overline{R(f)}^w$.
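
In a discrete setting with equal-measure cells, the maximizer in Lemmas 2.4 to 2.6 can be built by sorting: the largest values of f are placed where g is largest. The sketch below illustrates that idea and checks it against brute force on a tiny example; the function name `maximizing_rearrangement` and the sample data are hypothetical, and the discrete model is an assumption made for illustration only.

```python
import numpy as np
from itertools import permutations

def maximizing_rearrangement(f_vals, g_vals):
    """Rearrange f_vals so that its order matches the order of g_vals.

    The result is a permutation of f_vals that maximizes sum(l * g) over
    all permutations l of f_vals (discrete rearrangement inequality).
    """
    f_vals = np.asarray(f_vals, dtype=float)
    g_vals = np.asarray(g_vals, dtype=float)
    order = np.argsort(g_vals, kind="stable")   # cells sorted by increasing g
    out = np.empty_like(f_vals)
    out[order] = np.sort(f_vals)                # increasing f onto increasing g
    return out

f0 = np.array([0.0, 1.0, 3.0, 1.0, 2.0])
g = np.array([0.3, 0.9, 0.1, 0.5, 0.7])

f_hat = maximizing_rearrangement(f0, g)
# Brute force over all 5! permutations, feasible only for tiny examples.
brute_max = max(float(np.dot(p, g)) for p in permutations(f0))
```

When the values of `g` are pairwise distinct, `f_hat` is a non-decreasing function of `g` cell by cell, which mirrors the composite form $\varphi \circ g$ in the lemmas. Reversing the sorted values gives the corresponding minimizer.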


Lemma 2.7 (Leoni 2009, Theorem 16.9). Suppose that $B$ is a ball centered at the
origin, then

$$\int_B f g\,dx \le \int_B f^* g^*\,dx$$

for any non-negative measurable functions $f$ and $g$, where $f^*$ and $g^*$ are respectively
the Schwarz symmetric decreasing rearrangements of $f$ and $g$, defined as follows.
Definition 2.2 (Leoni 2009, Definition 16.5). Let $f : \Omega \to [0, \infty)$ be a measurable
function. The Schwarz symmetric decreasing rearrangement of $f$ is the function
$f^* : B(0, r) \to [0, \infty)$, defined by

$$f^*(x) = \inf\left\{ t \in [0, \infty) : \varrho_f(t) \le \omega_N |x|^N \right\}, \quad \forall x \in B(0, r),$$

where $\omega_N$ denotes the volume of the unit ball in $N$ dimensions, $r := (\operatorname{meas}(\Omega)/\omega_N)^{1/N}$
and $\varrho_f : \mathbb{R} \to [0, \infty)$ is the distribution function of $f$ defined by

$$\varrho_f(t) = \operatorname{meas}(\{x \in \Omega : f(x) > t\}).$$

It is well known that $f^* = g^*$ for each $g \in R(f)$.


Lemma 2.8 (Leoni 2009, Theorem 16.10). Suppose that $B$ is a ball centered at
the origin, $u : B \to [0, \infty)$ is a measurable function and $\Psi : [0, \infty) \to [0, \infty)$ is a
Borel function, then

$$\int_B \Psi \circ u^*\,dx \le \int_B \Psi \circ u\,dx.$$

The following result can be deduced from Lemmas 2.3 and 3.2 and Theorem 1.1 of
Brothers and Ziemer (1988).
Lemma 2.9. Suppose that $B$ is a ball centered at the origin. If $u \in W_0^{1,p}(B)$ with
$1 < p < \infty$ and $u \ge 0$, then $u^{-1}(\alpha, \infty)$ is a translation of $(u^*)^{-1}(\alpha, \infty)$ for every
$\alpha \in [0, \operatorname{ess\,sup}_{x \in B} u(x))$ and

$$\int_B |\nabla u|^p\,dx \ge \int_B |\nabla u^*|^p\,dx. \tag{2.2}$$

If equality holds in (2.2) and the set

$$\left\{ x \in B : \nabla u(x) = 0,\ 0 < u(x) < \operatorname{ess\,sup}_{y \in B} u(y) \right\}$$

has zero measure, then $u = u^*$.


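It is well known that the first eigenvalue $\lambda_1(h)$ of the problem $(L_h)$ can be
characterized by

$$\lambda_1(h) = \inf_{v \in H_0^1(\Omega),\, v \ne 0} \frac{\int_\Omega |\nabla v|^2\,dx}{\int_\Omega h v^2\,dx}. \tag{2.3}$$

By Cuccu et al. (2009, Theorem 3.1), if $0 < h_0(x) \in L^\infty(\Omega)$, then there exists
$\bar h \in R(h_0)$ (the set of all rearrangements of $h_0$) such that

$$0 < \bar\lambda := \lambda_1(\bar h) = \inf_{h \in R(h_0)} \lambda_1(h) = \inf_{h \in R(h_0)}\ \inf_{v \in H_0^1(\Omega),\, v \ne 0} \frac{\int_\Omega |\nabla v|^2\,dx}{\int_\Omega h v^2\,dx}. \tag{2.4}$$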
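
The quantities in (2.3) and (2.4) can be explored numerically in a small discrete model. The sketch below is an illustration only: it assumes a one-dimensional finite-difference analogue on (0, 1), where the Rayleigh quotient becomes a generalized matrix eigenvalue problem and the infimum over rearrangements becomes a minimum over permutations of the cell values. The names `first_eigenvalue` and `lambda_bar` and the data `h0` are hypothetical.

```python
import numpy as np
from itertools import permutations

def first_eigenvalue(h, length=1.0):
    """Smallest lambda with K v = lambda * diag(h) v, K the discrete -d^2/dx^2."""
    h = np.asarray(h, dtype=float)
    n = h.size
    dx = length / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / dx**2
    # Symmetrize: with w = diag(h)^{1/2} v the problem becomes A w = lambda w.
    s = 1.0 / np.sqrt(h)
    A = s[:, None] * K * s[None, :]
    return float(np.linalg.eigvalsh(A)[0])

def lambda_bar(h0):
    """Minimum of the first eigenvalue over all distinct permutations of h0."""
    distinct = set(permutations(tuple(float(v) for v in h0)))
    return min(first_eigenvalue(p) for p in distinct)

h0 = [2.0, 1.0, 1.0, 0.5, 0.5, 0.5]   # positive cell values (assumed data)
lam_bar = lambda_bar(h0)
```

Any `lam` chosen in `(0, lam_bar)` then lies below the first eigenvalue for every permutation of `h0`, which is the discrete counterpart of the condition $0 < \lambda < \bar\lambda$ used later. The brute-force search grows factorially and is meant for tiny examples only.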

2.3 Existence and Uniqueness for the Solution of the Problem $(P_{h,f})$

In this section, we establish the existence and uniqueness of the solution of the
problem $(P_{h,f})$.
Proposition 2.1. Fix $0 < h(x) \in L^\infty(\Omega)$ and $f \in L^q(\Omega)$ with $q > 2N/(N+2)$, and let
$0 < \lambda < \lambda_1(h)$, where $\lambda_1(h)$ is the first eigenvalue of the problem $(L_h)$. Then the
problem $(P_{h,f})$ has a unique solution $u_{h,f} \in H_0^1(\Omega)$ and $I(u_{h,f}) = \inf_{v \in H_0^1(\Omega)} I(v)$.
Moreover, if in addition $f(x) > 0$ a.e. $x \in \Omega$, then $u_{h,f} > 0$.
Proof. First, we show that the problem $(P_{h,f})$ has a solution.
By the Hölder inequality and the Sobolev embedding inequality, we have

$$\left| \int_\Omega f u\,dx \right| \le \|f\|_{L^q} \|u\|_{L^{q'}} \le C \|u\| \tag{2.5}$$

for all $u \in H_0^1(\Omega)$, since $1 < q' := q/(q-1) < 2^*$, where $2^* := 2N/(N-2)$.
Hence we deduce from (2.1), (2.3) and (2.5) that

$$I(u) \ge \frac{1}{2}\left(1 - \frac{\lambda}{\lambda_1(h)}\right)\|u\|^2 - C\|u\| \to \infty$$

as $\|u\| \to \infty$, which shows that the functional $I$ is coercive.
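
For readers who want the intermediate step, the coercivity estimate can be spelled out as follows. This is a sketch of the standard argument that uses only (2.1), (2.3) and (2.5), with $C$ the constant from (2.5); here $\lambda$ denotes the parameter of $(P_{h,f})$ and $\lambda_1(h)$ the first eigenvalue from (2.3).

```latex
\begin{align*}
\lambda \int_\Omega h u^2\,dx
  &\le \frac{\lambda}{\lambda_1(h)} \int_\Omega |\nabla u|^2\,dx
   = \frac{\lambda}{\lambda_1(h)}\,\|u\|^2
  && \text{by (2.3)},\\
\Bigl|\int_\Omega f u\,dx\Bigr|
  &\le C\,\|u\|
  && \text{by (2.5)},\\
I(u)
  &= \frac12\,\|u\|^2 - \frac{\lambda}{2}\int_\Omega h u^2\,dx - \int_\Omega f u\,dx
   \ge \frac12\Bigl(1 - \frac{\lambda}{\lambda_1(h)}\Bigr)\|u\|^2 - C\,\|u\|.
\end{align*}
```

Since $0 < \lambda < \lambda_1(h)$, the coefficient of $\|u\|^2$ is positive, so the quadratic term dominates the linear one as $\|u\| \to \infty$.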


It is easy to see that the functional $I$ is weakly lower semi-continuous (which
we will denote by w.l.s.c. for short). Hence the functional $I$ has a minimizer $u_{h,f} \in H_0^1(\Omega)$
with $I(u_{h,f}) = \inf_{v \in H_0^1(\Omega)} I(v)$. Using a standard argument (cf. Willem 1996,
Lemma 2.16), we can easily show that $I \in C^1(H_0^1(\Omega), \mathbb{R})$; therefore $u_{h,f}$ is a solution
of the problem $(P_{h,f})$ satisfying

$$I'(u_{h,f})v = \int_\Omega \left(\nabla u_{h,f} \cdot \nabla v - \lambda h u_{h,f} v - f v\right) dx = 0, \quad \forall v \in H_0^1(\Omega). \tag{2.6}$$

Next, we show that $u_{h,f}$ is the unique solution of the problem $(P_{h,f})$.
Assume that $w_{h,f} \in H_0^1(\Omega)$ is another solution of the problem $(P_{h,f})$ and $u_{h,f} \ne w_{h,f}$;
then

$$\|u_{h,f} - w_{h,f}\| > 0.$$

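By $0 < h(x) \in L^\infty(\Omega)$, we have

$$\int_\Omega h (w_{h,f} - u_{h,f})^2\,dx > 0. \tag{2.7}$$

From (2.6) and Definition 2.1 we get that for every $v \in H_0^1(\Omega)$,

$$\int_\Omega \left(\nabla u_{h,f} \cdot \nabla v - \lambda h u_{h,f} v\right) dx = \int_\Omega f v\,dx,$$

$$\int_\Omega \left(\nabla w_{h,f} \cdot \nabla v - \lambda h w_{h,f} v\right) dx = \int_\Omega f v\,dx.$$

Therefore,

$$\int_\Omega (\lambda h w_{h,f} - \lambda h u_{h,f}) v\,dx = \int_\Omega (\nabla w_{h,f} - \nabla u_{h,f}) \cdot \nabla v\,dx, \quad \forall v \in H_0^1(\Omega). \tag{2.8}$$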

Let $v = w_{h,f} - u_{h,f}$. Note that $0 < \lambda < \lambda_1(h)$; then from (2.3), (2.7) and (2.8) we
obtain

$$\begin{aligned}
\int_\Omega |\nabla w_{h,f} - \nabla u_{h,f}|^2\,dx
&= \lambda \int_\Omega h (w_{h,f} - u_{h,f})^2\,dx \\
&< \lambda_1(h) \int_\Omega h (w_{h,f} - u_{h,f})^2\,dx \\
&\le \int_\Omega |\nabla w_{h,f} - \nabla u_{h,f}|^2\,dx,
\end{aligned}$$

a contradiction. Therefore we have proved that $u_{h,f}$ is the unique solution of the
problem $(P_{h,f})$.

Finally, if $f(x) > 0$ then we can easily check that $I(|u_{h,f}|) \le I(u_{h,f})$, which shows
that $|u_{h,f}|$ is also a minimizer of $I$ and thus a solution of the problem $(P_{h,f})$. Then
$u_{h,f} = |u_{h,f}| \ge 0$ by the uniqueness of the solution. Since

$$-\Delta u_{h,f}(x) = f(x) + \lambda h(x) u_{h,f}(x) > 0, \quad \text{a.e. } x \in \Omega,$$

we have $u_{h,f}(x) > 0$, a.e. $x \in \Omega$ (cf. Vázquez 1984, Theorem 5). □
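Remark 2.1. In the case of $\lambda = \lambda_1(h)$, if $u_{h,f} \in H_0^1(\Omega)$ is a solution of the problem
$(P_{h,f})$, then for any $t \in \mathbb{R}$ and $v \in H_0^1(\Omega)$,

$$\int_\Omega \left(\nabla(u_{h,f} + t\varphi) \cdot \nabla v - \lambda h (u_{h,f} + t\varphi) v - f v\right) dx = \int_\Omega \left(\nabla u_{h,f} \cdot \nabla v - \lambda h u_{h,f} v - f v\right) dx = 0,$$

where $\varphi$ is the eigenfunction of $(L_h)$. That is, $u_{h,f} + t\varphi$ is also a solution of the problem
$(P_{h,f})$. Therefore, in order to obtain a unique solution of the problem $(P_{h,f})$, we
only consider the case $0 < \lambda < \lambda_1(h)$ in the following.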

2.4 Existence of Solution of the Problem $(\mathrm{Opt}_m)$

Theorem 2.1. Suppose that $0 < h_0(x) \in L^\infty(\Omega)$, $f_0 \in L^q(\Omega)$ with $q > 2N/(N+2)$,
and $0 < \lambda < \bar\lambda$, where $\bar\lambda$ is given by (2.4). Then there exist $\hat h \in R(h_0)$, $\hat f \in R(f_0)$
which solve the problem $(\mathrm{Opt}_m)$, i.e.,

$$I(\hat u) = \inf_{h \in R(h_0),\, f \in R(f_0)} I(u_{h,f}),$$

where $\hat u = u_{\hat h,\hat f}$ is the unique solution of $(P_{\hat h,\hat f})$.


Proof. Clearly, $\lambda < \lambda_1(h)$, $\forall h \in R(h_0)$. By Proposition 2.1, the problem $(P_{h,f})$
has a unique solution $u_{h,f} \in H_0^1(\Omega)$. Let

$$A = \inf_{h \in R(h_0),\, f \in R(f_0)} I(u_{h,f});$$

then $A$ is well-defined. Indeed, for each $h \in R(h_0)$, $f \in R(f_0)$, from (2.3) we
have

$$\bar\lambda \int_\Omega h u_{h,f}^2\,dx \le \lambda_1(h) \int_\Omega h u_{h,f}^2\,dx \le \int_\Omega |\nabla u_{h,f}|^2\,dx,$$

and then

$$\begin{aligned}
I(u_{h,f}) &= \frac{1}{2}\int_\Omega |\nabla u_{h,f}|^2\,dx - \frac{\lambda}{2}\int_\Omega h u_{h,f}^2\,dx - \int_\Omega f u_{h,f}\,dx \\
&\ge \frac{1}{2}\left(1 - \frac{\lambda}{\bar\lambda}\right)\|u_{h,f}\|^2 - C\|f\|_{L^q}\|u_{h,f}\|.
\end{aligned} \tag{2.9}$$

Since $\|f\|_{L^q} = \|f_0\|_{L^q}$ by Lemma 2.1, we deduce that $A$ must be finite.


Let $\{(h_i, f_i)\}$ be a minimizing sequence, i.e.,

$$h_i \in R(h_0) \text{ and } f_i \in R(f_0), \quad \forall i \in \mathbb{N},$$

and

$$A = \lim_{i \to \infty} I(u_i),$$

where $u_i = u_{h_i,f_i}$. It follows from (2.9) that $\{u_i\}$ is bounded in $H_0^1(\Omega)$, so it
has a subsequence (still denoted $\{u_i\}$) which converges weakly to some $u \in H_0^1(\Omega)$
and strongly to $u$ in $L^{q'}(\Omega)$ with $1 < q' = q/(q-1) < 2^*$.
Since $\|f_i\|_{L^q} \equiv \|f_0\|_{L^q}$, $\{f_i\}$ contains a subsequence (still denoted $\{f_i\}$)
converging weakly to some $\bar f \in \overline{R(f_0)}^w$, the weak closure of $R(f_0)$ in $L^q(\Omega)$.
Then

$$\left| \int_\Omega (f_i - \bar f) u\,dx \right| \to 0 \quad \text{as } i \to \infty$$

since $u \in L^{q'}(\Omega)$. It follows from the Hölder inequality that

$$\begin{aligned}
\left| \int_\Omega (f_i u_i - \bar f u)\,dx \right|
&\le \left| \int_\Omega f_i (u_i - u)\,dx \right| + \left| \int_\Omega (f_i - \bar f) u\,dx \right| \\
&\le \|f_i\|_{L^q} \|u_i - u\|_{L^{q'}} + \left| \int_\Omega (f_i - \bar f) u\,dx \right| \to 0
\end{aligned} \tag{2.10}$$

as $i \to \infty$. Since $\|h_i\|_\infty \equiv \|h_0\|_\infty$, $\{h_i\}$ is bounded in $L^\infty(\Omega)$, so it must contain
a subsequence (still denoted $\{h_i\}$) converging weakly to some $\bar h \in \overline{R(h_0)}^w$,
the weak closure of $R(h_0)$ in $L^r(\Omega)$ $(r > N/2)$. Similarly to (2.10) we have

$$\lim_{i \to \infty} \int_\Omega (h_i u_i^2 - \bar h u^2)\,dx = 0. \tag{2.11}$$

By (2.10) and (2.11) and the weak lower semi-continuity of the norm in $H_0^1(\Omega)$,
we obtain that

$$A = \lim_{i \to \infty} I(u_i) \ge \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \frac{\lambda}{2}\int_\Omega \bar h u^2\,dx - \int_\Omega \bar f u\,dx. \tag{2.12}$$


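From Lemma 2.6 we infer the existence of $\hat f \in R(f_0)$ which maximizes
the linear functional $\int_\Omega l u\,dx$, relative to $l \in \overline{R(f_0)}^w$. As a consequence,

$$\int_\Omega \bar f u\,dx \le \int_\Omega \hat f u\,dx.$$

Similarly, there exists $\hat h \in R(h_0)$ which maximizes the linear functional
$\int_\Omega l u^2\,dx$, relative to $l \in \overline{R(h_0)}^w$, so that

$$\int_\Omega \bar h u^2\,dx \le \int_\Omega \hat h u^2\,dx.$$

Combining with (2.12), we get

$$A \ge \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \frac{\lambda}{2}\int_\Omega \hat h u^2\,dx - \int_\Omega \hat f u\,dx. \tag{2.13}$$

By Proposition 2.1,

$$\begin{aligned}
I(\hat u) &= \inf_{v \in H_0^1(\Omega)} \int_\Omega \left( \frac{1}{2}|\nabla v|^2 - \frac{\lambda}{2}\hat h v^2 - \hat f v \right) dx \\
&\le \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \frac{\lambda}{2}\int_\Omega \hat h u^2\,dx - \int_\Omega \hat f u\,dx.
\end{aligned} \tag{2.14}$$

It follows from (2.13) and (2.14) that $I(\hat u) \le A$.

On the other hand, since $A = \inf_{h \in R(h_0),\, f \in R(f_0)} I(u_{h,f})$, we must have
$A \le I(\hat u)$. Hence $A = I(\hat u)$. □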
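
For very small discrete examples, the minimization problem can be explored by exhaustive search. The sketch below is an illustration under stated assumptions, not the method of the proof: it uses a one-dimensional finite-difference analogue on (0, 1), treats rearrangements as permutations of cell values, and relies on the identity that the energy at the discrete solution equals minus one half of the pairing of the force with the solution. The names `optimal_energy` and `brute_force_opt_m` and the data are hypothetical.

```python
import numpy as np
from itertools import permutations

def optimal_energy(h, f, lam, length=1.0):
    """Energy I(u_{h,f}) for the 1D finite-difference analogue of (P_{h,f}).

    At the solution (K - lam*diag(h)) u = f, the discrete energy
    dx*(0.5*u@K@u - 0.5*lam*sum(h*u^2) - f@u) simplifies to -0.5*dx*f@u.
    """
    h = np.asarray(h, dtype=float)
    f = np.asarray(f, dtype=float)
    n = h.size
    dx = length / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / dx**2
    u = np.linalg.solve(K - lam * np.diag(h), f)
    return -0.5 * dx * float(f @ u)

def brute_force_opt_m(h0, f0, lam):
    """Search all distinct permutations of h0 and f0 for the smallest energy."""
    best = None
    for h in set(permutations(h0)):
        for f in set(permutations(f0)):
            val = optimal_energy(h, f, lam)
            if best is None or val < best[0]:
                best = (val, h, f)
    return best

h0 = (2.0, 1.5, 1.0, 0.5, 0.5)   # assumed data
f0 = (3.0, 2.0, 1.0, 1.0, 0.5)   # assumed data
lam = 2.0                        # small enough to stay below the first eigenvalue here
A, h_hat, f_hat = brute_force_opt_m(h0, f0, lam)
```

Printing `h_hat` and `f_hat` shows which arrangement the search selects; one can then compare it informally with the symmetry statements proved for balls later in this section. The search is factorial in the number of cells and is meant only for toy sizes.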
We now obtain a representation result of the optimal solution $(\hat h, \hat f)$ for the
problem $(\mathrm{Opt}_m)$.
Theorem 2.2. Under the assumptions of Theorem 2.1, suppose moreover that
$\operatorname{meas}(\{x \in \Omega : f_0(x) = 0\}) = 0$. Then there exist increasing functions $\psi$ and $\varphi$
such that

$$\hat h = \psi(\hat u^2) \ \text{a.e. in } \Omega, \qquad \hat f = \varphi(\hat u) \ \text{a.e. in } \Omega, \tag{2.15}$$

where $\hat u = u_{\hat h,\hat f}$ is the solution of $(P_{\hat h,\hat f})$.


Proof. Since $\hat u$ is the solution of $(P_{\hat h,\hat f})$, $I(\hat u) \le I(u_{h,\hat f})$, $\forall h \in R(h_0)$. Therefore

$$\frac{1}{2}\int_\Omega |\nabla \hat u|^2\,dx - \frac{\lambda}{2}\int_\Omega \hat h \hat u^2\,dx - \int_\Omega \hat f \hat u\,dx \le \frac{1}{2}\int_\Omega |\nabla \hat u|^2\,dx - \frac{\lambda}{2}\int_\Omega h \hat u^2\,dx - \int_\Omega \hat f \hat u\,dx,$$

i.e.,

$$\int_\Omega h \hat u^2\,dx \le \int_\Omega \hat h \hat u^2\,dx, \quad \forall h \in R(h_0).$$

So $\hat h$ is a maximizer of the linear functional $L(h) := \int_\Omega h \hat u^2\,dx$, relative to
$h \in R(h_0)$.
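We claim that $\hat h$ is the unique maximizer of $L(h)$. If not, suppose that $\bar h$ is another
maximizer of $L(h)$. Then

$$\int_\Omega \hat h \hat u^2\,dx = \int_\Omega \bar h \hat u^2\,dx.$$

Thus

$$\begin{aligned}
I(\hat u) &= \frac{1}{2}\int_\Omega |\nabla \hat u|^2\,dx - \frac{\lambda}{2}\int_\Omega \hat h \hat u^2\,dx - \int_\Omega \hat f \hat u\,dx \\
&= \frac{1}{2}\int_\Omega |\nabla \hat u|^2\,dx - \frac{\lambda}{2}\int_\Omega \bar h \hat u^2\,dx - \int_\Omega \hat f \hat u\,dx \\
&\ge I(u_{\bar h,\hat f}) \\
&\ge I(\hat u).
\end{aligned}$$

So that

$$\frac{1}{2}\int_\Omega |\nabla \hat u|^2\,dx - \frac{\lambda}{2}\int_\Omega \bar h \hat u^2\,dx - \int_\Omega \hat f \hat u\,dx = I(u_{\bar h,\hat f}).$$
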
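By the uniqueness of the minimizer of the functional $I$, we obtain $\hat u = u_{\bar h,\hat f}$. Then

$$\int_\Omega \nabla u_{\bar h,\hat f} \cdot \nabla v\,dx - \lambda \int_\Omega \bar h u_{\bar h,\hat f} v\,dx = \int_\Omega \hat f v\,dx,$$

$$\int_\Omega \nabla \hat u \cdot \nabla v\,dx - \lambda \int_\Omega \hat h \hat u v\,dx = \int_\Omega \hat f v\,dx, \quad \forall v \in H_0^1(\Omega).$$

So that

$$\int_\Omega (\bar h - \hat h) \hat u v\,dx = 0, \quad \forall v \in H_0^1(\Omega),$$

which implies that

$$(\bar h(x) - \hat h(x)) \hat u(x) = 0, \quad \text{a.e. } x \in \Omega. \tag{2.16}$$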

By the assumption $\operatorname{meas}(\{x \in \Omega : f_0(x) = 0\}) = 0$, we have
$\operatorname{meas}(\{x \in \Omega : \hat f(x) = 0\}) = 0$, since $\hat f \in R(f_0)$. Thus $\operatorname{meas}(\{x \in \Omega : \hat u(x) = 0\}) = 0$.
Combining with (2.16) we have $\bar h(x) = \hat h(x)$, a.e. $x \in \Omega$. Therefore, $\hat h$ is the unique
maximizer of $L(h)$. Note that $L(h) = \int_\Omega h \hat u^2\,dx$, so by using Lemma 2.5, there exists
an increasing function $\psi$ such that

$$\psi(\hat u^2) = \hat h, \quad \text{a.e. in } \Omega.$$

Similarly, we can show that $\hat f$ is the unique maximizer of the linear functional
$l(f) := \int_\Omega f \hat u\,dx$, relative to $f \in R(f_0)$. Again from Lemma 2.5, there exists an increasing
function $\varphi$ such that

$$\varphi(\hat u) = \hat f, \quad \text{a.e. in } \Omega.$$

This completes the proof. □
Theorem 2.3. Under the assumptions of Theorem 2.1, if $\Omega$ is a ball centered
at the origin and $f_0(x) > 0$, a.e. $x \in \Omega$, then the problem $(\mathrm{Opt}_m)$ has a unique solution
$(\hat h, \hat f)$ and $\hat h = h_0^*$, $\hat f = f_0^*$, where $h_0^*$ ($f_0^*$) is the Schwarz symmetric decreasing
rearrangement of $h_0$ ($f_0$).
Proof. Denote by $\hat u^*$ the Schwarz symmetric decreasing rearrangement of $\hat u$, where
$\hat u = u_{\hat h,\hat f}$ is the solution of $(P_{\hat h,\hat f})$.
Similarly to the proof of Theorem 4.5 in Qiu et al. (2015), we obtain

$$\int_\Omega |\nabla \hat u^*|^2\,dx = \int_\Omega |\nabla \hat u|^2\,dx \tag{2.17}$$

and

$$\operatorname{meas}\left( \left\{ x \in \Omega : \nabla \hat u = 0,\ 0 < \hat u(x) < \operatorname{ess\,sup}_{y \in \Omega} \hat u(y) \right\} \right) = 0. \tag{2.18}$$

Now, by using Lemma 2.9, and noting (2.17) and (2.18), we see that $\hat u = \hat u^*$.
By (2.15) in Theorem 2.2, $\hat h = \psi \circ (\hat u^*)^2$ and $\hat f = \varphi \circ \hat u^*$ are spherically symmetric
decreasing functions. It follows that $\hat h$ coincides with its Schwarz rearrangement, i.e.,
$\hat h = \hat h^* = h_0^*$, and likewise for $\hat f$. □

2.5 Existence of Solution of the Problem $(\mathrm{Opt}_M)$

We now consider the problem $(\mathrm{Opt}_M)$. Our results for the problem $(\mathrm{Opt}_M)$ are the
following.
Theorem 2.4. Let $0 < h_0(x) \in L^\infty(\Omega)$, and let $f_0 \in L^q(\Omega)$ with $q > 2N/(N+2)$.
Suppose that $0 < \lambda < \lambda_1(h_0)$, where $\lambda_1(h_0)$ is given by (2.3), and $f_0(x) \ge 0$; then
there exists a unique $\bar f \in R(f_0)$ which solves the problem $(\mathrm{Opt}_M)$, i.e.,

$$I(u_{h_0,\bar f}) = \sup_{f \in R(f_0)} I(u_{h_0,f}),$$

where $u_{h_0,f}$ denotes the unique solution of the problem $(P_{h_0,f})$.


By using Proposition 2.1, we can define a functional $\Phi : L^q(\Omega) \to \mathbb{R}$ by

$$\Phi(f) = I(u_{h_0,f}), \tag{2.19}$$

where $u_{h_0,f}$ denotes the unique solution of the problem $(P_{h_0,f})$.

Before proving Theorem 2.4, we establish the following lemmas.
Lemma 2.10. Suppose that all the assumptions of Theorem 2.4 are satisfied.
Then
(I) The functional $\Phi|_{\overline{R(f_0)}^w}$ is weakly continuous;
(II) The functional $\Phi|_{\overline{R(f_0)}^w}$ is strictly concave;
(III) The functional $\Phi$ is Gâteaux differentiable at each $f \in \overline{R(f_0)}^w$ with derivative
$-u_f$,
where $\Phi : L^q(\Omega) \to \mathbb{R}$ is given by (2.19).
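Proof. Since $0 < \lambda < \lambda_1(h_0)$ and $\lambda_1(h_0) = \inf_{v \in H_0^1(\Omega),\, v \ne 0} \frac{\int_\Omega |\nabla v|^2\,dx}{\int_\Omega h_0 v^2\,dx}$,

$$\lambda_1(h_0) \int_\Omega h_0 u_{h_0,f}^2\,dx \le \int_\Omega |\nabla u_{h_0,f}|^2\,dx.$$

We get

$$I(u_{h_0,f}) \ge \frac{1}{2}\left(1 - \frac{\lambda}{\lambda_1(h_0)}\right)\|u_{h_0,f}\|^2 - C\|f\|_{L^q}\|u_{h_0,f}\|.$$

The rest of the proof of (I), (II) and (III) is similar to the proof of Lemma 4.1 in Qiu
et al. (2015), so we omit it. □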
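Similarly to Lemma 4.2 in Qiu et al. (2015), we obtain
Lemma 2.11. Under the assumptions of Theorem 2.4, there exists a unique $\tilde f \in \overline{R(f_0)}^w$
which maximizes $\Phi|_{\overline{R(f_0)}^w}$. Moreover,

$$\int_\Omega \tilde u \tilde f\,dx \le \int_\Omega \tilde u g\,dx, \quad \forall g \in \overline{R(f_0)}^w, \tag{2.20}$$

where $\tilde u = u_{\tilde f}$.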

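Lemma 2.12. Let $\tilde f$ and $\tilde u$ be as in Lemma 2.11, and let $S(\tilde f) = \{x \in \Omega : \tilde f(x) > 0\}$.
Set

$$\gamma = \operatorname{ess\,sup}_{x \in S(\tilde f)} \tilde u(x), \qquad \delta = \operatorname{ess\,inf}_{x \in \Omega \setminus S(\tilde f)} \tilde u(x).$$

Then $\gamma \le \delta$.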
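
Proof. If not, we assume that $\gamma > \delta$. Then we can choose $\gamma_1, \gamma_2$ with $\gamma > \gamma_1 > \gamma_2 > \delta$. Since
$\gamma > \gamma_1$, there exists a set $A \subset S(\tilde f)$, with positive measure, such that $\tilde u \ge \gamma_1$ in $A$.
Similarly, there exists a set $B \subset \Omega \setminus S(\tilde f)$, with positive measure, such that $\tilde u \le \gamma_2$
in $B$. Without loss of generality we may assume that $\operatorname{meas}(A) = \operatorname{meas}(B)$. Then
there exists a measure preserving map $T : A \to B$, so that we can define a particular
rearrangement of $\tilde f$ as follows:

$$\bar f(x) = \begin{cases} \tilde f(Tx), & x \in A,\\ \tilde f(T^{-1}x), & x \in B,\\ \tilde f(x), & x \in \Omega \setminus (A \cup B). \end{cases}$$

Thus

$$\begin{aligned}
\int_\Omega \tilde u \tilde f\,dx - \int_\Omega \tilde u \bar f\,dx
&= \int_{A \cup B} \tilde u \tilde f\,dx - \int_{A \cup B} \tilde u \bar f\,dx \\
&= \int_A \tilde u \tilde f\,dx - \int_B \tilde u \bar f\,dx \\
&\ge \gamma_1 \int_A \tilde f\,dx - \gamma_2 \int_B \bar f\,dx \\
&= (\gamma_1 - \gamma_2) \int_A \tilde f\,dx > 0.
\end{aligned}$$

Therefore, $\int_\Omega \tilde u \tilde f\,dx > \int_\Omega \tilde u \bar f\,dx$, a contradiction. □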
Proof of Theorem 2.4. Let $\tilde f$ and $\tilde u$ be as in Lemma 2.11. It is clear that the level sets
of $\tilde u$, restricted to $S(\tilde f)$, have measure zero. Therefore, applying Lemma 2.3, there
exists a decreasing function $\tilde\psi$ such that $\tilde\psi \circ \tilde u$ is a rearrangement of $\tilde f$ relative to the
set $S(\tilde f)$. Now, define

$$\psi(t) = \begin{cases} \tilde\psi(t), & t \le \gamma,\\ 0, & t > \gamma, \end{cases}$$

where $\gamma$ is given in Lemma 2.12. Then $\psi$ is a decreasing function.


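In the following, we will prove that $\psi \circ \tilde u$ is a rearrangement of $\tilde f$. By the definition
of the rearrangement, it is sufficient to prove that

$$\operatorname{meas}(\{\psi \circ \tilde u(x) \ge a\}) = \operatorname{meas}(\{\tilde f(x) \ge a\}) \tag{2.21}$$

holds for each $a \in \mathbb{R}$. Clearly,

$$\begin{aligned}
\{\psi \circ \tilde u(x) \ge a\} &= \{\tilde f(x) > 0,\ \psi \circ \tilde u(x) \ge a\} \cup \{\tilde f(x) = 0,\ \psi \circ \tilde u(x) \ge a\},\\
\{\tilde f(x) \ge a\} &= \{\tilde f(x) > 0,\ \tilde f(x) \ge a\} \cup \{\tilde f(x) = 0,\ \tilde f(x) \ge a\}.
\end{aligned} \tag{2.22}$$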
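
By the definition of $\tilde\psi$, $\psi$ and $\gamma$, we get

$$\begin{aligned}
\operatorname{meas}(\{\tilde f(x) > 0,\ \psi \circ \tilde u(x) \ge a\}) &= \operatorname{meas}(\{\tilde f(x) > 0,\ \tilde\psi \circ \tilde u(x) \ge a\}) \\
&= \operatorname{meas}(\{\tilde f(x) > 0,\ \tilde f(x) \ge a\}).
\end{aligned} \tag{2.23}$$

By (2.22) and (2.23), in order to prove (2.21) we only need to show that

$$\operatorname{meas}(\{\tilde f(x) = 0,\ \psi \circ \tilde u(x) \ge a\}) = \operatorname{meas}(\{\tilde f(x) = 0,\ \tilde f(x) \ge a\}). \tag{2.24}$$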

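By Lemma 2.12, we can deduce easily that

$$\operatorname{meas}(\{\tilde f(x) = 0,\ \tilde u(x) < \gamma\}) = 0.$$

Then the left side of the equality (2.24) can be rewritten as

$$\begin{aligned}
\operatorname{meas}(\{\tilde f(x) = 0,\ \psi \circ \tilde u(x) \ge a\}) &= \operatorname{meas}(\{\tilde f(x) = 0,\ \tilde u(x) > \gamma,\ \psi \circ \tilde u(x) \ge a\}) \\
&\quad + \operatorname{meas}(\{\tilde f(x) = 0,\ \tilde u(x) = \gamma,\ \psi \circ \tilde u(x) \ge a\}).
\end{aligned} \tag{2.25}$$

By the definition of $\psi$, we see that

$$\{\tilde f(x) = 0,\ \tilde u(x) > \gamma,\ \psi \circ \tilde u(x) \ge a\} = \{\tilde f(x) = 0,\ 0 \ge a\} = \{\tilde f(x) = 0,\ \tilde f(x) \ge a\}. \tag{2.26}$$

Since

$$-\Delta \tilde u(x) - \lambda h_0(x) \tilde u(x) = \tilde f(x), \quad \text{a.e. in } \{\tilde f(x) = 0,\ \tilde u(x) = \gamma > 0\},$$

we have

$$\operatorname{meas}(\{\tilde f(x) = 0,\ \tilde u(x) = \gamma,\ \psi \circ \tilde u(x) \ge a\}) = 0. \tag{2.27}$$

It follows from (2.25), (2.26) and (2.27) that (2.24) holds, and then (2.21) holds.
Therefore, $\psi \circ \tilde u$ is a rearrangement of $\tilde f$.
Hence, applying Lemma 2.4, we can deduce that $\psi \circ \tilde u$ is the unique minimizer
of the linear functional $\int_\Omega g \tilde u\,dx$, relative to $g \in \overline{R(f_0)}^w$. This and (2.20)
imply $\tilde f = \psi \circ \tilde u \in R(f_0)$. We complete the proof by choosing $\bar f = \tilde f$. □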

Acknowledgements The authors would like to thank the referees for the valuable suggestions
which have improved the early version of the manuscript.

References

Anedda C (2011) Maximization and minimization in problems involving the bi-Laplacian. Annali
di Matematica 190:145–156
Brothers JE, Ziemer WP (1988) Minimal rearrangements of Sobolev functions. J Reine Angew
Math 384:153–179

Burton GR (1987) Rearrangements of functions, maximization of convex functionals and vortex
rings. Math Ann 276:225–253
Burton GR (1989) Variational problems on classes of rearrangements and multiple configurations
for steady vortices. Ann Inst Henri Poincaré 6:295–319
Chanillo S, Kenig C (2008) Weak uniqueness and partial regularity for the composite membrane
problem. J Eur Math Soc 10:705–737
Chanillo S, Grieser D, Kurata K (2000) The free boundary problem in the optimization of
composite membranes. Contemp Math 268:61–81
Cuccu F, Emamizadeh B, Porru G (2006a) Nonlinear elastic membrane involving the p-Laplacian
operator. Electron J Differ Equ 2006:1–10
Cuccu F, Emamizadeh B, Porru G (2006b) Optimization problems for an elastic plate. J Math Phys
47:1–12
Cuccu F, Emamizadeh B, Porru G (2009) Optimization of the first eigenvalue in problems involving
the p-Laplacian. Proc Am Math Soc 137:1677–1687
Del Pezzo LM, Bonder JF (2009) Some optimization problems for p-Laplacian type equations.
Appl Math Optim 59:365–381
Emamizadeh B, Fernandes RI (2008) Optimization of the principal eigenvalue of the
one-dimensional Schrödinger operator. Electron J Differ Equ 2008:1–11
Emamizadeh B, Prajapat JV (2009) Symmetry in rearrangement optimization problems. Electron J
Differ Equ 2009:1–10
Emamizadeh B, Zivari-Rezapour M (2007) Rearrangement optimization for some elliptic equa-
tions. J Optim Theory Appl 135:367–379
Kurata K, Shibata M, Sakamoto S (2004) Symmetry-breaking phenomena in an optimization
problem for some nonlinear elliptic equation. Appl Math Optim 50:259–278
Leoni G (2009) A first course in Sobolev spaces. Graduate studies in mathematics. American
Mathematical Society, Providence
Marras M (2010) Optimization in problems involving the p-Laplacian. Electron J Differ Equ
2010:1–10
Marras M, Porru G, Stella VP (2013) Optimization problems for eigenvalues of p-Laplace
equations. J Math Anal Appl 398:766–775
Nycander J, Emamizadeh B (2003) Variational problem for vortices attached to seamounts.
Nonlinear Anal 55:15–24
Qiu C, Huang YS, Zhou YY (2015) A class of rearrangement optimization problems involving the
p-Laplacian. Nonlinear Anal Theory Methods Appl 112:30–42
Vázquez JL (1984) A strong maximum principle for some quasilinear elliptic equations. Appl Math
Optim 12:191–202
Willem M (1996) Minimax theorems. Birkhäuser, Basel
Zivari-Rezapour M (2013) Maximax rearrangement optimization related to a homogeneous
Dirichlet problem. Arab J Math (Springer) 2:427–433
Chapter 3
An Extension of the MOON²/MOON²R
Approach to Many-Objective Optimization
Problems

Yoshiaki Shimizu

Abstract A multi-objective optimization (MUOP) method that supports agile and
flexible decision making in complex and diverse decision environments
has been in high demand. This study proposes a general idea for solving
many-objective optimization (MAOP) problems by using the MOON² or MOON²R
method. These MUOP methods rely on prior articulation in trade-off analysis among
conflicting objectives. Despite requiring only simple and relative responses, the
decision maker's trade-off analysis becomes rather difficult in the case of MAOP
problems, in which the number of objective functions to be considered is larger
than in MUOP. To overcome this difficulty, we present a stepwise procedure of the kind
extensively used in the analytic hierarchy process. The effectiveness of the
proposed method is then verified by applying it to an actual problem. Finally, a general
discussion is presented to outline the direction of future work in this area.

Keywords Many-objective optimization • MOON2 • MOON2R • Pairwise comparison • AHP • Neural network

3.1 Introduction

A multi-objective optimization (MUOP) method that supports flexible and adaptive
decision making for application in complex, diverse, and competitive environments
has been in high demand. Notably, MUOP applies to problems involving incommensurable
objectives that conflict or compete with each other. Although Pareto
optimal solutions represent a rational norm in MUOP, there can be an infinite

This paper was presented at ICOTA 2013 held in Taipei, Taiwan.


Y. Shimizu
Department of Mechanical Engineering, Toyohashi University of Technology,
Toyohashi, Aichi 441-8580, Japan
e-mail: shimizu@me.tut.ac.jp

© Springer-Verlag Berlin Heidelberg 2015
H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_3

number of members of this class. The set of optimal solutions is known as the Pareto
front. Generally speaking, however, decision making as an engineering task aims at
obtaining a limited number of candidates for the final decision.
From this viewpoint, this study proposes a general idea for solving the many-
objective optimization (MAOP) problem, in which more than a few objective
functions are considered simultaneously. Effort is devoted to obtaining a unique
solution known as the preferentially optimal solution or the best-compromise
solution. This approach is notably different from that of multi-objective evolutionary
algorithms (MOEA), which attempt to derive only the Pareto front (Coello 2001;
Czyzak and Jaszkiewicz 1998; Deb et al. 2000; Jaeggi et al. 2005; Robic and Filipic
2005). However, recent studies have revealed that even in MOEA, conventional
methods are not necessarily effective for dealing with MAOP problems (Hughes
2005; Sato et al. 2010).
In this context, we extend our previously proposed methods, named MOON2 and
MOON2R (Shimizu and Kawada 2002; Shimizu et al. 2004), to be able to handle
MAOP problems. Although MOON2 and MOON2R require only simple and relative
responses, handling the decision makers’ (DMs’) responses in trade-off analysis
becomes rather difficult in MAOP. To overcome this difficulty, this study proposes
an approach that is easily applicable to MAOP. Consequently, the proposed idea can
extend the applicability and practicality of existing methods to the complex decision
making environments mentioned above.
The rest of this chapter is organized as follows. In Sect. 3.2, the general
procedures of MOON2 and MOON2R are explained. Section 3.3 extends this
procedure to MAOP. In Sect. 3.4, the validity and effectiveness of the proposed
method are verified by applying it to an actual problem. A general discussion is also
presented in that section to give a definite and comprehensive outline of the direction
of future work in this area. A conclusion is given in Sect. 3.5.

3.2 MOON2 and MOON2R for MUOP and MAOP

General MUOP problems are described as follows.

$$\text{(p.1)}\quad \min\ f(x) = \{f_1(x), f_2(x), \ldots, f_N(x)\} \quad \text{subject to } x \in X,$$

where x denotes a decision variable vector; X is a feasible region; and f is an
objective function vector, some elements of which are incommensurable and conflict
with one another. When N > 3, this problem is commonly referred to as a MAOP
problem. The abbreviation MOP is used below in cases where the distinction
between MUOP and MAOP is irrelevant.
As a particular characteristic of MOP, in addition to the mathematical procedures,
we need some information on the DM’s preference to obtain the best-compromise
solution as a final goal. The solution methods of MOP problems are generally classi-
fied as prior articulation methods or interactive methods (Shimizu 2010). Naturally,
each of these conventional methods has both advantages and disadvantages. For
example, since in the former method a value function is derived separately from
the search process, the DM does not need to perform repeated interactions during
the search process, whereas such interactions are required in the latter method.
On the other hand, although the latter method allows for elaborate articulation of
attainability among the conflicting objectives, such articulation is difficult to obtain
with the former method. Consequently, the derived solution may sometimes differ
substantially from the best-compromise solution provided by the DM.
MOEA methods, which differ substantially from the two methods mentioned
above, have been developed recently. However, these methods require further steps
before attaining the final solution because the DM has to find the best solution
among a potentially large number of candidates scattered along the Pareto front. In
contrast, MOON2 and MOON2R can readily derive the best-compromise solution
while being free from the requirement of repeated responses during the search,
without giving up elaborate trade-off analysis. Therefore, MOON2 and MOON2R
are expected to serve as powerful tools for enabling flexible decision making in
agile engineering under diverse customer requirements.
Because MOON2 and MOON2R belong to the prior articulation methods in MOP,
they have to identify the value function of the DM in advance. Such modeling can
be performed with a suitable artificial neural network to deal with the non-linearity
commonly seen in the value function. A back-propagation network (BPN) is used
in MOON2 , while MOON2R employs a radial-basis function network (RBFN).
To train the neural network, training data representing the preferences of the
DM should be gathered by an appropriate means. These methods use pairwise
comparison among the appropriate trial solutions, which are spread over the search
area in the objective-function space. It is natural to constrain this modeling space
to within the convex hull enclosed by the utopia and nadir solutions, which are
defined as $f^* = (f_1(x^{utop}), f_2(x^{utop}), \ldots, f_N(x^{utop}))^T$ and
$f_* = (f_1(x^{nad}), f_2(x^{nad}), \ldots, f_N(x^{nad}))^T$, respectively, where
$x^{utop}$ and $x^{nad}$ are the respective utopia and nadir solutions in the
decision variable space.
Then, the DM is asked to indicate the preferred solution and the degree of
preference (spacing) for each pair of trial solutions, for example, $f^i = f(x^i)$
and $f^j = f(x^j)$, $x^i, x^j \in X$.
These responses are provided in the form of linguistic statements, which are later
transformed into scores denoted as $a_{ij}$ (Table 3.1), similarly to the analytic hierarchy
process (AHP) (Saaty 1980). For example, when the answer is such that $f^i$ is
strongly preferable to $f^j$, $a_{ij}$ takes a value of 5 (Table 3.1).

Table 3.1 Conversion table for linguistic statements

Linguistic statement                               a_ij
Equally                                            1
Moderately                                         3
Strongly                                           5
Demonstrably                                       7
Extremely                                          9
Intermediate values between adjacent statements    2, 4, 6, 8
Fig. 3.1 Learning process using RBFN

Table 3.2 Pairwise comparison

        f^1    f^2    f^3    ...    f^k
f^1     1
f^2            1
f^3                   1
...                          ...
f^k                                 1

(off-diagonal entries satisfy a_ji = 1/a_ij)

By performing such pairwise comparisons over k trial solutions, we can obtain
a pairwise comparison matrix (PCM) (Table 3.2). Element $a_{ij}$ represents the degree
of preference of $f^i$ compared to $f^j$. Note that although $a_{ij}$ is defined as the ratio of
relative degrees of preference, it does not necessarily mean that $f^i$ is $a_{ij}$ times more
preferable to $f^j$. Under the same conditions as in AHP, namely $a_{ii} = 1$ and
$a_{ji} = 1/a_{ij}$, the DM is required to provide $k(k-1)/2$ responses for the pairs highlighted in
Table 3.2. Under these conditions, it is also easy to examine the consistency of such
pairwise comparisons from the consistency index CI used in AHP.
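The consistency check mentioned above can be sketched in a few lines. The following is only an illustrative sketch, not the chapter's implementation: it assumes the usual AHP definition CI = (λ_max − k)/(k − 1), estimates λ_max by power iteration, and uses the hypothetical function names `principal_eigen` and `consistency_index`.

```python
def principal_eigen(pcm, iters=200):
    """Power iteration on a positive matrix.

    Returns (lambda_max, w), where w is the principal eigenvector
    normalised so that its entries sum to 1.
    """
    n = len(pcm)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iters):
        v = [sum(pcm[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)              # because sum(w) == 1, sum(A w) estimates lambda_max
        w = [x / lam for x in v]
    return lam, w


def consistency_index(pcm):
    """CI = (lambda_max - k) / (k - 1); zero for a perfectly consistent PCM."""
    k = len(pcm)
    lam, _ = principal_eigen(pcm)
    return (lam - k) / (k - 1)


# A perfectly consistent PCM built from hypothetical scores (1, 2, 4).
consistent = [[1, 0.5, 0.25], [2, 1, 0.5], [4, 2, 1]]
# A slightly inconsistent PCM using scores from Table 3.1.
inconsistent = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]

print(consistency_index(consistent))
print(consistency_index(inconsistent))
```

For a consistent matrix the index vanishes, and it grows as the DM's responses contradict one another, which is what makes it usable as a repeat-the-comparison trigger.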
Since information on the preferences of the DM is embedded in the PCM,
we can derive a value function based on it. However, in general, it is almost
impossible to give a mathematically definite form of the value function, as it is
likely to be highly nonlinear. Unstructured modeling techniques that use neural
networks are suitable for modeling in such situations. All objective values of each
pair $f^i$ and $f^j$ ($\forall i, j \in \{1, 2, \ldots, k\}$) are used as $2N$ inputs of the neural network,
and $a_{ij}$ is the single output. Hence, the PCM provides a total of $k^2$ training data sets
for the neural network. Eventually, the trained neural network can be viewed as
an implicit function mapping the $2N$-dimensional space to the scalar space (i.e.,
$V_{NN}: (f^i, f^j) \in \mathbb{R}^{2N} \to a_{ij} \in \mathbb{R}$).
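To make the shape of the training data concrete, the following sketch assembles the k² samples from a set of trial solutions and their PCM. The function name `build_training_set` and the numbers are hypothetical and serve only as an illustration.

```python
def build_training_set(trials, pcm):
    """Return k*k samples (inputs, target).

    inputs is f^i concatenated with f^j (2N values) and target is a_ij.
    """
    samples = []
    for i, f_i in enumerate(trials):
        for j, f_j in enumerate(trials):
            samples.append((list(f_i) + list(f_j), pcm[i][j]))
    return samples


# Hypothetical data: k = 3 trial solutions with N = 2 objectives each.
trials = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]
pcm = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]
samples = build_training_set(trials, pcm)
print(len(samples), len(samples[0][0]))
```

The diagonal pairs (i = j, target 1) and the reciprocal pairs are included as well, which is why k(k − 1)/2 responses from the DM yield k² training samples.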
Next, looking at the relations in Eq. (3.1), we can easily compare the preferences
for any pair of solutions. Therefore, by fixing one of the input vectors of the neural
network at an appropriate reference vector $f^R$, we can evaluate any solution from
the output of the neural network (Eq. (3.2)). In other words, $V_{NN}$ can serve as a
value function. We can nominate some candidates for the reference point $f^R$, such
as utopia, nadir, a center of gravity between them, or the point where the total sum
of distances from all trial points is a minimum.

$$V_{NN}(f^i, f^k) = a_{ik} > V_{NN}(f^j, f^k) = a_{jk} \iff f^i \succ f^j \tag{3.1}$$

$$V_{NN}(f(x), f^R) = a_{xR} > V_{NN}(f(y), f^R) = a_{yR} \iff f(x) \succ f(y), \quad \forall x, y \in X \tag{3.2}$$
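The idea of turning the trained network into a value function by fixing its second input can be sketched as follows. This is a simplified stand-in rather than the training scheme of MOON2R: it assumes one Gaussian unit per training sample and obtains the output weights from a lightly regularised linear solve. All names (`train_rbfn`, `make_value_function`), the width, the ridge term and the data are assumptions made for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gaussian(u, v, width):
    d2 = sum((p - q) ** 2 for p, q in zip(u, v))
    return math.exp(-d2 / (2.0 * width ** 2))


def train_rbfn(inputs, targets, width=0.5, ridge=1e-10):
    """Fit output weights of a Gaussian RBF network centred on the inputs."""
    n = len(inputs)
    gram = [[gaussian(inputs[i], inputs[j], width) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    weights = solve_linear(gram, targets)

    def predict(z):
        return sum(w * gaussian(z, c, width) for w, c in zip(weights, inputs))

    return predict


def make_value_function(predict, f_ref):
    """Fix the second input at f_ref, i.e. V(f) = V_NN(f, f_ref) as in Eq. (3.2)."""
    return lambda f: predict(list(f) + list(f_ref))


# Hypothetical data: three trial solutions, two normalised objectives (minimised).
trials = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]
scores = [1.0 - (0.6 * f1 + 0.4 * f2) for f1, f2 in trials]  # stand-in for the DM
inputs = [list(fi) + list(fj) for fi in trials for fj in trials]
targets = [si / sj for si in scores for sj in scores]

v_nn = train_rbfn(inputs, targets)
value = make_value_function(v_nn, f_ref=trials[1])
print([round(value(f), 3) for f in trials])
```

With the second trial as the reference, the value of each trial reproduces the corresponding PCM column, so the ordering of the values follows the ordering of the DM's preferences in the sense of Eq. (3.1).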

Once the value function is identified, the original MOP problem is transformed
into an ordinary single-objective problem.

$$\text{(p.2)}\quad \max\ V_{NN}(f(x), f^R) \quad \text{subject to } x \in X$$

Because the value function is built separately from the search process, a DM
can carry out trade-off analyses at whatever pace is desired without having to provide
immediate responses or wait for queries, as is often required in interactive methods.
In addition, because the required responses are simple and relative, the load on the
DM in such interaction is rather small. These are some of the notable advantages
of this approach. Moreover, the following proposition supports the validity of the
above formulation.
[Proposition] The optimal solution of Problem (p. 2) is a Pareto optimal solution
of Problem (p. 1) if the value function is chosen so as to satisfy the relation given
by Eq. (3.1).
(Proof) Let $\hat{f}_i$ $(i = 1, \ldots, N)$ be the values of the objective functions for the
optimal solution $\hat{x}$ of Problem (p. 2), so that $\hat{f}_i = f_i(\hat{x})$. Here, let us assume for
contradiction that $\hat{f}$ is not a Pareto optimal solution. Then there exists a certain
$f^0$ such that, for some $j$, $f_j^0 < \hat{f}_j - \Delta f_j$ $(\Delta f_j > 0)$ and
$f_i^0 \le \hat{f}_i$ $(i = 1, \ldots, N,\ i \ne j)$.
Because the DM apparently prefers $f^0$ to $\hat{f}$, it holds that
$V_{NN}(f^0, f^R) > V_{NN}(\hat{f}, f^R)$. This contradicts that $\hat{f}$ is the optimal
solution of Problem (p. 2).
Hence, $\hat{f}$ must be a Pareto optimal solution.
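The proposition can be illustrated numerically on a finite candidate set. The sketch below assumes minimisation of all objectives and uses a hypothetical value function that strictly decreases in each objective (so that it respects Pareto dominance); `dominates` and `non_dominated` are illustrative helper names.

```python
def dominates(f_a, f_b):
    """True if f_a Pareto-dominates f_b when all objectives are minimised."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the members of points that no other member dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def value(f):
    # Hypothetical stand-in for V_NN(f, f^R): larger is better.
    return -(0.6 * f[0] + 0.4 * f[1])


candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
best = max(candidates, key=value)
print(best, best in non_dominated(candidates))
```

Any candidate that is dominated by another has a strictly smaller value under such a function, so the maximiser necessarily belongs to the non-dominated set, mirroring the argument of the proof.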
Once x is given, we can readily evaluate any candidate solution through VNN .
Hence, it is possible to choose the most appropriate method from among a variety of
conventional single-objective optimization methods. In addition to direct methods,
meta-heuristic methods such as genetic algorithms, simulated annealing and tabu
search are also applicable. By contrast, it is almost impossible to combine such
search methods with any of the interactive methods of MOP, because the large number
of interactions during the search is likely to make the DM rather careless in providing responses.
When this approach is applied with an algorithm that requires the gradients of
the objective function, such as nonlinear programming, we need to obtain these
gradients by numeric differentiation. The derivative of the value function with
respect to a decision variable is calculated by using the following chain rule.
Fig. 3.2 Flowchart of the proposed method (boxes: Start; Set utopia/nadir and searching
space; Generate trial solutions; Perform pair comparisons; Consistent?; Identify VNN by NN;
Select optimization method; Need gradients?; Incorporate numerical differentiation; Apply
optimization algorithm; Satisfactory?; Limit the space; End)

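$$\frac{\partial V_{NN}(f(x), f^R)}{\partial x} = \left( \frac{\partial V_{NN}(f(x), f^R)}{\partial f(x)} \right) \left( \frac{\partial f(x)}{\partial x} \right) \tag{3.3}$$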

The derivative can be calculated by combining the analytic form of the second factor
on the right-hand side of Eq. (3.3) with the following numeric differentiation for the
first factor. Since most nonlinear programming software supports numeric differentiation,
the algorithm can be realized without any special concerns.

$$\frac{\partial V_{NN}}{\partial f_i} \cong \frac{V_{NN}(\cdots, f_i(x) + \Delta f_i, \cdots; f^R) - V_{NN}(\cdots, f_i(x), \cdots; f^R)}{\Delta f_i} \tag{3.4}$$
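A minimal sketch of how Eqs. (3.3) and (3.4) fit together is given below. The value function, the objective functions and their Jacobian are hypothetical placeholders chosen so that the result can be checked by hand; the relative step size is likewise an assumption.

```python
def grad_value_wrt_objectives(value, f, rel_step=1e-6):
    """Forward difference of Eq. (3.4) for each objective f_i."""
    base = value(f)
    grad = []
    for i in range(len(f)):
        step = rel_step * max(1.0, abs(f[i]))
        shifted = list(f)
        shifted[i] += step
        grad.append((value(shifted) - base) / step)
    return grad


def grad_value_wrt_x(value, objectives, jacobian, x):
    """Chain rule of Eq. (3.3): dV/dx_m = sum_i (dV/df_i) * (df_i/dx_m)."""
    dv_df = grad_value_wrt_objectives(value, objectives(x))
    jac = jacobian(x)  # jac[i][m] = d f_i / d x_m, supplied analytically
    return [sum(dv_df[i] * jac[i][m] for i in range(len(dv_df)))
            for m in range(len(x))]


# Hypothetical test problem with a known analytic gradient.
def objectives(x):
    return [x[0] ** 2 + x[1], x[0] * x[1]]


def jacobian(x):
    return [[2.0 * x[0], 1.0], [x[1], x[0]]]


def value(f):
    return -(f[0] ** 2 + 3.0 * f[1])


print(grad_value_wrt_x(value, objectives, jacobian, [1.0, 2.0]))
```

At x = (1, 2) the exact gradient is (−18, −9), so the printed values should agree with it up to the truncation error of the forward difference.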

The proposed procedure can be summarized as follows (Fig. 3.2).


Step 1: Generate several trial solutions in the objective-function space.
Step 2: Extract the preferences of the DM through pairwise comparison between
every pair of trial solutions.
Step 3: Train the neural network with the preference information obtained from
the above responses. This network serves as a value function $V_{NN}$ by selecting a
certain reference solution $f^R$.
Step 4: Apply an appropriate optimization method to solve the resulting Prob-
lem (p. 2).
Step 5: If the DM is not satisfied with the result obtained in the above process, limit
the search space around that result and repeat the same procedure until he or she
accepts the result.

3.3 Procedure for MAOP

Because the aforementioned methods are natural and easy to work with for value
assessments by humans, we have applied them to various problems and confirmed
their effectiveness (Shimizu et al. 2005, 2006, 2010, 2012a; Shimizu and Tanaka
2003; Shimizu and Nomachi 2008). However, the case of MAOP is different if we
consider the limit to the abilities of humans to perform assessment. As the number
of objective functions increases, the difficulty of such value assessment through
pairwise comparison increases rapidly. For example, suppose that a customer
intends to buy a ticket for transportation in a certain situation. It seems rather easy
to choose between a pair of candidates if they are evaluated on only two objectives,
such as travel time and expense. According to the procedure outlined above, in this
case, the customer has to make a pairwise comparison between the pair of solutions
(i, j) in terms of the objectives (time i, cost i) and (time j, cost j), respectively.
However, what will happen if there are more objectives to be compared? Suppose
that the customer has to compare a pair of candidates in terms of four objectives:
time, cost, service and comfort. Undoubtedly, the difficulty of assessment will grow
substantially, and the customer may often give up on the comparison altogether,
except in special cases.
For MAOP, therefore, it is impractical to apply the previous method as it stands, and
the proposed idea is intended to maintain the portability of that method. The basic idea of the proposed
procedure involves replacing the pairwise comparison on many objectives with a
comparison on a scalar objective. Assuming independence of the objective functions
of (p.1), this procedure can be realized by the following steps.
Step 1: Determine the relative importance among the objective functions as weights
$w_k$ $(k = 1, \ldots, N)$ such that $\sum_k w_k = 1$, through pairwise comparison and
eigenvalue calculation, as in AHP. Repeat if the pairwise comparison fails the
consistency test.
Step 2: Narrowing the focus to the kth objective function only, ask the DM to give
a preference for every pair of trial solutions $f^i = f(x^i)$ and $f^j = f(x^j)$, that is, for
$\{f_k(x^i), f_k(x^j)\}$ $(\forall i, j;\ i > j)$, and obtain the preference intensity $s_{ik}$, $\forall i$, by calculating
the eigenvalues of this PCM. Repeat this process for every objective function.
Step 3: Calculate the total preference of the ith trial as $S_i = \sum_{k=1}^{N} w_k s_{ik}$, $\forall i$.
Step 4: Finally, calculate $a_{ij}$, which is the PCM element corresponding to the
preference between $f^i$ and $f^j$, as $a_{ij} = S_i / S_j$.
Step 5: As in the procedure of Sect. 3.2, identify the value function of the DM from $f^i$
and $f^j$ as the inputs and $a_{ij}$ as the output of the neural network.
The above procedure can be easily implemented by a DM who is familiar with
AHP, and does not introduce additional complexity to the original procedures of
MOON2 and MOON2R .
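Steps 1 to 4 can be sketched as follows. The sketch assumes that both the weights and the preference intensities are taken as the principal eigenvector of the corresponding PCM, normalised to sum to one as is usual in AHP; the function names and the small example matrices are hypothetical.

```python
def priorities(pcm, iters=200):
    """Normalised principal eigenvector of a positive reciprocal matrix."""
    n = len(pcm)
    w = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(pcm[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def aggregate_pcm(objective_pcm, trial_pcms):
    """Steps 1-4: weights w_k, intensities s_ik, totals S_i, a_ij = S_i / S_j."""
    w = priorities(objective_pcm)                      # Step 1
    s = [priorities(p) for p in trial_pcms]            # Step 2: s[k][i]
    n_trials = len(trial_pcms[0])
    totals = [sum(w[k] * s[k][i] for k in range(len(w)))
              for i in range(n_trials)]                # Step 3
    pcm = [[totals[i] / totals[j] for j in range(n_trials)]
           for i in range(n_trials)]                   # Step 4
    return w, totals, pcm


# Hypothetical responses: two objectives, three trial solutions.
objective_pcm = [[1, 3], [1 / 3, 1]]
trial_pcms = [
    [[1, 0.5, 1], [2, 1, 2], [1, 0.5, 1]],   # comparisons on objective 1
    [[1, 2, 2], [0.5, 1, 1], [0.5, 1, 1]],   # comparisons on objective 2
]
weights, totals, pcm = aggregate_pcm(objective_pcm, trial_pcms)
print(weights, totals)
```

Each comparison the DM makes here involves a single scalar criterion, while the resulting matrix has the same form as the PCM of Sect. 3.2 and can be passed to the neural network unchanged (Step 5).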

3.4 Case Study

3.4.1 Evaluation Method

To verify the feasibility of our approach, we applied it to a problem assuming a
virtual DM whose value function is given by Eq. (3.5) as a reference. We compared
the result obtained by the proposed method with that obtained by directly optimizing
the following comprehensive objective function of (p.1):

$$U(f(x)) = \left\{ \sum_{k}^{N} w_k \left( \frac{f_k(x) - f_{k*}}{f_k^* - f_{k*}} \right)^{t} \right\}^{1/t}, \tag{3.5}$$

where $f_k^*$ and $f_{k*}$ denote the utopia and nadir values of the kth objective function,
respectively. Moreover, $w_k$ and $t$ are a weight representing relative importance and
a norm parameter, respectively. Hence, $U(f(x))$ represents the attainability ratio for
utopia and takes a value of 1.0 for the utopia and 0.0 for the nadir.
We carried out the experiment according to the following procedure, which corresponds
to that in Sect. 3.3.
Step 1: Determine a set of weights $w_k$ $(\sum_k w_k = 1)$, each of which stands for
the relative importance of the corresponding objective function regarding the
preference.
Step 2: Instead of interactive pairwise comparison, for the virtual DM, obtain the
preference index $s_{ik}$ of the ith trial for the kth objective as
$s_{ik} = \left( \frac{f_k^i - f_{k*}}{f_k^* - f_{k*}} \right)^{t}$ $(\forall k, \forall i)$.
Step 3: Calculate the total preference score $S_i$ as
$S_i = \left( \sum_{k}^{N} w_k s_{ik} \right)^{1/t}$ $(\forall i)$.
Fig. 3.3 Welded beam design (labels in the drawing: P, h, t, l, L, b)

Step 4: Obtain the ijth element of the PCM as $a_{ij} = S_i / S_j$.
Step 5: By using the data obtained above, train the neural network so that the relation
$V_{NN}(f^i, f^j) = a_{ij}$ $(\forall i, j)$ is satisfied. Then, select an appropriate reference
solution $f^R$.
Step 6: From the above steps, make Problem (p. 2) definite and solve it by an
appropriate ordinary optimization method.
Step 7: Compare the above result with that of another optimization problem, such
as Max(Eq. (3.5)) subject to $x \in X$.

3.4.2 Welded Beam Design Problem

We considered a welded beam design problem (Fig. 3.3) and described it as a
four-objective optimization Problem (p. 3). This problem was originally studied by
Erfani and Utyuzhnikov (2012) as a bi-objective optimization problem.
(p. 3) Min $\{f_1, f_2, f_3, f_4\}$ subject to Eqs. (3.6)–(3.20)

3.4.2.1 Objective Functions

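$$f_1 := 1.105 h^2 l + 0.048\, t b (L + l) \to \min \quad \text{(Cost)} \tag{3.6}$$

$$f_2 := \delta = \frac{4 P L^3}{E t^3 b} \to \min \quad \text{(Deflection)} \tag{3.7}$$

$$f_3 := \tau = \sqrt{\tau'^2 + \tau' \tau'' \frac{l}{R} + \tau''^2} \to \min \quad \text{(Shear stress)} \tag{3.8}$$

$$f_4 := \sigma = \frac{6 P L}{t^2 b} \to \min \quad \text{(Bending stress)} \tag{3.9}$$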

3.4.2.2 Constraints

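$$h \le b \tag{3.10}$$

$$P_c \ge P \tag{3.11}$$

$$P_c = 64746.02\,(1 - 0.3t)\, t b^3 \tag{3.12}$$

$$\tau' = \frac{P}{\sqrt{2}\, h l} \tag{3.13}$$

$$\tau'' = P (L + 0.5 l) R / J \tag{3.14}$$

$$R = \sqrt{0.25 \left( l^2 + (h + t)^2 \right)} \tag{3.15}$$

$$J = \sqrt{2}\, h l \left( \frac{l^2}{12} + \frac{(h + t)^2}{4} \right) \tag{3.16}$$

$$0.125 \le b \le 5, \quad 0.1 \le t \le 10, \quad 0.1 \le l \le 10, \quad 0.125 \le h \le 5 \tag{3.17--3.20}$$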

3.4.2.3 Decision Variables

h [m]: welding thickness; l [m]: welding length; t [m]: beam width; b [m]: beam
thickness

3.4.2.4 Parameters

P = 6000.0 [lb]; L = 14.0 [in]; E = 3.0E8 [psi]
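For readers who wish to experiment with Problem (p. 3), the model can be written down directly. The sketch below is an assumption-laden reading of Eqs. (3.6) to (3.20): it takes τ' = P/(√2 hl), J = √2 hl (l²/12 + (h + t)²/4), and the constraints h ≤ b and P_c ≥ P. The function names are hypothetical.

```python
import math

P, L, E = 6000.0, 14.0, 3.0e8   # parameters of Sect. 3.4.2.4


def objectives(h, l, t, b):
    """Return (cost, deflection, shear stress, bending stress)."""
    tau1 = P / (math.sqrt(2.0) * h * l)                                 # Eq. (3.13)
    r = math.sqrt(0.25 * (l ** 2 + (h + t) ** 2))                       # Eq. (3.15)
    j = math.sqrt(2.0) * h * l * (l ** 2 / 12.0 + (h + t) ** 2 / 4.0)   # Eq. (3.16)
    tau2 = P * (L + 0.5 * l) * r / j                                    # Eq. (3.14)
    cost = 1.105 * h ** 2 * l + 0.048 * t * b * (L + l)                 # Eq. (3.6)
    deflection = 4.0 * P * L ** 3 / (E * t ** 3 * b)                    # Eq. (3.7)
    shear = math.sqrt(tau1 ** 2 + tau1 * tau2 * l / r + tau2 ** 2)      # Eq. (3.8)
    bending = 6.0 * P * L / (t ** 2 * b)                                # Eq. (3.9)
    return cost, deflection, shear, bending


def is_feasible(h, l, t, b):
    """Check Eqs. (3.10)-(3.12) and the bounds (3.17)-(3.20)."""
    p_c = 64746.02 * (1.0 - 0.3 * t) * t * b ** 3
    in_bounds = (0.125 <= b <= 5 and 0.1 <= t <= 10
                 and 0.1 <= l <= 10 and 0.125 <= h <= 5)
    return h <= b and p_c >= P and in_bounds


# Decision variables reported for "This work" in Table 3.5.
print(objectives(h=1.135, l=2.540, t=3.328, b=2.650))
print(is_feasible(h=1.135, l=2.540, t=3.328, b=2.650))
```

Under this reading, the objective values evaluated at the decision variables of Table 3.5 agree with the tabulated values up to the rounding of the decision variables, which lends some support to the reconstruction of the formulas.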

3.4.3 Numerical Results

First, we described the objective tree as shown in Fig. 3.4. Then, we generated six
random trials within the hyper-rectangular space enclosed by the utopia and nadir,
which are shown in Table 3.3 together with the test trials. Next, we set the weights
Fig. 3.4 Hierarchy of evaluation factors (goal: Beam design; factors: Cost, Deflection,
Shear stress, Bending stress; alternatives: Design 1 to Design 6)

Table 3.3 Specification of each trial with utopia and nadir

                 Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Utopia     Nadir
Cost             20.00      9.27       16.49      11.52      13.75      11.13      5.00       20.00
Deflection       4.11E-03   5.56E-03   6.66E-03   6.82E-03   2.32E-03   3.80E-03   1.00E-03   8.00E-03
Shear stress     7281.69    8071.22    7333.68    5421.39    12214.13   9246.70    3200.00    13600.00
Bending stress   29256.52   19066.19   18203.62   22550.54   29709.86   27521.33   15000.00   30000.00

Table 3.4 PCM (t = 1)

           Design 1   Design 2   Design 3   Design 4   Design 5   Design 6
Design 1   1          0.51       0.84       0.61       0.67       0.57
Design 2   1.95       1          1.63       1.18       1.30       1.10
Design 3   1.20       0.61       1          0.72       0.80       0.68
Design 4   1.65       0.85       1.38       1          1.10       0.94
Design 5   1.50       0.77       1.25       0.91       1          0.85
Design 6   1.76       0.91       1.48       1.07       1.18       1

representing the relative importance as w = (0.4, 0.3, 0.2, 0.1), which are the same
as those given for the reference value function in Eq. (3.5). Then, the preference
intensity of every trial with respect to each objective function was derived from the
formula given in Step 2. Finally, the total preference was calculated as S = (0.293,
0.570, 0.350, 0.484, 0.439, 0.517) for t = 1. In Step 4, $S_i/S_j$ was calculated to derive
the elements of the PCM shown in Table 3.4. Based on that procedure, we built the
value function $V_{NN}(f(x); f^R)$ of the neural network.
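The virtual-DM computation of Steps 1 to 4 can be retraced from the data in Table 3.3. The sketch below assumes that each objective is normalised as (f − nadir)/(utopia − nadir), which gives 1 at the utopia and 0 at the nadir; the function names are hypothetical.

```python
UTOPIA = [5.00, 1.00e-3, 3200.00, 15000.00]
NADIR = [20.00, 8.00e-3, 13600.00, 30000.00]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]
# Cost, deflection, shear stress, bending stress of Designs 1-6 (Table 3.3).
DESIGNS = [
    [20.00, 4.11e-3, 7281.69, 29256.52],
    [9.27, 5.56e-3, 8071.22, 19066.19],
    [16.49, 6.66e-3, 7333.68, 18203.62],
    [11.52, 6.82e-3, 5421.39, 22550.54],
    [13.75, 2.32e-3, 12214.13, 29709.86],
    [11.13, 3.80e-3, 9246.70, 27521.33],
]


def total_preference(f, t):
    """Steps 2-3: S = (sum_k w_k s_k)^(1/t) with s_k the t-th power of the ratio."""
    s = [((fk - nad) / (uto - nad)) ** t
         for fk, uto, nad in zip(f, UTOPIA, NADIR)]
    return sum(w * sk for w, sk in zip(WEIGHTS, s)) ** (1.0 / t)


def pcm_from_scores(scores):
    """Step 4: a_ij = S_i / S_j."""
    return [[si / sj for sj in scores] for si in scores]


scores_t1 = [total_preference(f, 1) for f in DESIGNS]
scores_t2 = [total_preference(f, 2) for f in DESIGNS]
pcm_t1 = pcm_from_scores(scores_t1)
print([round(s, 3) for s in scores_t1])
print([round(s, 3) for s in scores_t2])
```

Under these assumptions the computed scores agree with the values of S reported in the text to within about 0.001. The PCM entries can differ from Table 3.4 in the second decimal place, apparently because the table was derived from rounded scores.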
Letting f R D f  , we solved (p. 3) under this value function by the modified
nonlinear simplex method (Nelder and Mead 1965), which can accommodate the
constraints. In Table 3.5, the result is compared with that obtained by optimizing
Problem (p. 3) under the objective function in Eq. (3.5). The latter problem was solved by
using the commercial software package LINGO (Ver. 13.0).
Table 3.5 Results of MUOP (Independent: t = 1)

            Decision variable                  Objective function value
            l       t       b       h       Cost     Deflection   Shear stress   Bending stress
This work   2.540   3.328   2.650   1.135   10.614   2.249E-03    11262.08       17179.24
LINGO       2.982   3.329   2.789   1.154   11.955   2.134E-03    9657.292       16307.03
Gap [%]     14.82   0.03    4.98    1.65    11.22    5.39         16.62          5.35

(Input layer: 8 neurons; hidden layer: 10 neurons; learning rate: 0.5; momentum: 0.1; RMSE: 3.33 × 10^-4)
Gap = |This work − LINGO| / LINGO × 100

Table 3.6 PCM (Independent: t = 2)

           Design 1   Design 2   Design 3   Design 4   Design 5   Design 6
Design 1   1          0.69       1.00       0.77       0.79       0.76
Design 2   1.45       1          1.45       1.11       1.14       1.11
Design 3   1.00       0.69       1          0.77       0.79       0.77
Design 4   1.31       0.90       1.30       1          1.03       1.00
Design 5   1.27       0.88       1.27       0.98       1          0.97
Design 6   1.31       0.90       1.31       1.00       1.03       1

Table 3.7 Result of MUOP (Independent: t = 2)

            Decision variable                  Objective function value
            l       t       b       h       Cost     Deflection   Shear stress   Bending stress
This work   1.857   3.329   2.736   1.149   9.637    2.176E-03    14463.65       16628.19
LINGO       2.118   3.330   3.000   1.102   10.573   1.982E-03    13600.00       15151.22
Gap [%]     12.32   0.03    8.80    4.27    8.85     9.79         6.35           9.75

(Input layer: 8 neurons; hidden layer: 10 neurons; learning rate: 0.5; momentum: 0.1; RMSE: 3.60 × 10^-4)

In a similar manner, we had S = (0.408, 0.592, 0.409, 0.534, 0.520, 0.535) for
t = 2 and obtained the results in Tables 3.6 and 3.7. Close correspondence can be
observed between the results, with a few exceptions.

3.4.4 Discussion

A definite basis for evaluating subjective decisions in a well-defined manner that
is acceptable to everyone does not exist. This fact causes considerable difficulty
when attempting to perform a general evaluation to obtain the best-compromise
solution found by the mathematical process of MOP. As it often happens, what one
person considers the best compromise may not be acceptable to others since each
DM has a different value system. We are confident that the procedure outlined here
is applicable in such situations since the final preference is evaluated on the basis of
Table 3.8 Comparison of value function values

t   This work   LINGO   Gap [%]
1   0.627       0.633   0.92
2   0.692       0.691   0.18

Fig. 3.5 Post-optimal analysis in terms of MUOP. (a) Result of elite-induced MOEA. (b) Result
obtained with ε-constraint method

an implicitly embedded value function, such as Eq. (3.5). This is also a basic norm
of utility theory (Fishburn 1970).
Although some results in Tables 3.5 and 3.7 seem to be somewhat far from the
reference solution, we can account for this weakness if we compare the results
in terms of the value function. The value function values of the two solutions in
Table 3.8 are so similar, for both t = 1 and t = 2, that the DM cannot distinguish
between them. Moreover, we confirmed that the best-compromise solution could not be
outperformed by any of 200 solutions obtained with NSGA-II (Deb et al. 2000) after
convergence. This numerically supports the proposition in Sect. 3.2, which asserts
that the proposed method can derive a Pareto optimal solution.
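The comparison in Table 3.8 can be retraced by evaluating Eq. (3.5) at the objective values listed in Tables 3.5 and 3.7. The sketch below assumes the utopia and nadir values of Table 3.3, the weights w = (0.4, 0.3, 0.2, 0.1), and the gap definition given under Table 3.5; the function names are hypothetical.

```python
UTOPIA = [5.00, 1.00e-3, 3200.00, 15000.00]
NADIR = [20.00, 8.00e-3, 13600.00, 30000.00]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def attainability(f, t):
    """Eq. (3.5): 1.0 at the utopia and 0.0 at the nadir."""
    ratios = [(fk - nad) / (uto - nad) for fk, uto, nad in zip(f, UTOPIA, NADIR)]
    return sum(w * r ** t for w, r in zip(WEIGHTS, ratios)) ** (1.0 / t)


def gap_percent(ours, reference):
    return abs(ours - reference) / reference * 100.0


# Objective values (cost, deflection, shear stress, bending stress).
results = {
    1: ([10.614, 2.249e-3, 11262.08, 17179.24],   # This work, Table 3.5
        [11.955, 2.134e-3, 9657.292, 16307.03]),  # LINGO, Table 3.5
    2: ([9.637, 2.176e-3, 14463.65, 16628.19],    # This work, Table 3.7
        [10.573, 1.982e-3, 13600.00, 15151.22]),  # LINGO, Table 3.7
}
for t, (ours, lingo) in results.items():
    u_ours, u_lingo = attainability(ours, t), attainability(lingo, t)
    print(t, round(u_ours, 3), round(u_lingo, 3),
          round(gap_percent(u_ours, u_lingo), 2))
```

Although individual objective values differ by up to about 17 %, the two solutions are nearly indistinguishable when measured by the reference value function, which is the point made in the discussion above.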
In addition, we can use the result obtained for the post-optimal analysis combined
with a classical multi-objective analysis method, such as the ε-constraint method,
or recent approaches such as elite-induced evolutionary multi-objective analysis
(Shimizu et al. 2012b). As illustrated in Fig. 3.5, by producing several solutions
around the optimal result, we can move on to the next stage by choosing among
those candidates to make a final decision for actual execution. Based on the above
discussion, we again emphasize the validity of the proposed approach.

3.5 Conclusion

A MAOP method that supports flexible and adaptive decision making for complex,
diverse and competitive decision environments has been in high demand. From
this viewpoint, this study proposed a general idea for solving MAOP problems by
extending our previously proposed MUOP methods (MOON2 and MOON2R ).
Although MOON2 and MOON2R require only simple and relative responses,
handling the DM’s responses in trade-off analysis becomes rather difficult in MAOP,
where more than a few objective functions are to be considered simultaneously. To
overcome this difficulty, this study proposed an approach that is easily applicable
in such cases. After presenting the general procedure, the effectiveness of the
proposed method was verified by applying it to an actual problem. The experimental
results showed that the proposed method is moderately more complex than previous
methods but maintains flexibility and adaptability. Finally, the general discussion
provided a definite and comprehensive outline of the direction of future work in
this area.

References

Coello CAC (2001) A short tutorial on evolutionary multiobjective optimization. In: Zitzler E et al
(eds) Lecture notes in computer science. Springer, Berlin, pp 21–40
Czyzak P, Jaszkiewicz AJ (1998) Pareto simulated annealing-a meta-heuristic technique for
multiple-objective combinatorial optimization. J Multi-Criteria Decis Anal 7:34–47
Deb K, Agrawal S, Pratap A, Meyarivan T (2000) A fast elitist non-dominated sorting genetic
algorithm for multi-objective optimization: NSGA-II. In: Proceedings Parallel Problem Solving
from Nature VI (PPSN-VI). Paris, France, pp 849–858
Erfani T, Utyuzhnikov SV (2012) Control of robust design in multiobjective optimization under
uncertainties. Struct Multidiscip Optim 45:247–256
Fishburn PC (1970) Utility theory for decision making. Wiley, New York
Hughes EJ (2005) Evolutionary many-objective optimization. Many once or one many? In:
Proceedings IEEE Congress on Evolutionary Computation (CEC2005). Edinburgh, UK,
pp 222–227
Jaeggi D, Parks G, Kipouros T, Clarkson J (2005) A multiobjective tabu search algorithm
for constrained optimization problems. In: Evolutionary multi-criterion optimization, third
international conference, EMO 2005, LNCS 3410. Guanajuato, Mexico, pp 490–504
Nelder JA, Mead R (1965) A simplex method for function minimization. Comput J 7:308–313
Robic T, Filipic B (2005) DEMO: differential evolution for multi-objective optimization. In: Evo-
lutionary multi-criterion optimization, third international conference (EMO 2005), Guanajuato.
LNCS 3410, pp 520–533
Saaty TL (1980) The analytic hierarchy process. McGraw-Hill, New York
Sato H, Aguirre H, Tanaka K (2010) Many-objective evolutionary optimization by self-controlling
dominance area of solutions. Trans Jpn Soc Evol Comput 1(1):32–41
Shimizu Y (2010) An enhancement of learning optimization engineering – workbench for smart
decision making. CORONA Publishing Co., LTD. (In Japanese)
Shimizu Y, Kawada A (2002) Multi-objective optimization in terms of soft computing. Trans Soc
Instrum Control Eng 38(11):974–980
Shimizu Y, Nomachi T (2008) Integrated product design through multi-objective optimization
incorporated with meta-modeling technique. J Chem Eng Jpn 41(11):1068–1074
Shimizu Y, Tanaka Y (2003) A practical method for multi-objective scheduling through soft
computing approach. JSME Int J Ser C 46(1):54–59
Shimizu Y, Tanaka Y, Kawada A (2004) Multi-objective optimization system on the internet.
Comput Chem Eng 28(5):821–828
Shimizu Y, Miura K, Yoo J-K, Tanaka Y (2005) A progressive approach for multi-objective design
through inter-related modeling of value system and meta-model. J JSME Ser C 71(712):296–
303. (In Japanese)
Shimizu Y, Yoo J-K, Tanaka Y (2006) A design support through multi-objective optimization aware
of subjectivity of value system. J JSME Ser C 72(717):1613–1620. (In Japanese)
Shimizu Y, Kato Y, Kariyahara T (2010) Prototype development for supporting multiobjective
decision making in an ill-posed environment. J Chem Eng Jpn 43(8):691–697
Shimizu Y, Waki T, Sakaguchi T (2012a) Multi-objective sequencing optimization for mixed-
model assembly line considering due-date satisfaction. J Adv Mech Des Syst Manuf 6(7):1057–
1070
Shimizu Y, Takayama M, Ohishi H (2012b) Multi-objective analysis through elite-induced
evolutionary algorithm – in the case of PSA. Trans Jpn Soc Evol Comput 3(2):22–30. (In
Japanese)

Yoshiaki Shimizu is a Professor in the Department of Mechanical Engineering, Toyohashi
University of Technology, Japan. He served as head of the Department of Production Systems
Engineering during 2006–2009.
He graduated from Kyoto University, Japan, and earned a Doctor of Engineering degree in 1982.
His teaching and research interests include production systems and supply chain management,
multi-objective optimization and applied operations research. He is the author of more than 200
academic and technical papers and books. More details are available on his home page
(URL http://ise.me.tut.ac.jp/). His email address is shimizu@me.tut.ac.jp
Chapter 4
Existence of Solutions for Variational-Like
Hemivariational Inequalities Involving Lower
Semicontinuous Maps

Guo-ji Tang, Zhong-bao Wang, and Nan-jing Huang

Abstract The main aim of this chapter is to investigate the existence of solutions in
connection with a class of variational-like hemivariational inequalities in reflexive
Banach spaces. Some existence theorems of solutions for the variational-like
hemivariational inequalities involving lower semicontinuous set-valued maps are
proved under different conditions. Moreover, a necessary and sufficient condition
to guarantee the existence of solutions for the variational-like hemivariational
inequalities is also given.

Keywords Variational-like hemivariational inequality • Generalized monotonicity • Set-valued map • Existence • Mosco’s alternative

2010 Mathematics Subject Classification: 49J40; 49J45; 47J20.

4.1 Introduction

Unlike the variational inequality, which is mainly concerned with
convex energy functions, the hemivariational inequality, first introduced by
Panagiotopoulos (1983, 1985, 1991, 1993) in the early 1980s, is closely

G.-j. Tang
School of Science, Guangxi University for Nationalities, Nanning, Guangxi 530006, People’s
Republic of China
e-mail: guojvtang@126.com
Z.-b. Wang
Department of Mathematics, Southwest Jiaotong University, Chengdu, 610031, People’s
Republic of China
e-mail: zhongbaowang@hotmail.com
N.-j. Huang
Department of Mathematics, Sichuan University, Chengdu, Sichuan 610064, People’s
Republic of China
e-mail: nanjinghuang@hotmail.com

© Springer-Verlag Berlin Heidelberg 2015 67


H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_4

concerned with nonsmooth and nonconvex energy functions. Inequalities of this type and
their generalizations play a crucial role in describing many important problems arising in
mechanics and engineering, such as unilateral contact problems in nonlinear elasticity,
thermoviscoelastic frictional contact problems and obstacle problems (see, for example,
Carl et al. 2007; Naniewicz and Panagiotopoulos 1995; Motreanu and Rădulescu 2003;
Panagiotopoulos 1985, 1993 and the references therein). The derivation of hemivariational
inequalities is based on the generalized directional derivative introduced by Clarke (1983).
Over almost 30 years, the theory of hemivariational inequalities has produced many important
results both in pure and applied mathematics and in other fields such as mechanics and
engineering sciences, since it allows mathematical formulations of some interesting problems
(Carl 2001; Carl et al. 2005; Costea and Rădulescu 2009, 2010; Costea and Lupu 2010;
Costea 2011; Costea et al. 2012; Costea and Rădulescu 2012; Liu 2008; Migórski and Ochal
2004; Motreanu and Rădulescu 2000; Xiao and Huang 2008, 2009; Xiao et al. 2014; Zhang and
He 2011).
On the other hand, Parida et al. (1989) introduced another new type of variational
inequality, called variational-like inequality, and showed that it can be related
to some mathematical programming problems. For more related work regarding
variational-like inequalities, we refer to Fang and Huang (2003), Bai et al. (2006),
Ansari and Yao (2001) and the references therein.
Let $K$ be a nonempty, closed and convex subset of a real reflexive Banach space $X$.
Assume that $A : K \rightrightarrows X^*$ is a set-valued map, $\eta : K \times K \to X$ is a
single-valued map and $\phi : X \to \mathbb{R} \cup \{+\infty\}$ is a convex and lower
semicontinuous functional such that $K_\phi := K \cap \operatorname{dom}\phi \ne \emptyset$,
where $\operatorname{dom}\phi := \{x \in X : \phi(x) < +\infty\}$ is the effective domain of
$\phi$. Let $\Omega$ be a bounded open set in $\mathbb{R}^N$ and
$j(x, y) : \Omega \times \mathbb{R}^k \to \mathbb{R}$ be a function. Let
$T : X \to L^p(\Omega; \mathbb{R}^k)$ be a linear and continuous mapping, where
$1 < p < \infty$. We shall denote $\hat u := Tu$ and denote by $j^\circ(x, y; h)$ Clarke's
generalized directional derivative of a locally Lipschitz mapping $j(x, \cdot)$ at the point
$y \in \mathbb{R}^k$ with respect to the direction $h \in \mathbb{R}^k$, where $x \in \Omega$.
We are interested in finding solutions for the following problem:
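(P) Find $u \in K_\phi$ such that

$$\forall u^* \in A(u):\ \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K, \tag{4.1}$$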


which is related closely to the following problem:


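Find $u \in K_\phi$ such that

$$\exists u^* \in A(u):\ \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K. \tag{4.2}$$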


It is clear that a solution of problem (P) is necessarily a solution of problem (4.2), while
the converse is not true in general. In particular, if $A$ is a single-valued mapping, then
problem (P) coincides with (4.2). Sometimes, a solution of problem (P) is called a strong
solution of problem (4.2) (for similar notions, see Costea et al. 2012; Tang et al. 2014).
To the best of our knowledge, strong solutions of hemivariational inequalities involving
set-valued maps (for example, a solution of problem (P) rather than (4.2)) have been
considered in few papers (Tang et al. 2014). Moreover, we would like to mention that
problems (P) and (4.2) are quite general, since they include the following problems as
special cases:
• If $j$ is a constant on $\Omega \times \mathbb{R}^k$, then problems (P) and (4.2) become,
respectively,

$$\forall u^* \in A(u):\ \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) \ge 0, \quad \forall v \in K \tag{4.3}$$

and

$$\exists u^* \in A(u):\ \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) \ge 0, \quad \forall v \in K, \tag{4.4}$$

which were considered by Costea et al. (2012). If, in addition, $\eta(v, u) = v - u$, then
problems (4.3) and (4.4) become

$$\forall u^* \in A(u):\ \langle u^*, v - u\rangle + \phi(v) - \phi(u) \ge 0, \quad \forall v \in K \tag{4.5}$$

and

$$\exists u^* \in A(u):\ \langle u^*, v - u\rangle + \phi(v) - \phi(u) \ge 0, \quad \forall v \in K, \tag{4.6}$$

which are called (generalized) mixed variational inequalities and have been studied
extensively by many authors (see, for example, Tang and Huang (2013a, 2014) and the
references therein).
• If $\eta(v, u) = v - u$, then problem (4.2) reduces to the following problem:

$$\exists u^* \in A(u):\ \langle u^*, v - u\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K, \tag{4.7}$$

which is called a variational-hemivariational inequality (see, for example, Costea and Lupu
2010; Tang and Huang 2013b).
• If $A$ is single-valued and $\phi = I_K$, the indicator function of the constraint set $K$,
then both problems (P) and (4.2) reduce to the problem:

$$\langle A(u), \eta(v, u)\rangle + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K, \tag{4.8}$$

which was considered by Costea and Rădulescu (2009).



• If $A$ is single-valued and $\eta(v, u) = v - u$, then both problems (P) and (4.2) reduce
to the problem:

$$\langle A(u), v - u\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K, \tag{4.9}$$

which was studied by Motreanu and Rădulescu (2000). If, in addition, $\phi = I_K$, then
problem (4.9) becomes

$$\langle A(u), v - u\rangle + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx \ge 0, \quad \forall v \in K,$$

which was introduced and studied by Panagiotopoulos et al. (1999).


Extensive attention has been paid to the existence results for some types of
hemivariational inequalities by many researchers in recent years (see, for example,
Carl 2001; Carl et al. 2005, 2007; Xiao and Huang 2009; Migórski and Ochal
2004; Park and Ha 2008, 2009; Goeleven et al. 1998; Liu 2008; Zhang and He
2011; Tang and Huang 2013b; Costea and Lupu 2010; Xiao and Huang 2008;
Costea and Rădulescu 2009 and the references therein). In particular, some authors
considered some classes of variational-like hemivariational inequalities (see, for
example Costea and Rădulescu 2009; Xiao and Huang 2008). It is also worth
mentioning that, under some generalized monotonicity assumptions, Costea et al.
(2012) investigated some results concerned with the existence of solutions for
problems (4.3) and (4.4) involving set-valued mappings.
In this chapter, we continue to study the existence of solutions for problem (P)
involving lower semicontinuous set-valued maps in reflexive Banach spaces. We
prove the existence of solutions for problem (P) first when K is compact and convex, and
then when K is bounded, closed and convex. In the case when K is unbounded, we study
the existence of solutions and the boundedness of the solution set for problem (P)
under some coercivity conditions. Moreover, a necessary and sufficient condition for
the existence of solutions for problem (P) is also derived. We would like to point
out that the results presented in this chapter generalize and improve some known
results due to Costea and Rădulescu (2009), Costea and Lupu (2010), Costea et al.
(2012), Motreanu and Rădulescu (2000), Panagiotopoulos et al. (1999), and Tang
and Huang (2013b).

4.2 Preliminaries

Let $X$ be a reflexive Banach space with norm $\|\cdot\|$ and let $X^*$ be its dual
space. For a nonempty, closed and convex subset $K$ of $X$ and every $r > 0$, we define

$$K_r := \{u \in K : \|u\| \le r\} \quad \text{and} \quad K_r^- := \{u \in K : \|u\| < r\}.$$


Let $T : X \to L^p(\Omega; \mathbb{R}^k)$ be a linear compact operator, where
$1 < p < \infty$ and $k \ge 1$, and let $\Omega$ be a bounded open set in $\mathbb{R}^N$.
Denote by $q$ the conjugate exponent of $p$, i.e., $\frac{1}{p} + \frac{1}{q} = 1$.
Recall that $f^\circ(x; v)$ denotes Clarke's generalized directional derivative of the
locally Lipschitz mapping $f : X \to \mathbb{R}$ at the point $x \in X$ with respect to the
direction $v \in X$, while $\partial f(x)$ is Clarke's generalized gradient of $f$ at
$x \in X$ (see, for example, Clarke 1983), i.e.,

$$f^\circ(x; v) = \limsup_{y \to x,\ t \to 0^+} \frac{f(y + tv) - f(y)}{t}$$

and

$$\partial f(x) = \{\xi \in X^* : \langle \xi, v\rangle \le f^\circ(x; v),\ \forall v \in X\}.$$
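As a quick illustration of these two definitions (an example added here, not taken from the chapter), consider the nonconvex function $f(x) = -|x|$ on $X = \mathbb{R}$ at the point $x = 0$. The sketch below computes $f^\circ(0; v)$ and $\partial f(0)$ directly from the definitions and compares them with the ordinary one-sided directional derivative, written $f'(0; v)$, a notation assumed for this example only.

```latex
% Upper bound (reverse triangle inequality), valid for all y and all t > 0:
\frac{f(y + tv) - f(y)}{t} = \frac{|y| - |y + tv|}{t} \le \frac{|tv|}{t} = |v| .
% The bound is attained along y = -tv as t \downarrow 0:
\frac{|{-tv}| - |{-tv} + tv|}{t} = |v| .
% Hence
f^\circ(0; v) = |v| , \qquad
\partial f(0) = \{\xi \in \mathbb{R} : \xi v \le |v|,\ \forall v \in \mathbb{R}\} = [-1, 1] ,
% whereas the one-sided directional derivative is
f'(0; v) = \lim_{t \downarrow 0} \frac{f(tv) - f(0)}{t} = -|v| .
```

For $v \ne 0$ the generalized derivative $|v|$ is strictly larger than $-|v|$, which shows how Clarke's construction differs from the classical one when $f$ is not convex.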

Lemma 4.1 (Proposition 2.1.1 of Clarke 1983). Let $f : X \to \mathbb{R}$ be Lipschitz of
rank $M$ near $x$. Then
(i) The function $v \mapsto f^\circ(x; v)$ is finite, positively homogeneous and subadditive
on $X$, and satisfies

$$|f^\circ(x; v)| \le M \|v\|;$$

(ii) $f^\circ(x; v)$ is upper semicontinuous as a function of $(x, v)$ and, as a function of
$v$ alone, is Lipschitz of rank $M$ on $X$;
(iii) $f^\circ(x; -v) = (-f)^\circ(x; v)$.
In order to solve problem (P), we need the following hypotheses about $j$.
($H_j$) Let $j : \Omega \times \mathbb{R}^k \to \mathbb{R}$ be a function which satisfies:
(i) For every $y \in \mathbb{R}^k$, $j(\cdot, y) : \Omega \to \mathbb{R}$ is measurable;
(ii) For all $x \in \Omega$, the mapping $j(x, \cdot)$ is locally Lipschitz;
(iii) There exists $C > 0$ such that

$$|z| \le C(1 + |y|^{p-1}), \quad \forall x \in \Omega,\ \forall z \in \partial j(x, y).$$
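To make the hypotheses concrete, here is a minimal example (our own choice, not from the chapter) of a nonconvex integrand that satisfies all three items of ($H_j$): $j(x, y) = -|y|$, where $|\cdot|$ is the Euclidean norm on $\mathbb{R}^k$. The check of item (iii) uses the standard fact that every element of the generalized gradient is bounded in norm by the Lipschitz rank, which also follows from item (i) of Lemma 4.1.

```latex
j(x, y) = -|y| , \qquad x \in \Omega,\ y \in \mathbb{R}^k .
% (i)   j(\cdot, y) is constant in x, hence measurable.
% (ii)  |j(x, y_1) - j(x, y_2)| \le |y_1 - y_2| , so j(x, \cdot) is Lipschitz of rank 1.
% (iii) Every z \in \partial j(x, y) satisfies |z| \le 1 (the Lipschitz rank), so
|z| \le 1 \le C\,(1 + |y|^{p-1}) \quad \text{with } C = 1 .
```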

Now we consider the mapping $J : L^p(\Omega; \mathbb{R}^k) \to \mathbb{R}$ defined by

$$J(\varphi) = \int_\Omega j(x, \varphi(x))\,dx. \tag{4.10}$$

Under the hypotheses ($H_j$), we can apply the Aubin-Clarke theorem (see e.g. Aubin
and Clarke 1979 or Motreanu and Rădulescu 2003) to conclude that the functional
$J$ defined above is locally Lipschitz and

$$J^\circ(w; z) \le \int_\Omega j^\circ(x, w(x); z(x))\,dx, \quad \forall w, z \in L^p(\Omega; \mathbb{R}^k).$$

Consequently,

$$J^\circ(\hat u; \hat v) \le \int_\Omega j^\circ(x, \hat u(x); \hat v(x))\,dx, \quad \forall u, v \in X. \tag{4.11}$$


Definition 4.1. Let $E$ and $F$ be two Hausdorff topological spaces. A set-valued map
$T : E \rightrightarrows F$ is said to be
(i) Lower semicontinuous at $x_0$ iff, for any open set $V \subseteq F$ such that
$T(x_0) \cap V \ne \emptyset$, we can find a neighborhood $U$ of $x_0$ such that
$T(x) \cap V \ne \emptyset$ for all $x \in U$;
(ii) Lower semicontinuous iff it is lower semicontinuous at each $x \in E$;
(iii) Lower hemicontinuous iff the restriction of $T$ to every line segment of $K$ is
lower semicontinuous.
We denote by $G(T) := \{(x, y) : x \in E \text{ and } y \in T(x)\}$ the graph of $T$. It is well
known that there is an equivalent characterization of lower semicontinuous maps
(see, for example, item (i) of Proposition 2.1 of Costea et al. 2012).
Lemma 4.2. Let $E$ and $F$ be two Hausdorff topological spaces. Then a set-valued
map $T : E \rightrightarrows F$ is lower semicontinuous iff, for any pair $(x, y) \in G(T)$ and
any net $\{x_\lambda\}_{\lambda \in I} \subset E$ converging to $x$, we can determine, for each
$\lambda \in I$, an element $y_\lambda \in T(x_\lambda)$ such that $y_\lambda \to y$.
The following result is a fixed point theorem for set-valued maps due to Ansari
and Yao (1999), which plays an important role in proving the existence of solutions
of problem (P) in the case of compact convex subsets in reflexive Banach spaces.
Lemma 4.3. Let $K$ be a nonempty, closed and convex subset of a Hausdorff
topological vector space $E$ and let $S, T : K \rightrightarrows K$ be two set-valued maps.
Assume that
• For each $x \in K$, $S(x)$ is nonempty and $\operatorname{conv}\{S(x)\} \subseteq T(x)$;
• $K = \bigcup_{y \in K} \operatorname{int}_K S^{-1}(y)$;
• If $K$ is not compact, there exist a nonempty, compact and convex subset $C_0$ of $K$
and a nonempty and compact subset $C_1$ of $K$ such that, for each $x \in K \setminus C_1$,
there exists $\bar y \in C_0$ with the property that $x \in \operatorname{int}_K S^{-1}(\bar y)$.
Then there exists $x_0 \in K$ such that $x_0 \in T(x_0)$.
The next lemma is known as Mosco’s Alternative (see Mosco 1976) and plays a
crucial role in proving the existence theorems for problem (P) in the next section.
Lemma 4.4 (Mosco's Alternative). Let $K$ be a nonempty, compact and convex
subset of a topological space $E$ and assume $\phi : E \to \mathbb{R} \cup \{+\infty\}$ is a
proper, convex and lower semicontinuous function such that $K_\phi \ne \emptyset$. Let
$\xi, \zeta : E \times E \to \mathbb{R}$ be two functions such that
• $\xi(x, y) \le \zeta(x, y)$ for all $x, y \in E$;
• For each $x \in E$, the map $y \mapsto \xi(x, y)$ is lower semicontinuous;
• For each $y \in E$, the map $x \mapsto \zeta(x, y)$ is concave.
Then, for each $\mu \in \mathbb{R}$, the following alternative holds true: either there exists
$y_0 \in K_\phi$ such that

$$\xi(x, y_0) + \phi(y_0) - \phi(x) \le \mu, \quad \text{for all } x \in E,$$

or there exists $x_0 \in E$ such that $\zeta(x_0, x_0) > \mu$.


Definition 4.2. Let $\eta : K \times K \to X$ and $\alpha : X \to \mathbb{R}$ be two
single-valued maps. A set-valued map $T : K \rightrightarrows X^*$ is said to be relaxed
$\eta$-$\alpha$ monotone iff, for all $u, v \in K$, all $v^* \in T(v)$ and all $u^* \in T(u)$,
one has

$$\langle v^* - u^*, \eta(v, u)\rangle \ge \alpha(v - u). \tag{4.12}$$

Remark 4.1. If $\alpha = 0$ in (4.12), then $T$ is said to be $\eta$-monotone. If
$\eta(u, v) = u - v$ in (4.12), then $T$ is said to be relaxed $\alpha$ monotone. If
$\eta(u, v) = u - v$ and $\alpha(z) = k\|z\|^p$ with constants $k > 0$ and $p > 1$ in (4.12),
then $T$ is said to be $p$-monotone; if, in addition, $p = 2$, then $T$ is called strongly
monotone. If $\eta(u, v) = u - v$ and $\alpha = 0$ in (4.12), then $T$ is said to be monotone.
For examples of relaxed $\eta$-$\alpha$ monotone mappings, we refer the reader to
Costea et al. (2012).
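A small finite-dimensional example (ours, not from the chapter) may help to separate relaxed $\eta$-$\alpha$ monotonicity from plain monotonicity. Assume $X = K = \mathbb{R}^n$ identified with its dual, and take the single-valued map $T(u) = \{-u\}$ with $\eta(v, u) = v - u$ and $\alpha(z) = -|z|^2$; all three choices are hypothetical and made only for illustration.

```latex
T(u) = \{-u\} , \qquad \eta(v, u) = v - u , \qquad \alpha(z) = -|z|^2
\qquad (X = K = \mathbb{R}^n) .
% Inequality (4.12) holds, with equality:
\langle v^* - u^*, \eta(v, u) \rangle
  = \langle -v + u,\ v - u \rangle = -|v - u|^2 = \alpha(v - u) .
% T is not monotone, since for v \ne u
\langle v^* - u^*, v - u \rangle = -|v - u|^2 < 0 .
% The limit condition imposed on \alpha in hypothesis (H_\alpha) is also met:
\frac{\alpha(\lambda v)}{\lambda} = -\lambda |v|^2 \to 0 \quad (\lambda \downarrow 0) .
```

Since $\alpha$ is continuous on $\mathbb{R}^n$, it is also weakly lower semicontinuous there, so this pair $(T, \alpha)$ fits the setting used in the existence theorems of the next section.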

4.3 Existence Theorems

In order to prove our existence results, we shall use some of the following
hypotheses, which have been used extensively in the recent literature (see, e.g.,
Costea and Rădulescu 2009, 2012; Costea et al. 2012):
($H_A^1$) $A : K \rightrightarrows X^*$ is a set-valued mapping which is lower semicontinuous
from $K$ with the strong topology into $X^*$ with the weak topology, and has nonempty
values;
($H_A^2$) $A : K \rightrightarrows X^*$ is a set-valued mapping which is lower hemicontinuous
from $K$ with the strong topology into $X^*$ with the weak topology, and has nonempty
values;
($H_\eta^1$) $\eta : K \times K \to X$ is such that
(i) For all $v \in K$, the map $u \mapsto \eta(v, u)$ is continuous;
(ii) For all $u, v, w \in K$ and $w^* \in A(w)$, the map $v \mapsto \langle w^*, \eta(v, u)\rangle$
is convex and $\langle w^*, \eta(v, u)\rangle \ge 0$;
($H_\eta^2$) $\eta : K \times K \to X$ is such that
(i) $\eta(u, v) + \eta(v, u) = 0$ for all $u, v \in K$;
(ii) For all $u, v, w \in K$ and $w^* \in A(w)$, the map $v \mapsto \langle w^*, \eta(v, u)\rangle$
is convex and lower semicontinuous;
($H_\phi$) $\phi : X \to \mathbb{R} \cup \{+\infty\}$ is a proper, convex and lower semicontinuous
functional such that $K_\phi := K \cap D(\phi)$ is nonempty;
($H_\alpha$) $\alpha : X \to \mathbb{R}$ is a weakly lower semicontinuous functional such that
$\limsup_{\lambda \downarrow 0} \frac{\alpha(\lambda v)}{\lambda} \ge 0$ for all $v \in X$.

In the sequel, we shall study three cases regarding the constraint set K:
1. K a nonempty, compact and convex subset of a real reflexive Banach space X;
2. K a nonempty, bounded, closed and convex subset of a real reflexive Banach
space X;
3. K a nonempty, unbounded, closed and convex subset of a real reflexive Banach
space X.
Theorem 4.1. Let $X$ be a real reflexive Banach space and $K$ a nonempty, compact
and convex subset of $X$. Assume that ($H_A^1$), ($H_\eta^1$), ($H_\phi$) and ($H_j$) hold.
Then problem (P) admits at least one solution.
Proof. Arguing by contradiction, let us assume that problem (P) has no solution.
Then, for each $u \in K_\phi$, there exist $\bar u^* \in A(u)$ and
$v = v(u, \bar u^*) \in K$ such that

$$\langle \bar u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx < 0. \tag{4.13}$$

Now we define a functional $J : L^p(\Omega; \mathbb{R}^k) \to \mathbb{R}$ as follows:

$$J(\varphi) = \int_\Omega j(x, \varphi(x))\,dx.$$

Thus, combining (4.13) and (4.11), we have

$$\langle \bar u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u) < 0. \tag{4.14}$$

Clearly, the element $v$ for which (4.14) takes place satisfies $v \in D(\phi)$, therefore
$v \in K_\phi$. We consider next a set-valued map $F : K_\phi \rightrightarrows K_\phi$ defined by

$$F(u) = \{v \in K_\phi : \langle \bar u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u) < 0\},$$

where $\bar u^* \in A(u)$ is some element that satisfies (4.14).


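Claim 1. For each $u \in K_\phi$, the set $F(u)$ is nonempty and convex.
Let $u \in K_\phi$ be arbitrarily fixed. Then (4.14) implies that $F(u)$ is nonempty. Let
$v_1, v_2 \in F(u)$ and define $w = \lambda v_1 + (1 - \lambda) v_2$ with $\lambda \in (0, 1)$.
By item (ii) of ($H_\eta^1$), we have

$$\langle \bar u^*, \eta(w, u)\rangle \le \lambda \langle \bar u^*, \eta(v_1, u)\rangle + (1 - \lambda)\langle \bar u^*, \eta(v_2, u)\rangle. \tag{4.15}$$

Since $f^\circ(\cdot; \cdot)$ is positively homogeneous and subadditive (see item (i) of
Lemma 4.1) and $T$ is linear, we get

$$\begin{aligned}
J^\circ(\hat u; \hat w - \hat u) &= J^\circ(\hat u; \lambda(\hat v_1 - \hat u) + (1 - \lambda)(\hat v_2 - \hat u)) \\
&\le \lambda J^\circ(\hat u; \hat v_1 - \hat u) + (1 - \lambda) J^\circ(\hat u; \hat v_2 - \hat u).
\end{aligned} \tag{4.16}$$

Combining (4.15), (4.16) and the convexity of $\phi$, we conclude that

$$\begin{aligned}
&\langle \bar u^*, \eta(w, u)\rangle + \phi(w) - \phi(u) + J^\circ(\hat u; \hat w - \hat u) \\
&\quad \le \lambda[\langle \bar u^*, \eta(v_1, u)\rangle + \phi(v_1) - \phi(u) + J^\circ(\hat u; \hat v_1 - \hat u)] \\
&\qquad + (1 - \lambda)[\langle \bar u^*, \eta(v_2, u)\rangle + \phi(v_2) - \phi(u) + J^\circ(\hat u; \hat v_2 - \hat u)] \\
&\quad < 0,
\end{aligned} \tag{4.17}$$

which shows that $w \in F(u)$. Therefore, $F(u)$ is a nonempty and convex subset of $K_\phi$.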
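Claim 2. For each $v \in K_\phi$, the set $F^{-1}(v) = \{u \in K_\phi : v \in F(u)\}$ is open.
Let us fix $v \in K_\phi$. Taking into account that

$$F^{-1}(v) = \{u \in K_\phi : \exists \bar u^* \in A(u) \text{ s.t. } \langle \bar u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u) < 0\},$$

we shall prove that

$$[F^{-1}(v)]^c = \{u \in K_\phi : \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u) \ge 0 \text{ for all } u^* \in A(u)\}$$

is a closed subset of $K_\phi$. Let $\{u_\lambda\}_{\lambda \in I} \subset [F^{-1}(v)]^c$ be a net
converging to some $u \in K_\phi$. Then, for each $\lambda \in I$, we have

$$\langle u_\lambda^*, \eta(v, u_\lambda)\rangle + \phi(v) - \phi(u_\lambda) + J^\circ(\hat u_\lambda; \hat v - \hat u_\lambda) \ge 0 \quad \text{for all } u_\lambda^* \in A(u_\lambda). \tag{4.18}$$

By item (i) of ($H_\eta^1$), one has

$$\eta(v, u_\lambda) \to \eta(v, u). \tag{4.19}$$

For each $u^* \in A(u)$ and $\lambda \in I$, applying Lemma 4.2, we can determine
$u_\lambda^* \in A(u_\lambda)$ such that

$$u_\lambda^* \rightharpoonup u^*,$$

since $A$ is lower semicontinuous from $K$ with the strong topology into $X^*$ with the
weak topology. This, together with (4.19), shows that

$$\langle u_\lambda^*, \eta(v, u_\lambda)\rangle \to \langle u^*, \eta(v, u)\rangle. \tag{4.20}$$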


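Since $T$ is linear and continuous and $u_\lambda \to u$, we know that
$\hat u_\lambda \to \hat u$ and $\hat v - \hat u_\lambda \to \hat v - \hat u$.
By item (ii) of Lemma 4.1, we have

$$\limsup J^\circ(\hat u_\lambda; \hat v - \hat u_\lambda) \le J^\circ(\hat u; \hat v - \hat u). \tag{4.21}$$

Using (4.18), (4.20), (4.21) and the lower semicontinuity of $\phi$, one has

$$\begin{aligned}
0 &\le \limsup[\langle u_\lambda^*, \eta(v, u_\lambda)\rangle + \phi(v) - \phi(u_\lambda) + J^\circ(\hat u_\lambda; \hat v - \hat u_\lambda)] \\
&\le \limsup \langle u_\lambda^*, \eta(v, u_\lambda)\rangle + \phi(v) - \liminf \phi(u_\lambda) + \limsup J^\circ(\hat u_\lambda; \hat v - \hat u_\lambda) \\
&\le \langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u),
\end{aligned} \tag{4.22}$$

which shows that $u \in [F^{-1}(v)]^c$ and so $[F^{-1}(v)]^c$ is a closed subset of $K_\phi$.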


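Claim 3. $K_\phi = \bigcup_{v \in K_\phi} \operatorname{int}_{K_\phi} F^{-1}(v)$.
Since $F^{-1}(v)$ is a subset of $K_\phi$ for all $v \in K_\phi$, it is easy to see that
$\bigcup_{v \in K_\phi} \operatorname{int}_{K_\phi} F^{-1}(v) \subseteq K_\phi$. Now we prove that
$K_\phi \subseteq \bigcup_{v \in K_\phi} \operatorname{int}_{K_\phi} F^{-1}(v)$. For each
$u \in K_\phi$, there exists $v \in K_\phi$ such that $v \in F(u)$ (such a $v$ exists since
$F(u)$ is nonempty) and so

$$u \in F^{-1}(v) \subseteq \bigcup_{v \in K_\phi} F^{-1}(v) = \bigcup_{v \in K_\phi} \operatorname{int}_{K_\phi} F^{-1}(v),$$

where the last equality holds because each $F^{-1}(v)$ is open by Claim 2.
The compactness of $K$ and the above claims ensure that all the conditions of
Lemma 4.3 are satisfied for $S = T = F$. Thus, we deduce that the set-valued map
$F : K_\phi \rightrightarrows K_\phi$ admits a fixed point $u_0 \in K_\phi$, i.e.,
$u_0 \in F(u_0)$. This can be rewritten equivalently as

$$0 = \langle \bar u_0^*, \eta(u_0, u_0)\rangle + \phi(u_0) - \phi(u_0) + J^\circ(\hat u_0; \hat u_0 - \hat u_0) < 0.$$

Thus, we get a contradiction which completes the proof.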


Remark 4.2. If $j$ is a constant on $\Omega \times \mathbb{R}^k$, then Theorem 4.1 reduces
to the corresponding result of Theorem 3.2 presented by Costea et al. (2012).
Theorem 4.2. Let $K$ be a nonempty, bounded, closed and convex subset of the real
reflexive Banach space $X$. Let $A : K \rightrightarrows X^*$ be a relaxed $\eta$-$\alpha$
monotone map and assume that ($H_A^2$), ($H_\eta^2$), ($H_\alpha$), ($H_\phi$) and ($H_j$)
hold. Then problem (P) admits at least one solution.
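Proof. In order to prove the conclusion, we shall apply Mosco's Alternative for the
weak topology. First, we note that $K$ is weakly compact as it is a bounded, closed
and convex subset of the real reflexive space $X$, and
$\phi : X \to \mathbb{R} \cup \{+\infty\}$ is weakly lower semicontinuous as it is convex and
lower semicontinuous. Now we define three functionals
$J : L^p(\Omega; \mathbb{R}^k) \to \mathbb{R}$ and $\xi, \zeta : X \times X \to \mathbb{R}$ as
follows:

$$J(\varphi) = \int_\Omega j(x, \varphi(x))\,dx,$$

$$\xi(v, u) = -\inf_{v^* \in A(v)} \langle v^*, \eta(v, u)\rangle - J^\circ(\hat u; \hat v - \hat u) + \alpha(v - u),$$

and

$$\zeta(v, u) = \sup_{u^* \in A(u)} \langle u^*, \eta(u, v)\rangle - J^\circ(\hat u; \hat v - \hat u).$$

Let us fix $u, v \in X$ and choose $\bar v^* \in A(v)$ such that

$$\langle \bar v^*, \eta(v, u)\rangle = \inf_{v^* \in A(v)} \langle v^*, \eta(v, u)\rangle.$$

For arbitrary fixed $u^* \in A(u)$, we have

$$\begin{aligned}
\zeta(v, u) - \xi(v, u)
&= \sup_{u^* \in A(u)} \langle u^*, \eta(u, v)\rangle + \inf_{v^* \in A(v)} \langle v^*, \eta(v, u)\rangle - \alpha(v - u) \\
&\ge \langle u^*, \eta(u, v)\rangle + \langle \bar v^*, \eta(v, u)\rangle - \alpha(v - u) \\
&= \langle \bar v^* - u^*, \eta(v, u)\rangle - \alpha(v - u) \\
&\ge 0 \quad \text{(by the relaxed $\eta$-$\alpha$ monotonicity)}.
\end{aligned}$$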

Clearly,

$$-\inf_{v^* \in A(v)} \langle v^*, \eta(v, u)\rangle = \sup_{v^* \in A(v)} \langle v^*, \eta(u, v)\rangle.$$

It follows from conditions ($H_\eta^2$) and ($H_\alpha$) that the map defined by

$$u \mapsto -\inf_{v^* \in A(v)} \langle v^*, \eta(v, u)\rangle + \alpha(v - u)$$

is weakly lower semicontinuous. Furthermore, since $T$ is a linear and compact
operator, we know that $u_n \rightharpoonup u$ implies $\hat u_n \to \hat u$ and so
$\limsup_{n \to \infty} J^\circ(\hat u_n; \hat v - \hat u_n) \le J^\circ(\hat u; \hat v - \hat u)$
by item (ii) of Lemma 4.1. Therefore, $u \mapsto \xi(v, u)$ is weakly lower
semicontinuous. By the fact that $T$ is linear and item (i) of Lemma 4.1, we conclude
that $v \mapsto J^\circ(\hat u; \hat v - \hat u)$ is convex. This, together with assumption
($H_\eta^2$), implies that $v \mapsto \zeta(v, u)$ is concave. Since $\zeta(v, v) = 0$ for all
$v \in X$, by Mosco's Alternative for $\mu = 0$, we conclude that there exists
$u_0 \in K_\phi$ such that

$$\xi(v, u_0) + \phi(u_0) - \phi(v) \le 0, \quad \text{for all } v \in X.$$

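A simple computation shows that, for each $w \in K$, we have

$$\langle w^*, \eta(w, u_0)\rangle + J^\circ(\hat u_0; \hat w - \hat u_0) + \phi(w) - \phi(u_0) \ge \alpha(w - u_0), \quad \text{for all } w^* \in A(w). \tag{4.23}$$

Let us fix $v \in K_\phi$ and define $w_\lambda = u_0 + \lambda(v - u_0)$ with
$\lambda \in (0, 1)$. By the convexity of $K_\phi$, we know that $w_\lambda \in K_\phi$. Then,
for each $w_\lambda^* \in A(w_\lambda)$, from (4.23), we have

$$\begin{aligned}
\alpha(\lambda(v - u_0))
&\le \langle w_\lambda^*, \eta(w_\lambda, u_0)\rangle + J^\circ(\hat u_0; \hat w_\lambda - \hat u_0) + \phi(w_\lambda) - \phi(u_0) \\
&\le \lambda \langle w_\lambda^*, \eta(v, u_0)\rangle + (1 - \lambda)\langle w_\lambda^*, \eta(u_0, u_0)\rangle + \lambda J^\circ(\hat u_0; \hat v - \hat u_0) \\
&\quad + (1 - \lambda) J^\circ(\hat u_0; \hat u_0 - \hat u_0) + \lambda \phi(v) + (1 - \lambda)\phi(u_0) - \phi(u_0) \\
&= \lambda[\langle w_\lambda^*, \eta(v, u_0)\rangle + J^\circ(\hat u_0; \hat v - \hat u_0) + \phi(v) - \phi(u_0)],
\end{aligned}$$

which leads to

$$\frac{\alpha(\lambda(v - u_0))}{\lambda} \le \langle w_\lambda^*, \eta(v, u_0)\rangle + J^\circ(\hat u_0; \hat v - \hat u_0) + \phi(v) - \phi(u_0). \tag{4.24}$$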

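For each $u_0^* \in A(u_0)$, combining the fact that $w_\lambda \to u_0$ as
$\lambda \downarrow 0$ with hypothesis ($H_A^2$) on $A$, we deduce that, for each
$\lambda \in (0, 1)$, we can find $w_\lambda^* \in A(w_\lambda)$ such that
$w_\lambda^* \rightharpoonup u_0^*$ as $\lambda \downarrow 0$. Taking the superior limit in
(4.24) as $\lambda \downarrow 0$ and keeping ($H_\alpha$) in mind, we get

$$\begin{aligned}
0 &\le \limsup_{\lambda \downarrow 0} \frac{\alpha(\lambda(v - u_0))}{\lambda} \\
&\le \limsup_{\lambda \downarrow 0}[\langle w_\lambda^*, \eta(v, u_0)\rangle + J^\circ(\hat u_0; \hat v - \hat u_0) + \phi(v) - \phi(u_0)] \\
&= \langle u_0^*, \eta(v, u_0)\rangle + J^\circ(\hat u_0; \hat v - \hat u_0) + \phi(v) - \phi(u_0) \\
&\le \langle u_0^*, \eta(v, u_0)\rangle + \phi(v) - \phi(u_0) + \int_\Omega j^\circ(x, \hat u_0(x); \hat v(x) - \hat u_0(x))\,dx \quad \text{(by (4.11))}.
\end{aligned} \tag{4.25}$$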


Therefore, we have

$$\forall u_0^* \in A(u_0):\ \langle u_0^*, \eta(v, u_0)\rangle + \phi(v) - \phi(u_0) + \int_\Omega j^\circ(x, \hat u_0(x); \hat v(x) - \hat u_0(x))\,dx \ge 0, \quad \forall v \in K_\phi.$$

If $v \in K \setminus D(\phi)$, then $\phi(v) = +\infty$ and thus the inequality above holds
automatically. This, together with the inequality above, shows that $u_0 \in K_\phi$ is a
solution of problem (P). Thus, the proof is complete.
Remark 4.3. Theorem 4.2 generalizes some recent results in the following
aspects:
(i) If $j$ is a constant on $\Omega \times \mathbb{R}^k$, then Theorem 4.2 reduces to the
corresponding result of Theorem 3.3 presented by Costea et al. (2012);
(ii) From the relation between the solutions of problems (P) and (4.2), we know that,
under the same assumptions as Theorem 4.2, problem (4.2) necessarily admits
at least one solution. In this case, if, in addition, $\eta(u, v) = u - v$ and
$\alpha(\cdot) = 0$, then this conclusion reduces to Theorem 2 of Costea and Lupu (2010);
(iii) If $A$ is single-valued, $\eta(u, v) = u - v$ and $\alpha(\cdot) = 0$, then Theorem 4.2
reduces to Theorem 2 of Motreanu and Rădulescu (2000);
(iv) If $A$ is single-valued, $\eta(u, v) = u - v$, $\phi = I_K$ and $\alpha(\cdot) = 0$, then
Theorem 4.2 reduces to Corollary 3.3 of Costea and Rădulescu (2009) (or see Theorem 2 of
Panagiotopoulos et al. 1999).
Let us turn our attention to the case when $K$ is an unbounded, closed and convex
subset of $X$. In order to establish existence results for problem (P), we need to
introduce the following coercivity conditions:
(C1) There exists $r_0 > 0$ such that, for each $u \in K_\phi \setminus K_{r_0}$, we can find
$v \in K_\phi$ with $\|v\| < \|u\|$ such that

$$\langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u) \le 0, \quad \text{for all } u^* \in A(u); \tag{4.26}$$

(C2) There exists $r_0 > 0$ such that, for each $u \in K_\phi \setminus K_{r_0}$, we can find
$v \in K_\phi$ with $\|v\| < \|u\|$ such that

$$\langle u^*, \eta(v, u)\rangle + \phi(v) - \phi(u) + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx < 0, \quad \text{for all } u^* \in A(u).$$

Remark 4.4. (i) It is obvious that the implication (C2) $\Rightarrow$ (C1) holds by (4.11).
(ii) The conditions (C1) and (C2) can be regarded as generalizations of some
coercivity conditions proposed recently by some authors. For example,
• If $j$ is a constant on $\Omega \times \mathbb{R}^k$, then condition (C1) reduces to the
coercivity condition of Theorem 3.5 presented by Costea et al. (2012);
• If $\eta(v, u) = v - u$, then conditions (C1) and (C2) reduce to conditions (B) and
(C) presented in Proposition 4.1 of Tang and Huang (2013b), respectively;
if, in addition, $\phi = I_K$, then they become conditions (B) and (C)
presented in Proposition 3.1 of Zhang and He (2011), respectively.
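The implication in item (i) of Remark 4.4 can be spelled out in two lines. The sketch below only uses (4.11), applied to $v - u$ in place of $v$, together with the linearity of $T$; it introduces no new assumptions.

```latex
% (4.11) applied to the pair (u, v - u), using T(v - u) = \hat v - \hat u:
J^\circ(\hat u; \hat v - \hat u)
  \le \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx .
% Hence, if (C2) holds for the pair (u, v), then for every u^* \in A(u):
\langle u^*, \eta(v, u) \rangle + \phi(v) - \phi(u) + J^\circ(\hat u; \hat v - \hat u)
  \le \langle u^*, \eta(v, u) \rangle + \phi(v) - \phi(u)
      + \int_\Omega j^\circ(x, \hat u(x); \hat v(x) - \hat u(x))\,dx
  < 0 ,
% which is (4.26), even with strict inequality.
```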
Theorem 4.3. Assume that all the assumptions of Theorem 4.2 hold except
the condition that $K$ is bounded. If, in addition, the condition (C1) holds
for the functional $J$ defined as in (4.10), then problem (P) admits at least one
solution.
Proof. Let us fix $r > r_0$. Applying (4.25) of Theorem 4.2, as $K_r$ is bounded, closed
and convex, we deduce that there exists $u_r \in K_r \cap D(\phi)$ such that

$$\forall u_r^* \in A(u_r):\ \langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + J^\circ(\hat u_r; \hat v - \hat u_r) \ge 0, \quad \forall v \in K_r. \tag{4.27}$$

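(i) If $\|u_r\| = r$, then $\|u_r\| > r_0$. By condition (C1), we can find $v_0 \in K_\phi$
with $\|v_0\| < \|u_r\|$ such that

$$\langle u_r^*, \eta(v_0, u_r)\rangle + \phi(v_0) - \phi(u_r) + J^\circ(\hat u_r; \hat v_0 - \hat u_r) \le 0, \quad \forall u_r^* \in A(u_r). \tag{4.28}$$

Let $v \in K_\phi$ be arbitrarily fixed. Since $\|v_0\| < \|u_r\| = r$, we know that there
exists $t \in (0, 1)$ such that $v_t := v_0 + t(v - v_0) \in K_r \cap D(\phi)$. Note that $T$ is
a linear mapping and $\phi$ is convex. It follows from (4.27), assumption ($H_\eta^2$) and
item (i) of Lemma 4.1 that

$$\begin{aligned}
0 &\le \langle u_r^*, \eta(v_t, u_r)\rangle + \phi(v_t) - \phi(u_r) + J^\circ(\hat u_r; \hat v_t - \hat u_r) \quad \text{(by (4.27))} \\
&\le t[\langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + J^\circ(\hat u_r; \hat v - \hat u_r)] \\
&\quad + (1 - t)[\langle u_r^*, \eta(v_0, u_r)\rangle + \phi(v_0) - \phi(u_r) + J^\circ(\hat u_r; \hat v_0 - \hat u_r)] \\
&\le t[\langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + J^\circ(\hat u_r; \hat v - \hat u_r)] \quad \text{(by (4.28))}.
\end{aligned} \tag{4.29}$$

Therefore, this together with $t \in (0, 1)$ implies that

$$\forall u_r^* \in A(u_r):\ \langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + J^\circ(\hat u_r; \hat v - \hat u_r) \ge 0, \quad \forall v \in K_\phi. \tag{4.30}$$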
(ii) If $\|u_r\| < r$, then for each $v \in K_\phi$, there is some $t \in (0, 1)$ such that
$v_t := u_r + t(v - u_r) \in K_r \cap D(\phi)$. Note that $T$ is a linear mapping and $\phi$
is a convex function. It follows from (4.27) and item (i) of Lemma 4.1 that

$$\begin{aligned}
0 &\le \langle u_r^*, \eta(v_t, u_r)\rangle + \phi(v_t) - \phi(u_r) + J^\circ(\hat u_r; \hat v_t - \hat u_r) \quad \text{(by (4.27))} \\
&\le t[\langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + J^\circ(\hat u_r; \hat v - \hat u_r)].
\end{aligned} \tag{4.31}$$

Therefore, this, together with $t \in (0, 1)$, shows that (4.30) also holds.
Since $J(\varphi) = \int_\Omega j(x, \varphi(x))\,dx$, by ($H_j$) and (4.11), we conclude that

$$\forall u_r^* \in A(u_r):\ \langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + \int_\Omega j^\circ(x, \hat u_r(x); \hat v(x) - \hat u_r(x))\,dx \ge 0, \quad \forall v \in K_\phi. \tag{4.32}$$

When $v \in K \setminus D(\phi)$, we have $\phi(v) = +\infty$ and thus the inequality in (4.32)
holds automatically. This fact, together with (4.30), shows that $u_r \in K_r \cap D(\phi)$ is a
solution of problem (P). This completes the proof.
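The existence of the parameter $t$ used in case (i) of the proof above can be made explicit. The following sketch gives one admissible choice; the formula for $t$ is our own and is only one of many that work. It assumes $v \ne v_0$, since for $v = v_0$ any $t \in (0, 1)$ will do.

```latex
% One admissible choice of t, assuming v \ne v_0 and \|v_0\| < r:
t := \min\Bigl\{ \tfrac{1}{2},\ \frac{r - \|v_0\|}{\|v - v_0\|} \Bigr\} \in (0, 1) ,
% and then, by the triangle inequality,
\|v_t\| = \|v_0 + t(v - v_0)\|
  \le \|v_0\| + t\,\|v - v_0\|
  \le \|v_0\| + (r - \|v_0\|) = r .
% Moreover v_t \in K \cap D(\phi), because K and D(\phi) are convex and v_0, v \in K_\phi.
```

The same estimate, with $u_r$ in place of $v_0$, covers case (ii), where $\|u_r\| < r$.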
Remark 4.5. (i) If $j$ is a constant on $\Omega \times \mathbb{R}^k$, then Theorem 4.3
reduces to the corresponding result of Theorem 3.5 due to Costea et al. (2012);
(ii) Compared with Theorem 4.2 of Tang and Huang (2013b), the problem considered
in the present chapter is more general and the condition regarding the
set-valued map $A$ is also different.
If the constraint set $K$ is bounded, then the solution set of problem (P) is
obviously bounded. In the case when the constraint set $K$ is unbounded, the
solution set of problem (P) may be unbounded. In the sequel, we provide a
sufficient condition for the boundedness of the solution set of problem (P) when
$K$ is unbounded. The following theorem also generalizes corresponding results of
Tang and Huang (2013b) and Zhang and He (2011).
Theorem 4.4. Assume that all the assumptions of Theorem 4.2 hold except the
condition that $K$ is bounded. If, in addition, the condition (C2) holds, then the
solution set of problem (P) is nonempty and bounded.
Proof. Since (C2) implies (C1), Theorem 4.3 shows that the solution set of problem (P)
is nonempty. Now we prove that the solution set of problem (P) is bounded. Suppose, on
the contrary, that the solution set is unbounded. Then, for any positive $r_0$, there
exists $u_0 \in K_\phi$ with $\|u_0\| > r_0$ such that

$$\forall u_0^* \in A(u_0):\ \langle u_0^*, \eta(v, u_0)\rangle + \phi(v) - \phi(u_0) + \int_\Omega j^\circ(x, \hat u_0(x); \hat v(x) - \hat u_0(x))\,dx \ge 0, \quad \forall v \in K. \tag{4.33}$$

Since $\|u_0\| > r_0$, by the condition (C2), we know that there exists $v_0 \in K_\phi$ with
$\|v_0\| < \|u_0\|$ such that

$$\forall u_0^* \in A(u_0):\ \langle u_0^*, \eta(v_0, u_0)\rangle + \phi(v_0) - \phi(u_0) + \int_\Omega j^\circ(x, \hat u_0(x); \hat v_0(x) - \hat u_0(x))\,dx < 0,$$

which contradicts (4.33). Thus, it follows that the solution set is bounded,
completing the proof.
Using a similar technique to the one used in Panagiotopoulos et al. (1999), Costea
(2011), and Tang and Huang (2013b), we can provide a necessary and sufficient
condition for problem (P) and get the following result.
Theorem 4.5. Let $T : X \to L^p(\Omega; \mathbb{R}^k)$ be a linear compact operator, where
$1 < p < \infty$, $k \ge 1$ and $\Omega$ is a bounded open set in $\mathbb{R}^N$. Let $K$ be a
nonempty, closed and convex subset of $X$. Assume that assumptions ($H_\eta^2$), ($H_\phi$)
and ($H_j$) hold. Then a necessary and sufficient condition for problem (P) to have a
solution is that there exists a constant $r > 0$ with the property that at least one
solution of the problem:
($P_r$) find $u_r \in K_r \cap D(\phi)$ such that

$$\forall u_r^* \in A(u_r):\ \langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + \int_\Omega j^\circ(x, \hat u_r(x); \hat v(x) - \hat u_r(x))\,dx \ge 0, \quad \forall v \in K_r, \tag{4.34}$$

satisfies the inequality $\|u_r\| < r$.


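Proof. The necessity is obvious.
Now we show the sufficiency. Suppose that there exists a solution $u_r$ of problem
($P_r$) with $\|u_r\| < r$. We shall prove that $u_r$ is a solution of problem (P). For any
fixed $v \in K$, since $\|u_r\| < r$, we can choose $\varepsilon > 0$ small enough such that
$w = u_r + \varepsilon(v - u_r)$ satisfies $\|w\| < r$. By assumption ($H_\eta^2$), we have

$$\langle u_r^*, \eta(w, u_r)\rangle \le \varepsilon \langle u_r^*, \eta(v, u_r)\rangle + (1 - \varepsilon)\langle u_r^*, \eta(u_r, u_r)\rangle = \varepsilon \langle u_r^*, \eta(v, u_r)\rangle. \tag{4.35}$$

It follows from item (i) of Lemma 4.1 and the linearity of $T$ that

$$\begin{aligned}
&\int_\Omega j^\circ(x, \hat u_r(x); \hat w(x) - \hat u_r(x))\,dx \\
&\quad \le \varepsilon \int_\Omega j^\circ(x, \hat u_r(x); \hat v(x) - \hat u_r(x))\,dx + (1 - \varepsilon) \int_\Omega j^\circ(x, \hat u_r(x); \hat u_r(x) - \hat u_r(x))\,dx \\
&\quad = \varepsilon \int_\Omega j^\circ(x, \hat u_r(x); \hat v(x) - \hat u_r(x))\,dx.
\end{aligned} \tag{4.36}$$

Applying (4.34) for $v = w$ and assumption ($H_\phi$) and combining (4.35) and (4.36),
one has

$$\forall u_r^* \in A(u_r):\ \varepsilon\Bigl[\langle u_r^*, \eta(v, u_r)\rangle + \phi(v) - \phi(u_r) + \int_\Omega j^\circ(x, \hat u_r(x); \hat v(x) - \hat u_r(x))\,dx\Bigr] \ge 0, \quad \forall v \in K.$$

Dividing by $\varepsilon > 0$, it follows that $u_r$ is a solution of problem (P). The proof is
complete.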
Remark 4.6. For a suitable choice of maps and functionals such as A;  and j, it is
easy to see that Theorem 4.5 can be reduced to Theorem 4.4 of Tang and Huang
(2013b) and Theorem 3 of Panagiotopoulos et al. (1999).

Acknowledgements This work was supported by the National Natural Science Foundation of
China (11171237), Guangxi Natural Science Foundation (2013GXNSFBA019015), Scientific
Research Foundation of Guangxi Department of Education (ZD2014045), Outstanding Young and
Middle-aged Backbone Teachers Training Project of Guangxi Colleges and Universities (Gui-Jiao-
Ren 2014-39) and Open Fund of Guangxi Key Laboratory of Hybrid Computation and IC Design
Analysis (HCIC201308).

References

Ansari QH, Yao JC (1999) A fixed point theorem and its applications to a system of variational
inequalities. Bull Aust Math Soc 59:433–442
Ansari QH, Yao JC (2001) Iterative schemes for solving mixed variational-like inequalities. J
Optim Theory Appl 108:527–541
Aubin JP, Clarke FH (1979) Shadow prices and duality for a class of optimal control problems.
SIAM J Control Optim 17:567–586
Bai MR, Zhou SZ, Ni GY (2006) Variational-like inequalities with relaxed η-α pseudomonotone
mappings in Banach spaces. Appl Math Lett 19:547–554
Carl S (2001) Existence of extremal solutions of boundary hemivariational inequalities. J Differ
Eqn 171:370–396
Carl S, Le VK, Motreanu D (2005) Existence and comparison principles for general quasilinear
variational-hemivariational inequalities. J Math Anal Appl 302:65–83
Carl S, Le VK, Motreanu D (2007) Nonsmooth variational problems and their inequalities,
comparison principles and applications. Springer, New York
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York
Costea N, Rădulescu V (2009) Existence results for hemivariational inequalities involving relaxed
η-α monotone mappings. Comm Appl Anal 13:293–304
Costea N, Rădulescu V (2010) Hartman-Stampacchia results for stably pseudomonotone operators
and nonlinear hemivariational inequalities. Appl Anal 89:175–188
Costea N, Lupu C (2010) On a class of variational-hemivariational inequalities involving set valued
mappings. Adv Pure Appl Math 1:233–246
Costea N (2011) Existence and uniqueness results for a class of quasi-hemivariational inequalities.
J Math Anal Appl 373:305–315
Costea N, Ion DA, Lupu C (2012) Variational-like inequality problems involving set-valued maps
and generalized monotonicity. J Optim Theory Appl 155:79–99
Costea N, Rădulescu V (2012) Inequality problems of quasi-hemivariational type involving set-
valued operators and a nonlinear term. J Glob Optim 52:743–756
Fang YP, Huang NJ (2003) Variational-like inequalities with generalized monotone mappings in
Banach spaces. J Optim Theory Appl 118:327–338
Goeleven D, Motreanu D, Panagiotopoulos D (1998) Eigenvalue problems for variational-
hemivariational inequalities at resonance. Nonlinear Anal 33:161–180
Liu ZH (2008) Existence results for quasilinear parabolic hemivariational inequalities. J Differ Eqn
244:1395–1409
Migórski S, Ochal A (2004) Boundary hemivariational inequality of parabolic type. Nonlinear
Anal 57:579–596
Mosco U (1976) Implicit variational problems and quasi-variational inequalities. In: Gossez JP,
Lami Dozo EJ, Mawhin J, Waelbroeck L (eds) Nonlinear operators and the calculus of variations.
Lecture notes in mathematics, vol 543. Springer, Berlin, pp 83–156
Motreanu D, Rădulescu V (2000) Existence results for inequality problems with lack of convexity.
Numer Funct Anal Optim 21:869–884
Motreanu D, Rădulescu V (2003) Variational and non-variational methods in nonlinear analysis
and boundary value problems. Kluwer Academic, Boston/Dordrecht/London
Naniewicz Z, Panagiotopoulos PD (1995) Mathematical theory of hemivariational inequalities and
applications. Marcel Dekker, New York
Panagiotopoulos PD (1983) Nonconvex energy functions, hemivariational inequalities and
substationarity principles. Acta Mech 42:160–183
Panagiotopoulos PD (1985) Inequality problems in mechanics and applications, convex and
nonconvex energy functions. Birkhäuser, Basel
Panagiotopoulos PD (1991) Coercive and semicoercive hemivariational inequalities. Nonlinear
Anal 16:209–231
Panagiotopoulos PD (1993) Hemivariational inequalities, applications in mechanics and
engineering. Springer, Berlin
Panagiotopoulos PD, Fundo M, Rădulescu V (1999) Existence theorems of Hartman-Stampacchia
type for hemivariational inequalities and applications. J Glob Optim 15:41–54
Parida J, Sahoo M, Kumar A (1989) A variational-like inequality problem. Bull Aust Math Soc
39:225–231
Park JY, Ha TG (2008) Existence of antiperiodic solutions for hemivariational inequalities.
Nonlinear Anal 68:747–767
Park JY, Ha TG (2009) Existence of anti-periodic solutions for quasilinear parabolic hemivaria-
tional inequalities. Nonlinear Anal 71:3203–3217
Tang GJ, Huang NJ (2013a) Gap functions and global error bounds for set-valued mixed variational
inequalities. Taiwan J Math 17:1267–1286
Tang GJ, Huang NJ (2013b) Existence theorems of the variational-hemivariational inequalities. J
Glob Optim 56:605–622
Tang GJ, Huang NJ (2014) Strong convergence of an inexact projected subgradient method for
mixed variational inequalities. Optimization 63:601–615
Tang GJ, Wang X, Wang ZB (2014) Existence of variational quasi-hemivariational inequalities
involving a set-valued operator and a nonlinear term. Optim Lett. doi:10.1007/s11590-014-
0739-5
Xiao YB, Huang NJ (2008) Generalized quasi-variational-like hemivariational inequalities.
Nonlinear Anal 69:637–646
Xiao YB, Huang NJ (2009) Sub-super-solution method for a class of higher order evolution
hemivariational inequalities. Nonlinear Anal 71:558–570
Xiao YB, Yang XM, Huang NJ (2014) Some equivalence results for well-posedness of hemivaria-
tional inequalities. J Glob Optim. doi:10.1007/s10898-014-0198-7
Zhang YL, He YR (2011) On stably quasimonotone hemivariational inequalities. Nonlinear Anal
74:3324–3332
Chapter 5
An Iterative Algorithm for Split Common
Fixed-Point Problem for Demicontractive
Mappings

Yazheng Dang, Fanwen Meng, and Jie Sun

Abstract Inspired by the inertial proximal algorithms for finding a zero of a
maximal monotone operator, we propose an inertial iteration algorithm for solving
the split common fixed point problem for demicontractive mappings. We prove
the asymptotic convergence of the algorithm under certain mild conditions.
The results extend those of Dang and Gao (Inverse Probl 27:015007, 2011)
and Moudafi (Inverse Probl 26:055007, 6pp, 2010. doi:10.1088/0266-5611/26/5/055007).

Keywords Split common fixed point problem • Inertial technique • Demicontractive
mapping • Asymptotic convergence

5.1 Introduction

Consider the convex feasibility problem (CFP) (Chinneck 2004), which is to find a
common point in the intersection of finitely many convex sets. CFP has extensive
applications in many areas such as approximation theory (Deutsch 1992), image
reconstruction from projections (Censor 1998; Herman 1980), optimal control (Gao
2009), and so on. A popular approach to the CFP is the so-called projection

Y. Dang ()
College of Computer Science and Technology (Software College), Henan Polytechnic University,
454000, Jiaozuo, People’s Republic of China
School of Management, University of Shanghai for Science and Technology, 200093, Shanghai,
People’s Republic of China.
e-mail: jgdyz@163.com
F. Meng
National Healthcare Group, Singapore City, Singapore
e-mail: fanwen_meng@nhg.com.sg
J. Sun
Department of Mathematics and Statistics, Curtin University, 6102, Bentley, WA, Australia
e-mail: jie.sun@curtin.edu.au

© Springer-Verlag Berlin Heidelberg 2015 85


H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_5
algorithm which employs orthogonal projection onto a set, see Bauschke and
Borwein (1996). An important special case of CFP is the split feasibility problem
(SFP), which deals with the case of finding a point in both the domain and the range
of a given linear operator. Namely, SFP is to find a point x satisfying

$$x \in C, \quad Ax \in Q, \tag{5.1}$$

where $C$ and $Q$ are nonempty convex subsets of $H_1$ and $H_2$, respectively,
$A: H_1 \to H_2$ is a linear operator, and $H_1, H_2$ are real Hilbert spaces. The SFP
was originally introduced in Censor and Elfving (1994) and can be applied to
image reconstruction, signal processing, and radiation therapy, for example. Many
projection methods have been developed for solving the SFP, see Byrne (2004),
Censor et al. (2005), Dang and Gao (2011), Qu and Xiu (2005, 2008), and Yang
(2004). In Byrne (2002), Byrne introduced the so-called CQ algorithm, which takes
an arbitrary initial point $x^0$ and computes the iterative step as:

$$x^{k+1} = P_C\big[(I - \gamma A^T (I - P_Q) A)(x^k)\big], \tag{5.2}$$

where $P_C$ denotes the usual orthogonal projection onto $C$, that is,
$P_C(x) = \arg\min_{y \in C} \|x - y\|$ for any $x \in H_1$; $0 < \gamma < 2/\rho(A^T A)$,
and $\rho(A^T A)$ is the spectral radius of $A^T A$.
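To make the CQ iteration (5.2) concrete, the following minimal Python sketch applies it to a small finite-dimensional instance. The sets (a box for $C$, a Euclidean ball for $Q$), the matrix `A`, the step size and all function names are assumptions chosen purely for illustration; they are not taken from Byrne (2002).

```python
import math

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def project_box(x, lo=0.0, hi=1.0):
    """Orthogonal projection onto the box [lo, hi]^n (illustrative choice of C)."""
    return [min(max(xi, lo), hi) for xi in x]

def project_ball(y, center=(1.0, 1.0), radius=0.5):
    """Orthogonal projection onto a closed Euclidean ball (illustrative choice of Q)."""
    d = [yi - ci for yi, ci in zip(y, center)]
    norm = math.sqrt(sum(di * di for di in d))
    if norm <= radius:
        return list(y)
    return [ci + radius * di / norm for ci, di in zip(center, d)]

def cq_step(x, A, gamma, proj_C, proj_Q):
    """One CQ step: x_next = P_C(x - gamma * A^T (I - P_Q) A x)."""
    Ax = matvec(A, x)
    residual = [a - p for a, p in zip(Ax, proj_Q(Ax))]   # (I - P_Q) A x
    grad = matvec(transpose(A), residual)                # A^T (I - P_Q) A x
    return proj_C([xi - gamma * gi for xi, gi in zip(x, grad)])

A = [[2.0, 0.0], [0.0, 1.0]]   # A^T A = diag(4, 1), so its spectral radius is 4
gamma = 0.4                    # lies in (0, 2/4)
x = [0.0, 0.0]
for _ in range(200):
    x = cq_step(x, A, gamma, project_box, project_ball)

Ax = matvec(A, x)
dist_to_Q = math.dist(Ax, project_ball(Ax))   # how far A x is from Q
```

Each step uses only the current iterate `x`, which is the feature the inertial variant discussed later modifies.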
Another algorithm, the KM algorithm, was initially proposed for solving fixed-point
problems (Crombez 2005). Byrne (2004) first applied the KM iteration to the
CQ algorithm for solving the SFP. Subsequently, Zhao and Yang (2005) applied the
KM iteration to a perturbed CQ algorithm, and Dang and Gao (2011) combined the
KM iterative method with the modified CQ algorithm to construct a KM-CQ-like
algorithm for solving the SFP. All these algorithms use only the current iterate to
compute the next one, so they tend to converge slowly in practice.
The problem of finding a zero of a maximal monotone operator $G$ in the Euclidean
space $\mathbb{R}^N$ is

$$\text{find } x \in \mathbb{R}^N \text{ such that } 0 \in G(x).$$

One of the fundamental approaches to solving it is the proximal method, which
generates the next iterate $x^{k+1}$ by solving the subproblem

$$0 \in \lambda_k G(x) + (x - x^k), \tag{5.3}$$

where $x^k$ is the current iterate and $\lambda_k$ is a regularization parameter. In 2001,
Attouch and Alvarez applied the inertial technique to the above algorithm (5.3)
to obtain an inertial proximal method for finding a zero of
a maximal monotone operator. It works as follows. Given $x^{k-1}, x^k \in \mathbb{R}^N$ and two
parameters $\theta_k \in [0, 1)$, $\lambda_k > 0$, find $x^{k+1} \in \mathbb{R}^N$ such that

$$0 \in \lambda_k G(x^{k+1}) + x^{k+1} - x^k - \theta_k (x^k - x^{k-1}). \tag{5.4}$$

Here, the inertia is induced by the term $\theta_k (x^k - x^{k-1})$.
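The inclusion (5.4) says that the new iterate is the resolvent $(I + \lambda_k G)^{-1}$ applied to the extrapolated point $x^k + \theta_k (x^k - x^{k-1})$. A minimal Python sketch of this reading follows, under the assumptions that the parameters are constant and that $G(x) = a x$ on the real line with $a > 0$ (so the resolvent has a closed form); the function names and numerical values are hypothetical.

```python
def inertial_proximal(resolvent, x0, theta, lam, steps):
    """Iterate (5.4) with constant parameters: x_next = J(x + theta * (x - x_prev)),
    where J = (I + lam * G)^{-1} is supplied by the caller."""
    x_prev, x = x0, x0            # assumption: start with x^{-1} = x^0
    for _ in range(steps):
        y = x + theta * (x - x_prev)       # inertial extrapolation
        x_prev, x = x, resolvent(y, lam)   # proximal (implicit) step
    return x

a = 2.0   # G(x) = a * x is maximal monotone on R, with unique zero x = 0

def resolvent_linear(y, lam):
    """Solve 0 = lam * a * x + x - y for x."""
    return y / (1.0 + lam * a)

x_final = inertial_proximal(resolvent_linear, x0=5.0, theta=0.3, lam=1.0, steps=60)
```

Setting `theta=0.0` recovers the plain proximal iteration (5.3).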



It is well known that the proximal iteration (5.3) may be interpreted as an implicit
one-step discretization method for the evolution differential inclusion

$$0 \in \frac{dx}{dt}(t) + G(x(t)) \quad \text{a.e. } t \ge 0, \tag{5.5}$$

where a.e. stands for almost everywhere. The inspiration for (5.4), in turn, comes from
the implicit discretization of the differential system of second order in time,
namely

$$0 \in \frac{d^2 x}{dt^2}(t) + \gamma \frac{dx}{dt}(t) + G(x(t)) \quad \text{a.e. } t \ge 0, \tag{5.6}$$

where $\gamma > 0$ is a damping or friction parameter. This gives rise to various numerical
methods (for monotone inclusions and fixed-point problems) related to the inertial
terminology (first introduced in Alvarez and Attouch 2001). All these methods,
like (5.4), achieve nice convergence properties (Alvarez 2000, 2004; Alvarez and
Attouch 2001; Mainge 2007, 2008) by incorporating second order information.
Inspired by the inertial proximal point algorithm for finding zeros of a maximal
monotone operator, in this paper, we apply the inertial technique to the algorithm
presented by Moudafi in 2010 to propose an inertial iterative algorithm for solving
the split common fixed-point problem for demicontractive mappings. Under some
suitable conditions, its asymptotic convergence is proved.
The paper is organized as follows. In Sect. 5.2, we recall some preliminaries.
In Sect. 5.3, we present an inertial iterative algorithm and show its convergence.
Section 5.4 summarizes the paper by making some concluding remarks.

5.2 Preliminaries

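Throughout the rest of the paper, $I$ denotes the identity operator and $\mathrm{Fix}(T)$ denotes the
set of fixed points of an operator $T$, i.e., $\mathrm{Fix}(T) := \{x \mid x = T(x)\}$.
An operator $T: H \to H$ is called demicontractive (see for example Maruster and
Popirlan 2008) if there exists a constant $\beta \in [0, 1)$ such that

$$\|Tx - z\|^2 \le \|x - z\|^2 + \beta \|x - Tx\|^2, \quad \forall (x, z) \in H \times \mathrm{Fix}(T), \tag{5.7}$$

which is equivalent to

$$\langle x - Tx, x - z \rangle \ge \frac{1 - \beta}{2} \|x - Tx\|^2, \quad \forall (x, z) \in H \times \mathrm{Fix}(T) \tag{5.8}$$

and

$$\langle x - T(x), z - T(x) \rangle \le \frac{1 + \beta}{2} \|x - T(x)\|^2, \quad \forall (x, z) \in H \times \mathrm{Fix}(T). \tag{5.9}$$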
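The equivalence of (5.7), (5.8) and (5.9) can be checked by expanding a square; the short derivation below is a sketch using only the definitions above. For $(x, z) \in H \times \mathrm{Fix}(T)$,

$$\|Tx - z\|^2 = \|(x - z) - (x - Tx)\|^2 = \|x - z\|^2 - 2\langle x - Tx, x - z \rangle + \|x - Tx\|^2,$$

so (5.7) holds if and only if $\|x - Tx\|^2 - 2\langle x - Tx, x - z \rangle \le \beta \|x - Tx\|^2$, which is (5.8). Writing $z - Tx = (z - x) + (x - Tx)$ gives

$$\langle x - Tx, z - Tx \rangle = \|x - Tx\|^2 - \langle x - Tx, x - z \rangle,$$

so (5.8) is in turn equivalent to (5.9).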
An operator $T: H \to H$ is called
(i) nonexpansive if $\|Tx - Ty\| \le \|x - y\|$ for all $(x, y) \in H \times H$;
(ii) quasi-nonexpansive if $\|Tx - z\| \le \|x - z\|$ for all $(x, z) \in H \times \mathrm{Fix}(T)$;
(iii) strictly pseudocontractive if, for some $\beta \in [0, 1)$,

$$\|Tx - Ty\|^2 \le \|x - y\|^2 + \beta \|x - y - (Tx - Ty)\|^2 \quad \text{for all } (x, y) \in H \times H.$$

Let us also recall that $T$ is called demi-closed at the origin if, for any sequence
$\{x^k\} \subset H$ and $x^* \in H$, we have

$$x^k \to x^* \text{ weakly and } (I - T)(x^k) \to 0 \text{ strongly} \ \Rightarrow \ x^* \in \mathrm{Fix}(T).$$

In the following, an operator satisfying (5.7) will be called a $\beta$-demicontractive
mapping. Obviously, the class of demicontractive mappings contains quasi-nonexpansive
mappings and strictly pseudocontractive mappings with fixed points.
It is well known that nonexpansive operators, which are both
quasi-nonexpansive and strictly pseudocontractive mappings, are demi-closed.
The following lemmas are important for the convergence analysis in the next
section.
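Lemma 5.1. Let $T_\alpha := (1 - \alpha) I + \alpha T$, where $\alpha \in (0, 1]$ and $T$ is a $\beta$-demicontractive
self-mapping on $H$ with $\mathrm{Fix}(T) \ne \emptyset$. Then, for all $(x, z) \in H \times \mathrm{Fix}(T)$,

$$\|T_\alpha x - z\|^2 \le \|x - z\|^2 - \alpha (1 - \beta - \alpha) \|Tx - x\|^2. \tag{5.10}$$

Proof. For an arbitrary element $(x, z) \in H \times \mathrm{Fix}(T)$, we have

$$\|T_\alpha x - z\|^2 = \|x - z\|^2 - 2\alpha \langle x - z, x - Tx \rangle + \alpha^2 \|Tx - x\|^2,$$

which, according to (5.8), yields

$$\|T_\alpha x - z\|^2 \le \|x - z\|^2 - \alpha (1 - \beta - \alpha) \|Tx - x\|^2.$$

From Lemma 5.1, it is easy to see that $T_\alpha$ is quasi-nonexpansive if
$\alpha \in (0, 1 - \beta]$, and that $\mathrm{Fix}(T) = \mathrm{Fix}(T_\alpha)$ if $\mathrm{Fix}(T) \ne \emptyset$. Hence, $\mathrm{Fix}(T)$ is then a closed
convex subset of $H$.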
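As a sanity check on Lemma 5.1, the following Python sketch evaluates both sides of (5.10) for one concrete map. The choice $T(x) = -2x$ on the real line is an assumption made for illustration: it satisfies (5.7) with $\beta = 1/3$ and $\mathrm{Fix}(T) = \{0\}$, but it is not quasi-nonexpansive. For this particular map the two sides of (5.10) coincide, so the computed gaps are zero up to rounding.

```python
def T(x):
    """Illustrative demicontractive map T(x) = -2x on the real line; Fix(T) = {0}."""
    return -2.0 * x

def T_alpha(x, alpha):
    """Relaxed operator T_alpha = (1 - alpha) I + alpha T."""
    return (1.0 - alpha) * x + alpha * T(x)

beta = 1.0 / 3.0   # smallest constant for which (5.7) holds for this T
z = 0.0            # the unique fixed point of T

def lemma_gap(x, alpha):
    """Right-hand side of (5.10) minus its left-hand side (should be >= 0)."""
    lhs = (T_alpha(x, alpha) - z) ** 2
    rhs = (x - z) ** 2 - alpha * (1.0 - beta - alpha) * (T(x) - x) ** 2
    return rhs - lhs

gaps = [lemma_gap(x, a)
        for x in (-3.0, -0.5, 1.0, 4.0)
        for a in (0.1, 0.5, 2.0 / 3.0, 1.0)]
```

With $\alpha = 1 - \beta = 2/3$ the relaxed map becomes $T_\alpha(x) = -x$, which does not increase the distance to the fixed point even though $T$ itself doubles it.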
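Lemma 5.2 (Mainge 2008). Assume $\varphi_k \in [0, \infty)$ and $\delta_k \in [0, \infty)$ satisfy:
(1) $\varphi_{k+1} - \varphi_k \le \theta_k (\varphi_k - \varphi_{k-1}) + \delta_k$;
(2) $\sum_{k=1}^{+\infty} \delta_k < \infty$;
(3) $\{\theta_k\} \subset [0, \theta]$, where $\theta \in [0, 1)$.
Then, the sequence $\{\varphi_k\}$ is convergent with $\sum_{k=1}^{+\infty} [\varphi_{k+1} - \varphi_k]_+ < \infty$, where $[t]_+ :=
\max\{t, 0\}$ for any $t \in \mathbb{R}$.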
5.3 The Inertial Algorithm and Its Asymptotic Convergence

In what follows, we will focus our attention on the following general two-operator
split common fixed-point problem (SCFP):

$$\text{find } x^* \in C \text{ such that } Ax^* \in Q, \tag{5.11}$$

where $A: H_1 \to H_2$ is a bounded linear operator, and $U: H_1 \to H_1$ and $T: H_2 \to H_2$
are two demicontractive operators with nonempty fixed-point sets $\mathrm{Fix}(U) = C$ and
$\mathrm{Fix}(T) = Q$. Denote the solution set of the two-operator SCFP by

$$\Gamma = \{y \in C \mid Ay \in Q\}.$$

5.3.1 The Inertial Algorithm

Now, we give a description of the inertial algorithm and then present its asymptotic
convergence.

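Algorithm 5.1
Initialization: Let $x^0 \in H_1$ be arbitrary.
Iterative step: For $k \in \mathbb{N}$, set $u = I + \gamma A^T (T - I) A$, and let

$$y^k = x^k + \theta_k (x^k - x^{k-1}),$$

$$x^{k+1} = (1 - \alpha_k) u(y^k) + \alpha_k U(u(y^k)),$$

where $\alpha_k \in (0, 1)$, $\gamma \in (0, \frac{1 - \mu}{\lambda})$ with $\lambda$ being the spectral radius of the operator $A^T A$
and $\mu$ the demicontractivity constant of $T$, and $\theta_k \in [0, 1)$.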
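The iterative step of Algorithm 5.1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the spaces are finite-dimensional, $U$ and $T$ are taken to be orthogonal projections onto a box and a ball (projections are quasi-nonexpansive, hence demicontractive with constant 0), $\alpha_k$ is held constant, $x^{-1} = x^0$, and $\theta_k$ is set to $\min\{\theta, 1/(k\|x^k - x^{k-1}\|)^2\}$, which is the largest value allowed by the bound in Theorem 5.1. All function names are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def project_box(x, lo=0.0, hi=1.0):
    """Illustrative U: projection onto the box [lo, hi]^n, so Fix(U) = C is the box."""
    return [min(max(xi, lo), hi) for xi in x]

def project_ball(y, center=(1.0, 1.0), radius=0.5):
    """Illustrative T: projection onto a closed ball, so Fix(T) = Q is the ball."""
    d = [yi - ci for yi, ci in zip(y, center)]
    norm = math.sqrt(sum(di * di for di in d))
    if norm <= radius:
        return list(y)
    return [ci + radius * di / norm for ci, di in zip(center, d)]

def inertial_scfp(A, U, T, x0, gamma, alpha, theta, iters):
    """Sketch of Algorithm 5.1 with constant alpha_k = alpha."""
    At = transpose(A)
    x_prev, x = list(x0), list(x0)          # assumption: x^{-1} = x^0
    for k in range(1, iters + 1):
        ks = k * math.dist(x, x_prev)
        # theta_k = min{theta, 1 / (k * ||x^k - x^{k-1}||)^2}, written to avoid dividing by 0
        theta_k = theta if ks * math.sqrt(theta) <= 1.0 else 1.0 / ks ** 2
        y = [xi + theta_k * (xi - pi) for xi, pi in zip(x, x_prev)]
        Ay = matvec(A, y)
        corr = matvec(At, [t - a for t, a in zip(T(Ay), Ay)])   # A^T (T - I) A y
        u = [yi + gamma * ci for yi, ci in zip(y, corr)]        # u(y^k)
        Uu = U(u)
        x_prev, x = x, [(1.0 - alpha) * ui + alpha * wi for ui, wi in zip(u, Uu)]
    return x

A = [[2.0, 0.0], [0.0, 1.0]]   # spectral radius of A^T A is 4
x = inertial_scfp(A, project_box, project_ball, x0=[3.0, -2.0],
                  gamma=0.2, alpha=0.5, theta=0.5, iters=300)
Ax = matvec(A, x)
residual_C = math.dist(x, project_box(x))      # distance of x to C
residual_Q = math.dist(Ax, project_ball(Ax))   # distance of A x to Q
```

Both residuals should be close to zero after the run, indicating an approximate solution of the split problem for this toy instance.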

5.3.2 Asymptotic Convergence of the Inertial Algorithm

In this subsection, we establish the asymptotic convergence of Algorithm 5.1.


Lemma 5.3 (Opial 1967). Let $H$ be a Hilbert space and let $\{x^k\}$ be a sequence in
$H$ such that there exists a nonempty set $S \subset H$ satisfying:
(1) For every $x^* \in S$, $\lim_k \|x^k - x^*\|$ exists.
(2) Any weak cluster point of the sequence $\{x^k\}$ belongs to $S$.
Then, there exists $z \in S$ such that $\{x^k\}$ weakly converges to $z$.
Theorem 5.1. Given a bounded linear operator $A: H_1 \to H_2$, let $U: H_1 \to H_1$
be a $\beta$-demicontractive operator with nonempty $\mathrm{Fix}(U) = C$, and let $T: H_2 \to H_2$ be a
$\mu$-demicontractive operator with nonempty $\mathrm{Fix}(T) = Q$. Assume that $U - I$ and $T - I$
are demiclosed at 0. If $\Gamma \ne \emptyset$, then any sequence $\{x^k\}$ generated by Algorithm 5.1
weakly converges to a split common fixed point $x^* \in \Gamma$, provided that we choose the
parameter $\theta_k$ satisfying $\theta_k \in [0, \bar\theta_k]$ with $\bar\theta_k := \min\{\theta, 1/(k\|x^k - x^{k-1}\|)^2\}$, $\theta \in [0, 1)$,
$\gamma \in (0, \frac{1 - \mu}{\lambda})$ and $\alpha_k \in (\delta, 1 - \beta - \delta)$ for a small enough $\delta > 0$.
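Proof. Taking $z \in \Gamma$ and using (5.10), we obtain

$$\begin{aligned} \|x^{k+1} - z\|^2 &= \|(1 - \alpha_k) u(y^k) + \alpha_k U(u(y^k)) - z\|^2 \\ &\le \|u(y^k) - z\|^2 - \alpha_k (1 - \beta - \alpha_k) \|U(u(y^k)) - u(y^k)\|^2. \end{aligned} \tag{5.12}$$

On the other hand, we have

$$\begin{aligned} \|u(y^k) - z\|^2 &= \|y^k + \gamma A^T (T - I)(Ay^k) - z\|^2 \\ &= \|y^k - z\|^2 + \gamma^2 \|A^T (T - I)(Ay^k)\|^2 + 2\gamma \langle y^k - z, A^T (T - I)(Ay^k) \rangle \\ &\le \|y^k - z\|^2 + \lambda \gamma^2 \|(T - I)(Ay^k)\|^2 + 2\gamma \langle Ay^k - Az, (T - I)(Ay^k) \rangle, \end{aligned}$$

that is,

$$\|u(y^k) - z\|^2 \le \|y^k - z\|^2 + \lambda \gamma^2 \|(T - I)(Ay^k)\|^2 + 2\gamma \langle Ay^k - Az, (T - I)(Ay^k) \rangle. \tag{5.13}$$

Setting $\Theta := 2\gamma \langle Ay^k - Az, (T - I)(Ay^k) \rangle$, from (5.9), we get

$$\begin{aligned} \Theta &= 2\gamma \langle Ay^k - Az, (T - I)(Ay^k) \rangle \\ &= 2\gamma \langle Ay^k - Az + (T - I)(Ay^k) - (T - I)(Ay^k), (T - I)(Ay^k) \rangle \\ &= 2\gamma \big( \langle T(Ay^k) - Az, (T - I)(Ay^k) \rangle - \|(T - I)(Ay^k)\|^2 \big) \\ &\le 2\gamma \Big( \frac{1 + \mu}{2} \|(T - I)(Ay^k)\|^2 - \|(T - I)(Ay^k)\|^2 \Big) \\ &= -\gamma (1 - \mu) \|(T - I)(Ay^k)\|^2. \end{aligned}$$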

Combining the inequality above with (5.12) and (5.13) yields

$$\|x^{k+1} - z\|^2 \le \|y^k - z\|^2 - \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 - \alpha_k (1 - \beta - \alpha_k) \|U(u(y^k)) - u(y^k)\|^2. \tag{5.14}$$

Define the auxiliary real sequence $\varphi_k := \frac{1}{2} \|x^k - z\|^2$. Therefore, from (5.14), we
have

$$\varphi_{k+1} \le \frac{1}{2} \|y^k - z\|^2 - \frac{1}{2} \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 - \frac{1}{2} \alpha_k (1 - \beta - \alpha_k) \|U(u(y^k)) - u(y^k)\|^2. \tag{5.15}$$
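Then we have

$$\begin{aligned} \frac{1}{2} \|y^k - z\|^2 &= \frac{1}{2} \|x^k + \theta_k (x^k - x^{k-1}) - z\|^2 \\ &= \frac{1}{2} \|x^k - z\|^2 + \theta_k \langle x^k - z, x^k - x^{k-1} \rangle + \frac{\theta_k^2}{2} \|x^k - x^{k-1}\|^2 \\ &= \varphi_k + \theta_k \langle x^k - z, x^k - x^{k-1} \rangle + \frac{\theta_k^2}{2} \|x^k - x^{k-1}\|^2. \end{aligned}$$

It is easy to verify that $\varphi_k = \varphi_{k-1} + \langle x^k - z, x^k - x^{k-1} \rangle - \frac{1}{2} \|x^k - x^{k-1}\|^2$. Hence

$$\frac{1}{2} \|y^k - z\|^2 = \varphi_k + \theta_k (\varphi_k - \varphi_{k-1}) + \frac{\theta_k + \theta_k^2}{2} \|x^k - x^{k-1}\|^2. \tag{5.16}$$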
2 2
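Putting (5.16) into (5.15), we get

$$\begin{aligned} \varphi_{k+1} \le{} & \varphi_k + \theta_k (\varphi_k - \varphi_{k-1}) + \frac{\theta_k + \theta_k^2}{2} \|x^k - x^{k-1}\|^2 \\ & - \frac{1}{2} \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 - \frac{1}{2} \alpha_k (1 - \beta - \alpha_k) \|U(u(y^k)) - u(y^k)\|^2. \end{aligned}$$

By the assumption on $\alpha_k$, we have

$$\begin{aligned} \varphi_{k+1} \le{} & \varphi_k + \theta_k (\varphi_k - \varphi_{k-1}) + \frac{\theta_k + \theta_k^2}{2} \|x^k - x^{k-1}\|^2 \\ & - \frac{1}{2} \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 - \frac{1}{2} \delta^2 \|U(u(y^k)) - u(y^k)\|^2. \end{aligned} \tag{5.17}$$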
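Since $\gamma \in (0, \frac{1 - \mu}{\lambda})$ and $\theta_k^2 \le \theta_k$, from (5.17), we derive

$$\varphi_{k+1} \le \varphi_k + \theta_k (\varphi_k - \varphi_{k-1}) + \theta_k \|x^k - x^{k-1}\|^2. \tag{5.18}$$

From the assumption on $\theta_k$, we have

$$\theta_k \|x^k - x^{k-1}\|^2 \le \frac{1}{k^2},$$

and

$$\sum_{k=1}^{+\infty} \theta_k \|x^k - x^{k-1}\|^2 < \infty. \tag{5.19}$$

Let $\delta_k := \theta_k \|x^k - x^{k-1}\|^2$ in Lemma 5.2. We deduce that the sequence $\{\|x^k - z\|\}$
is convergent (hence $\{x^k\}$ is bounded) with $\sum_{k=1}^{+\infty} [\|x^k - z\|^2 - \|x^{k-1} - z\|^2]_+ < \infty$.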
k

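From (5.17), we have

$$\frac{1}{2} \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 \le \varphi_k - \varphi_{k+1} + \theta_k (\varphi_k - \varphi_{k-1}) + \theta_k \|x^k - x^{k-1}\|^2,$$

and

$$\frac{1}{2} \delta^2 \|U(u(y^k)) - u(y^k)\|^2 \le \varphi_k - \varphi_{k+1} + \theta_k (\varphi_k - \varphi_{k-1}) + \theta_k \|x^k - x^{k-1}\|^2.$$

Hence,

$$\sum_{k=1}^{+\infty} \frac{1}{2} \gamma (1 - \mu - \lambda\gamma) \|(T - I)(Ay^k)\|^2 < \infty$$

and

$$\sum_{k=1}^{+\infty} \delta^2 \|U(u(y^k)) - u(y^k)\|^2 < \infty.$$

Therefore,

$$\|(T - I)(Ay^k)\|^2 \to 0 \tag{5.20}$$

and

$$\|U(u(y^k)) - u(y^k)\|^2 \to 0. \tag{5.21}$$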

Suppose that $x^*$ is a weak cluster point of $\{x^k\}$, and let $\{x^{k_l}\}$ be a subsequence of $\{x^k\}$
converging weakly to $x^*$. Since $\|y^k - x^k\|^2 = \theta_k^2 \|x^k - x^{k-1}\|^2 \le \theta_k \|x^k - x^{k-1}\|^2 \le 1/k^2 \to 0$, we have

$$\text{w-}\lim_{l} y^{k_l} = \text{w-}\lim_{l} x^{k_l} = x^*. \tag{5.22}$$

Then, from (5.20) and the demiclosedness of $T - I$ at 0, we obtain

$$T(Ax^*) = Ax^*, \tag{5.23}$$

from which it follows that $Ax^* \in Q$.
Now, setting $u^k = y^k + \gamma A^T (T - I)(Ay^k)$, it follows that $\text{w-}\lim_{l} u^{k_l} = x^*$. By
the demiclosedness of $U - I$ at 0, it follows from (5.21) that

$$U(x^*) = x^*. \tag{5.24}$$

Hence $x^* \in C$, and therefore $x^* \in \Gamma$. By using Lemma 5.3 with $S = \Gamma$, we obtain
the weak convergence of the whole sequence $\{x^k\}$.

5.4 Concluding Remarks

The paper developed an inertial algorithm and proved its weak convergence for
solving the split common fixed-point problem for demicontractive mappings in
Hilbert space. To some extent, the proposed algorithm and obtained results are
extensions of corresponding work in Dang and Gao (2011) and Moudafi (2010).
The inertial technique paves the way for investigating more effective and feasible
algorithms for the split common fixed-point problem. The strong convergence of the
algorithm is a possible future research topic.

Acknowledgements This work was partially supported by National Science Foundation of China
(under grant No.11171221), Basic and Frontier Research Program of Science and Technology
Department of Henan Province (under grants No.112300410277 and No.082300440150), China
Coal Industry Association Scientific and Technical Guidance to Project (under grant MTKJ-2011-
403), the NSTIP strategic technologies program in the Kingdom of Saudi Arabia – Award No.
(11-MAT1916-02), and research grant 71901 from Faculty of Science and Engineering, Curtin
University.

References

Alvarez F (2000) On the minimizing property of a second order dissipative dynamical system in
Hilbert spaces. SIAM J Control Optim 39:1102–1119
Alvarez F, Attouch H (2001) An inertial proximal method for maximal monotone operators via
discretization of a nonlinear oscillator with damping. Set-Valued Anal 9:3–11
Alvarez F (2004) Weak convergence of a relaxed and inertial hybrid projection-proximal point
algorithm for maximal monotone operators in Hilbert space. SIAM J Optim 3:773–782
Bauschke HH, Borwein JM (1996) On projection algorithms for solving convex feasibility
problems. SIAM Rev 38:367–426
Byrne C (2002) Iterative oblique projection onto convex sets and the split feasibility problem.
Inverse Probl 18:441–453
Byrne C (2004) A unified treatment of some iterative algorithms in signal processing
and image reconstruction. Inverse Probl 20:103–120
Chinneck JW (2004) The constraint consensus method for finding approximately feasible points
in nonlinear programs. INFORMS J Comput 16:255–265
Censor Y (1998) Parallel application of block iterative methods in medical imaging and radiation
therapy. Math Progr 42:307–325
Censor Y, Elfving T (1994) A multiprojection algorithm using Bregman projections in a product
space. Numer Algorithms 8:221–239
Censor Y, Elfving T, Kopf N, Bortfeld T (2005) The multiple-sets split feasibility problem and its
applications for inverse problems. Inverse Probl 21:2071–2084
Crombez G (2005) A geometrical look at iterative methods for operators with fixed points. Numer
Funct Anal Optim 26:137–175
Deutsch F (1992) The method of alternating orthogonal projections. In: Sampat Pal S (ed) Approx-
imation theory, spline functions and applications. Kluwer Academic, Dordrecht, pp 105–121
Dang Y, Gao Y (2011) The strong convergence of a KM-CQ-Like algorithm for split feasibility
problem. Inverse Probl 27:015007
Gao Y (2009) Determining the viability for an affine nonlinear control system (in Chinese). J Control
Theory Appl 29:654–656
Herman GT (1980) Image reconstruction from projections: the fundamentals of computerized
tomography. Academic, New York
Moudafi A (2010) The split common fixed-point problem for demicontractive mappings. Inverse
Probl 26:055007 (6pp). doi:10.1088/0266-5611/26/5/055007
Maruster S, Popirlan C (2008) On the Mann-type iteration and convex feasibility problem. J
Comput Appl Math 212:390–396
Mainge PE (2007) Inertial iterative process for fixed points of certain quasi-nonexpansive
mappings. Set-Valued Anal 15:67–79
Mainge PE (2008) Convergence theorem for inertial KM-type algorithms. J Comput Appl Math
219:223–236
Opial Z (1967) Weak convergence of the sequence of successive approximations for nonexpansive
mappings. Bull Am Math Soc 73:591–597
Qu B, Xiu N (2008) A new halfspace-relaxation projection method for the split feasibility problem.
Linear Algebra Appl 428:1218–1229
Qu B, Xiu N (2005) A note on the CQ algorithm for the split feasibility problem. Inverse Probl
21:1655–1665
Yang Q (2004) The relaxed CQ algorithm solving the split feasibility problem. Inverse Probl
20:1261–1266
Zhao J, Yang Q (2005) Several solution methods for the split feasibility problem. Inverse Probl
21:1791–1799
Chapter 6
On Constraint Qualifications for Multiobjective
Optimization Problems with Vanishing
Constraints

S.K. Mishra, Vinay Singh, Vivek Laha, and R.N. Mohapatra

Abstract In this chapter, we consider a class of multiobjective optimization
problems with inequality, equality and vanishing constraints. For the scalar case, this
class of problems reduces to the class of mathematical programs with vanishing
constraints that recently appeared in the literature. We show that under fairly mild assumptions
some constraint qualifications like the Cottle constraint qualification, Slater constraint
qualification, Mangasarian-Fromovitz constraint qualification, linear independence
constraint qualification, linear objective constraint qualification and linear constraint
qualification do not hold at an efficient solution, whereas the standard generalized
Guignard constraint qualification is sometimes satisfied. We introduce suitable
modifications of the above-mentioned constraint qualifications, establish relationships
among them and derive the Karush-Kuhn-Tucker type necessary optimality
conditions for efficiency.

Keywords Constraint qualifications • Multiobjective optimization problems •
Vanishing constraints • Efficiency • Optimality conditions

S.K. Mishra
Department of Mathematics, Banaras Hindu University, Varanasi-221005, India
e-mail: bhu.skmishra@gmail.com
V. Singh
Department of Mathematics, National Institute of Technology, Chaltlang,
Izawal-796012, Mizoram, India
e-mail: vinaybhu1981@gmail.com
V. Laha ()
Department of Mathematics, Faculty of Science, Banaras Hindu University,
Varanasi-221005, India
e-mail: laha.vivek333@gmail.com
R.N. Mohapatra
Department of Mathematics, University of Central Florida, 4000 Central Florida Blvd.,
Orlando, FL 32816, USA
e-mail: ram.mohapatra@ucf.edu

© Springer-Verlag Berlin Heidelberg 2015 95


H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_6
6.1 Introduction

In multiobjective optimization problems, constraint qualifications play an
important role for the existence of Lagrange multipliers so that the Karush-Kuhn-
Tucker (KKT) necessary optimality conditions hold, which in turn are important
to design various optimization algorithms. The constraint qualifications are the
restrictions imposed on the constraints in order to remove the degenerate cases from
the problem (see, e.g. Abadie 1967; Guignard 1969; Mangasarian 1969; Gould and
Tolle 1971; Peterson 1973). Maeda (1994) introduced generalized Guignard type
constraint qualifications in the differentiable multiobjective optimization problems
with inequality constraints and derived the Kuhn-Tucker type necessary optimality
conditions for efficiency ensuring the existence of positive Lagrange multipliers.
Later using the results of Maeda (1994) many authors have derived necessary opti-
mality conditions and duality results for efficiency in multiobjective optimization
problems both for smooth and nonsmooth cases (see, e.g., Bigi and Pappalardo
1999; Preda and Chitescu 1999; Li 2000; Aghezzaf and Hachimi 2001, 2004; Liang
et al. 2003; Maeda 2004; Mishra et al. 2005). We refer to Chinchuluun and Pardalos
(2007) and the references therein for more details in the field of multiobjective
optimization problems.
Recently, Achtziger and Kanzow (2008) introduced a special class of optimiza-
tion problems known as the mathematical programs with vanishing constraints
(MPVC). It was described in Achtziger and Kanzow (2008) that the MPVCs are
closely related to the class of mathematical programs with equilibrium constraints
(MPECs) (see, e.g. Luo et al. 1996; Outrata et al. 1998; Facchinei and Pang 2003)
and an MPVC can always be reformulated as an MPEC. But this reformulation
increases the dimension of the problem and involves a non-uniqueness of the
solution. Moreover, studying an MPVC as an MPEC does not take into account the special
structure of the MPVC. Hence, it is worth studying the properties of the MPVC.
We refer to Hoheisel et al. (2007, 2010), Hoheisel and Kanzow (2008, 2009), and
Izmailov and Solodov (2009) for more details related to MPVC literature.
It was also described in Achtziger and Kanzow (2008) that many problems
from structural topology optimization can be reformulated as an MPVC and thus the
complexities, nonlinearities, and singularities of the realistic stress constraints can
be incorporated into the mathematical problem formulations. Since, in the optimal
design of structures, one has to consider several conflicting design objectives
simultaneously, multiobjective optimization methodology must be applied within
the framework of structural topology optimization. Stadler (1984) introduced the
field of multiobjective optimization problems in mechanics and later it was used as
a tool to solve various engineering problems including structural design problems
(see, e.g., Eschenauer et al. 1990; Koski 1993; Min et al. 2000; Lin et al. 2011).
The above mentioned works in the fields of multiobjective optimization problems
and mathematical programs with vanishing constraints are the main motivations
of this chapter. In this chapter, we study the class of multiobjective optimization
problems with vanishing constraints (MOPVC) and provide suitable modifications
of several known constraint qualifications like Guignard constraint qualification,
Abadie constraint qualification, Cottle constraint qualification, Slater constraint
qualification, linear objective constraint qualification, Mangasarian–Fromovitz con-
straint qualification, linear independence constraint qualification and linear con-
straint qualification for the MOPVC to establish necessary Karush-Kuhn-Tucker
type optimality conditions for efficiency.
The outline of this chapter is as follows: in Sect. 6.2, we give some known
definitions and results which will be used in the sequel. In Sect. 6.3, we discuss
the standard GGCQ at an efficient solution of the MOPVC and derive KKT type
necessary optimality conditions. In Sect. 6.4, we observe that some constraint
qualifications do not hold under fairly mild assumptions at an efficient solution of
the MOPVC, and hence we modify them to serve as sufficient conditions for the
standard GGCQ to hold. In Sect. 6.5, we give some suitable modifications of some
more constraint qualifications, like GGCQ, GACQ, ACQ, to establish a weaker
KKT type necessary optimality condition for a feasible solution to be an efficient
solution of the MOPVC, and establish relationships among them. In Sect. 6.6, we
summarize the results of this chapter and discuss some future research directions.

6.2 Preliminaries

Consider the following multiobjective optimization problem (MOP):

$$\begin{aligned} \min\ & \tilde f(x) := (\tilde f_1(x), \ldots, \tilde f_{\tilde m}(x)) \\ \text{s.t.}\ & \tilde g_i(x) \le 0, \quad \forall i = 1, 2, \ldots, \tilde p, \\ & \tilde h_i(x) = 0, \quad \forall i = 1, 2, \ldots, \tilde q, \end{aligned} \tag{6.1}$$

where all the functions $\tilde f_i, \tilde g_i, \tilde h_i: \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable. The
feasible set of the MOP (6.1) is given by

$$\tilde X := \{x \in \mathbb{R}^n : \tilde g_i(x) \le 0\ (i = 1, 2, \ldots, \tilde p),\ \tilde h_i(x) = 0\ (i = 1, 2, \ldots, \tilde q)\}. \tag{6.2}$$

Solving the MOP (6.1) means finding a local efficient solution or an efficient solution,
which are defined as follows:
Definition 6.1. Let $x^* \in \tilde X$ be a feasible solution of the MOP (6.1). Then, $x^*$ is said
to be a local efficient solution of the MOP (6.1) iff there exists a number $\delta > 0$ such
that there is no $x \in \tilde X \cap B(x^*; \delta)$ satisfying

$$\begin{aligned} & \tilde f_i(x) \le \tilde f_i(x^*), \quad \forall i = 1, \ldots, \tilde m, \\ & \tilde f_i(x) < \tilde f_i(x^*) \quad \text{for at least one } i, \end{aligned}$$

where $B(x^*; \delta)$ denotes the open ball of radius $\delta$ and centre $x^*$.

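Definition 6.2. Let $x^* \in \tilde X$ be a feasible solution of the MOP (6.1). Then, $x^*$ is said
to be an efficient solution of the MOP (6.1) iff there is no $x \in \tilde X$ satisfying

$$\begin{aligned} & \tilde f_i(x) \le \tilde f_i(x^*), \quad \forall i = 1, \ldots, \tilde m, \\ & \tilde f_i(x) < \tilde f_i(x^*) \quad \text{for at least one } i. \end{aligned}$$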
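Definition 6.2 quantifies over the whole feasible set. Restricted to a finite sample of feasible points it becomes a direct dominance test, which the following Python sketch illustrates. The bi-objective problem, the sample points and the function names are assumptions made for illustration only; a point that survives the test is efficient relative to the sample, not necessarily relative to the full feasible set.

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy: fx <= fy in every component
    and fx < fy in at least one component."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def efficient_points(points, objectives):
    """Return the points whose objective vectors are not dominated within the sample."""
    values = [tuple(f(x) for f in objectives) for x in points]
    return [x for x, fx in zip(points, values)
            if not any(dominates(fy, fx) for fy in values)]

# Illustrative problem: minimize (x^2, (x - 1)^2) over a sample of real numbers.
objectives = [lambda x: x ** 2, lambda x: (x - 1.0) ** 2]
sample = [-0.5, 0.0, 0.25, 0.5, 1.0, 1.5]
front = efficient_points(sample, objectives)
```

In this sample the points outside $[0, 1]$ are dominated, while every sampled point inside $[0, 1]$ survives.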

The following concept of tangent cones is well known in optimization (see, e.g.
Rockafellar 1970; Bajara et al. 1974; Clarke 1983).
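Definition 6.3. Let $\tilde Q$ be a nonempty subset of $\mathbb{R}^n$. The tangent cone to $\tilde Q$ at
$x^* \in \mathrm{cl}\,\tilde Q$ is the set $T(\tilde Q; x^*)$ defined by

$$T(\tilde Q; x^*) := \Big\{ d \in \mathbb{R}^n \ \Big|\ \exists \{x^n\} \subseteq \tilde Q,\ \{t_n\} \downarrow 0 : x^n \to x^* \text{ and } \frac{x^n - x^*}{t_n} \to d \Big\},$$

where $\mathrm{cl}\,\tilde Q$ denotes the closure of $\tilde Q$.

The tangent cone $T(\tilde Q; x^*)$ is a nonempty closed cone, and if $\tilde Q$ is convex, then the
cone $T(\tilde Q; x^*)$ is also convex. Let $x^* \in \tilde X$ be a feasible solution to the MOP (6.1),
and suppose that $I_{\tilde f}$, $I_{\tilde g}$ and $I_{\tilde h}$ are the sets of indices given by

$$\begin{aligned} I_{\tilde f} &:= \{1, 2, \ldots, \tilde m\}, \\ I_{\tilde g} &:= \{i \in \{1, 2, \ldots, \tilde p\} \mid \tilde g_i(x^*) = 0\}, \\ I_{\tilde h} &:= \{1, 2, \ldots, \tilde q\}. \end{aligned} \tag{6.3}$$

For each $k = 1, \ldots, \tilde m$, the nonempty sets $\tilde Q^k$ and $\tilde Q$ are given as follows:

$$\begin{aligned} \tilde Q^k := \{x \in \mathbb{R}^n \mid\ & \tilde g_i(x) \le 0, \ \forall i = 1, 2, \ldots, \tilde p, \\ & \tilde h_i(x) = 0, \ \forall i = 1, 2, \ldots, \tilde q, \\ & \tilde f_i(x) \le \tilde f_i(x^*), \ \forall i = 1, 2, \ldots, \tilde m \text{ and } i \ne k\}, \end{aligned} \tag{6.4}$$

and

$$\begin{aligned} \tilde Q := \{x \in \mathbb{R}^n \mid\ & \tilde g_i(x) \le 0, \ \forall i = 1, 2, \ldots, \tilde p, \\ & \tilde h_i(x) = 0, \ \forall i = 1, 2, \ldots, \tilde q, \\ & \tilde f_i(x) \le \tilde f_i(x^*), \ \forall i = 1, 2, \ldots, \tilde m\}. \end{aligned} \tag{6.5}$$

For scalar objective optimization problems, $\tilde Q^1 = \tilde X$.
The following concept of an approximating cone to the set $\tilde Q$ was introduced in
Maeda (1994) for a multiobjective optimization problem with inequality constraints,
and is of significant importance for the subsequent analysis.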
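Definition 6.4. The linearizing cone to $\tilde Q$ at $x^* \in \tilde Q$ is the set $L(\tilde Q; x^*)$ given by

$$\begin{aligned} L(\tilde Q; x^*) := \{d \in \mathbb{R}^n \mid\ & \nabla \tilde g_i(x^*)^T d \le 0, \ \forall i \in I_{\tilde g}, \\ & \nabla \tilde h_i(x^*)^T d = 0, \ \forall i \in I_{\tilde h}, \\ & \nabla \tilde f_i(x^*)^T d \le 0, \ \forall i \in I_{\tilde f}\}. \end{aligned}$$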

The following constraint qualification was considered in Maeda (1994) for a
multiobjective optimization problem with inequality constraints as a generalization
of the Guignard constraint qualification that appeared in Guignard (1969).
Definition 6.5. Let $x^* \in \tilde X$ be an efficient solution of the MOP (6.1). Then, the
generalized Guignard constraint qualification (GGCQ) holds at $x^*$ iff

$$L(\tilde Q; x^*) \subseteq \bigcap_{k=1}^{\tilde m} \mathrm{cl\,co}\, T(\tilde Q^k; x^*),$$

where $\mathrm{cl\,co}\, T(\tilde Q^k; x^*)$ denotes the closure of the convex hull of $T(\tilde Q^k; x^*)$.

The following constraint qualifications are sufficient conditions for the GGCQ to
hold at an efficient solution of the MOP (6.1).
Definition 6.6. Let $x^* \in \tilde X$ be an efficient solution of the MOP (6.1). Then,
(a) The Abadie constraint qualification (ACQ) holds at $x^*$ iff

$$L(\tilde Q; x^*) \subseteq T(\tilde Q; x^*);$$

(b) The generalized Abadie constraint qualification (GACQ) holds at $x^*$ iff

$$L(\tilde Q; x^*) \subseteq \bigcap_{k=1}^{\tilde m} T(\tilde Q^k; x^*);$$

(c) The Cottle constraint qualification (CCQ) holds at $x^*$ iff for each $k =
1, 2, \ldots, \tilde m$, the system

$$\begin{aligned} & \nabla \tilde f_i(x^*)^T d < 0, \quad \forall i \in I_{\tilde f},\ i \ne k, \\ & \nabla \tilde g_i(x^*)^T d < 0, \quad \forall i \in I_{\tilde g}, \\ & \nabla \tilde h_i(x^*)^T d = 0, \quad \forall i \in I_{\tilde h}, \end{aligned}$$

has a solution $d \in \mathbb{R}^n$;
(d) The Slater constraint qualification (SCQ) holds at x ; iff the objective functions
and the inequality constraints

fQi .i D 1; 2; : : : ; m/
Q ;
gQi .i D 1; 2; : : : ; pQ /

are all convex on Rn , the equality constraints

hQi .i D 1; 2; : : : ; qQ /

are all affine on Rn ; and for each k D 1; 2; : : : ; m;


Q the system

fQi .x/ < fQi .x /; 8i D 1; 2; : : : ; m;


Q and i ¤ k;
gQ i .x/ < 0; 8i D 1; 2; : : : ; pQ ;
hQ i .x/ D 0; 8i D 1; 2; : : : ; qQ

has a solution x 2 Rn :
(e) The linear constraint qualification (LCQ) holds at x iff the objective functions
fQi .i D 1; 2; : : : ; m/;
Q the inequality constraints gQi .i D 1; 2; : : : ; pQ /; and the
equality constraints hQi .i D 1; 2; : : : ; qQ /; are all affine.
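(f) The linear objective constraint qualification (LOCQ) holds at $x^*$ iff the
objective functions $\tilde{f}_i$ $(i = 1, 2, \ldots, \tilde{m})$ are all affine, and the system

$$\begin{aligned}
\nabla \tilde{f}_i(x^*)^T d &\le 0, && \forall i \in \tilde{I}_f, \\
\nabla \tilde{g}_i(x^*)^T d &< 0, && \forall i \in I_{\tilde{g}}, \\
\nabla \tilde{h}_i(x^*)^T d &= 0, && \forall i \in I_{\tilde{h}},
\end{aligned}$$

has a solution $d \in \mathbb{R}^n$;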
(g) The Mangasarian-Fromovitz constraint qualification (MFCQ) holds at $x^*$ iff
the gradients

$$\nabla \tilde{f}_i(x^*)\ (i \in \tilde{I}_f), \qquad \nabla \tilde{h}_i(x^*)\ (i \in I_{\tilde{h}})$$

are linearly independent, and the system

$$\begin{aligned}
\nabla \tilde{f}_i(x^*)^T d &= 0, && \forall i \in \tilde{I}_f, \\
\nabla \tilde{g}_i(x^*)^T d &< 0, && \forall i \in I_{\tilde{g}}, \\
\nabla \tilde{h}_i(x^*)^T d &= 0, && \forall i \in I_{\tilde{h}},
\end{aligned}$$

has a solution $d \in \mathbb{R}^n$;
(h) The linear independence constraint qualification (LICQ) holds at $x^*$ iff the
gradients

$$\nabla \tilde{f}_i(x^*)\ (i \in \tilde{I}_f), \qquad \nabla \tilde{g}_i(x^*)\ (i \in I_{\tilde{g}}), \qquad \nabla \tilde{h}_i(x^*)\ (i \in I_{\tilde{h}})$$

are linearly independent.


The relationships among the above-mentioned constraint qualifications are given in
Fig. 6.1.
It is clear that GGCQ is the weakest among all the constraint qualifications. When it is
satisfied, the KKT type necessary optimality conditions for efficiency were
given in Maeda (1994, Theorem 3.2) as follows:
Theorem 6.1. Let $x^* \in \tilde{X}$ be an efficient solution of the MOP (6.1), such
that the GGCQ holds at $x^*$. Then, there exist Lagrange multipliers $\tilde{\lambda}_i \in \mathbb{R}$ $(i = 1, 2, \ldots, \tilde{m})$, $\tilde{\mu}_i \in \mathbb{R}$ $(i = 1, 2, \ldots, \tilde{p})$, $\tilde{\eta}_i \in \mathbb{R}$ $(i = 1, 2, \ldots, \tilde{q})$, such
that the following first order optimality conditions hold

$$\sum_{i=1}^{\tilde{m}} \tilde{\lambda}_i \nabla \tilde{f}_i(x^*) + \sum_{i=1}^{\tilde{p}} \tilde{\mu}_i \nabla \tilde{g}_i(x^*) + \sum_{i=1}^{\tilde{q}} \tilde{\eta}_i \nabla \tilde{h}_i(x^*) = 0, \tag{6.6}$$

and

$$\tilde{\lambda}_i > 0,\ \forall i = 1, 2, \ldots, \tilde{m}; \qquad \tilde{\mu}_i \ge 0,\ \tilde{\mu}_i \tilde{g}_i(x^*) = 0,\ \forall i = 1, 2, \ldots, \tilde{p}. \tag{6.7}$$

Fig. 6.1 Relationships among constraint qualifications of MOP

6.3 Constraint Qualifications in Multiobjective Optimization Problems with Vanishing Constraints

We consider a constrained multiobjective optimization problem as follows:

$$\begin{aligned}
\min\ & f(x) := (f_1(x), \ldots, f_m(x)) \\
\text{s.t.}\ & g_i(x) \le 0, && \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0, && \forall i = 1, 2, \ldots, q, \\
& H_i(x) \ge 0, && \forall i = 1, 2, \ldots, r, \\
& G_i(x) H_i(x) \le 0, && \forall i = 1, 2, \ldots, r,
\end{aligned} \tag{6.8}$$

where all the functions $f_i, g_i, h_i, H_i, G_i : \mathbb{R}^n \to \mathbb{R}$ are assumed to be continuously
differentiable. The problem (6.8) is called a multiobjective optimization problem
with vanishing constraints (MOPVC). For the scalar case the MOPVC (6.8)
reduces to a special class of optimization problems known as the mathematical
programs with vanishing constraints (MPVC), which was introduced in Achtziger
and Kanzow (2008), and further studied in Hoheisel et al. (2007, 2010), Hoheisel
and Kanzow (2008, 2009), and Izmailov and Solodov (2009).
The class of MOPVCs can be related to the class of multiobjective
optimization problems with equilibrium constraints (MOPEC); see Mordukhovich
(2004, 2006, 2009), Bao and Mordukhovich (2007), and Bao et al. (2007, 2008)
and the references therein for more details. By introducing slack variables $s_i$, $i = 1, 2, \ldots, r$, the MOPVC (6.8) is equivalent to the following MOPEC in the variables
$z := (x, s)$:

$$\begin{aligned}
\min_{x, s}\ & f(x) := (f_1(x), \ldots, f_m(x)) \\
\text{s.t.}\ & g_i(x) \le 0, && \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0, && \forall i = 1, 2, \ldots, q, \\
& G_i(x) - s_i \le 0, && \forall i = 1, 2, \ldots, r, \\
& H_i(x) \ge 0, && \forall i = 1, 2, \ldots, r, \\
& s_i \ge 0, && \forall i = 1, 2, \ldots, r, \\
& H_i(x) s_i = 0, && \forall i = 1, 2, \ldots, r.
\end{aligned}$$

The above reformulation of the MOPVC (6.8) as a MOPEC is always possible,
but it increases the dimension of the problem and makes the solution non-unique,
because the slack variables are not uniquely determined. Moreover, studying the MOPVC as a MOPEC does not take into account the
special structure of the MOPVC. Hence, it is worth studying the properties of the
MOPVC directly.
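The non-uniqueness mentioned above can be illustrated with a minimal sketch for a single pair $(G_i, H_i)$. The helper names `mopec_feasible` and `minimal_slack` are hypothetical and the numerical values are chosen only for illustration. When $H_i(x) > 0$ the slack must vanish, but when $H_i(x) = 0$ and $G_i(x) > 0$ every sufficiently large slack is admissible.

```python
def mopec_feasible(G_val, H_val, s, tol=1e-12):
    """Check the MOPEC constraints for one pair (G_i, H_i) and slack s_i."""
    return (G_val - s <= tol             # G_i(x) - s_i <= 0
            and H_val >= -tol            # H_i(x) >= 0
            and s >= -tol                # s_i >= 0
            and abs(H_val * s) <= tol)   # H_i(x) * s_i = 0


def minimal_slack(G_val):
    """Smallest nonnegative slack satisfying G_i(x) - s_i <= 0."""
    return max(G_val, 0.0)


# Case 1: H_i(x) > 0 forces G_i(x) <= 0 in the MOPVC; only s_i = 0 works.
print(mopec_feasible(-1.0, 2.0, minimal_slack(-1.0)))
print(mopec_feasible(-1.0, 2.0, 0.5))

# Case 2: H_i(x) = 0 and G_i(x) = 3 > 0; every s_i >= 3 is admissible.
print([mopec_feasible(3.0, 0.0, s) for s in (3.0, 4.0, 10.0)])
```

The second case shows that one MOPVC-feasible point corresponds to a whole ray of MOPEC-feasible pairs $(x, s)$.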

In this section, we discuss the GGCQ for the MOPVC (6.8), under which the
Karush-Kuhn-Tucker (KKT) type necessary optimality conditions for a feasible
solution to be an efficient solution will be given. Suppose that the set $X$ defined
by

$$\begin{aligned}
X := \{x \in \mathbb{R}^n :\ & g_i(x) \le 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0,\ \forall i = 1, 2, \ldots, q, \\
& H_i(x) \ge 0,\ \forall i = 1, 2, \ldots, r, \\
& G_i(x) H_i(x) \le 0,\ \forall i = 1, 2, \ldots, r\}
\end{aligned}$$

is the feasible set of the MOPVC (6.8), and $x^* \in X$ is an efficient solution. The
index sets at $x^*$ are defined as follows

$$\begin{aligned}
I_f &:= \{1, 2, \ldots, m\}, \\
I_g &:= \{i \in \{1, 2, \ldots, p\} \mid g_i(x^*) = 0\}, \\
I_h &:= \{1, 2, \ldots, q\}, \\
I_+ &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) > 0\}, \\
I_0 &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) = 0\}.
\end{aligned} \tag{6.9}$$

The index set $I_+$ can be further divided into the following subsets

$$\begin{aligned}
I_{+0} &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) > 0,\ G_i(x^*) = 0\}, \\
I_{+-} &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) > 0,\ G_i(x^*) < 0\}.
\end{aligned} \tag{6.10}$$

Similarly, the index set $I_0$ can be partitioned as follows

$$\begin{aligned}
I_{0+} &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) = 0,\ G_i(x^*) > 0\}, \\
I_{00} &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) = 0,\ G_i(x^*) = 0\}, \\
I_{0-} &:= \{i \in \{1, 2, \ldots, r\} \mid H_i(x^*) = 0,\ G_i(x^*) < 0\}.
\end{aligned} \tag{6.11}$$
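As a small illustration of the partition in (6.10) and (6.11), the following sketch classifies each index from the values $H_i(x^*)$ and $G_i(x^*)$. The function name `partition_indices`, the tolerance `tol` and the sample values are assumptions made for this example only.

```python
def partition_indices(H_vals, G_vals, tol=1e-9):
    """Split {1, ..., r} into I+0, I+-, I0+, I00, I0- at a feasible point.

    H_vals[i-1] and G_vals[i-1] hold H_i(x*) and G_i(x*).
    """
    sets = {"I+0": [], "I+-": [], "I0+": [], "I00": [], "I0-": []}
    for i, (H, G) in enumerate(zip(H_vals, G_vals), start=1):
        if H < -tol:
            raise ValueError(f"H_{i}(x*) < 0: the point is not feasible")
        if H > tol:
            # H_i > 0 together with G_i * H_i <= 0 requires G_i <= 0.
            if G > tol:
                raise ValueError(f"G_{i}(x*) H_{i}(x*) > 0: the point is not feasible")
            sets["I+0" if abs(G) <= tol else "I+-"].append(i)
        else:
            # H_i = 0: the sign of G_i is unrestricted.
            if G > tol:
                sets["I0+"].append(i)
            elif G < -tol:
                sets["I0-"].append(i)
            else:
                sets["I00"].append(i)
    return sets


print(partition_indices([1.0, 2.0, 0.0, 0.0, 0.0], [0.0, -1.0, 3.0, 0.0, -2.0]))
```

In floating-point computations the choice of `tol` decides which indices land in the critical set $I_{00}$, so it should be chosen with the scaling of the problem in mind.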

Also, consider the following function

$$\theta_i(x) := G_i(x) H_i(x), \quad \forall i = 1, 2, \ldots, r, \tag{6.12}$$

whose gradient is given by

$$\nabla \theta_i(x) = G_i(x) \nabla H_i(x) + H_i(x) \nabla G_i(x), \quad \forall i = 1, 2, \ldots, r. \tag{6.13}$$

The definition of the index sets (6.9)–(6.11) provides the following

$$\nabla \theta_i(x^*) = \begin{cases}
0, & \text{if } i \in I_{00}, \\
G_i(x^*) \nabla H_i(x^*), & \text{if } i \in I_{0+} \cup I_{0-}, \\
H_i(x^*) \nabla G_i(x^*), & \text{if } i \in I_{+0}.
\end{cases} \tag{6.14}$$

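For each $k = 1, 2, \ldots, m$, the nonempty sets $Q^k$ and $Q$ are defined as follows

$$\begin{aligned}
Q^k := \{x \in \mathbb{R}^n \mid\ & g_i(x) \le 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0,\ \forall i = 1, 2, \ldots, q, \\
& H_i(x) \ge 0,\ \forall i = 1, 2, \ldots, r, \\
& G_i(x) H_i(x) \le 0,\ \forall i = 1, 2, \ldots, r, \\
& f_i(x) \le f_i(x^*),\ \forall i = 1, 2, \ldots, m,\ i \neq k\},
\end{aligned} \tag{6.15}$$

and

$$\begin{aligned}
Q := \{x \in \mathbb{R}^n \mid\ & g_i(x) \le 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0,\ \forall i = 1, 2, \ldots, q, \\
& H_i(x) \ge 0,\ \forall i = 1, 2, \ldots, r, \\
& G_i(x) H_i(x) \le 0,\ \forall i = 1, 2, \ldots, r, \\
& f_i(x) \le f_i(x^*),\ \forall i = 1, 2, \ldots, m\}.
\end{aligned} \tag{6.16}$$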

The following result gives the standard linearizing cone to $Q^k$, $k = 1, 2, \ldots, m$,
at an efficient solution $x^* \in X$ of the MOPVC (6.8).
Lemma 6.1. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, the
linearizing cone to $Q^k$, $k = 1, 2, \ldots, m$, at $x^*$ is given by

$$\begin{aligned}
L(Q^k; x^*) = \{d \in \mathbb{R}^n \mid\ & \nabla f_i(x^*)^T d \le 0,\ \forall i \in I_f,\ i \neq k, \\
& \nabla g_i(x^*)^T d \le 0,\ \forall i \in I_g, \\
& \nabla h_i(x^*)^T d = 0,\ \forall i \in I_h, \\
& \nabla H_i(x^*)^T d = 0,\ \forall i \in I_{0+}, \\
& \nabla H_i(x^*)^T d \ge 0,\ \forall i \in I_{0-} \cup I_{00}, \\
& \nabla G_i(x^*)^T d \le 0,\ \forall i \in I_{+0}\}.
\end{aligned} \tag{6.17}$$

Proof. Suppose that $\theta_i$, $i = 1, 2, \ldots, r$, is the function from (6.12). Then, using the
definitions of the index sets from (6.9)–(6.11), and in view of Definition 6.4, the
linearizing cone to $Q^k$, $k = 1, \ldots, m$, at $x^* \in Q^k$ is given by

$$\begin{aligned}
L(Q^k; x^*) = \{d \in \mathbb{R}^n \mid\ & \nabla f_i(x^*)^T d \le 0,\ \forall i \in I_f,\ i \neq k, \\
& \nabla g_i(x^*)^T d \le 0,\ \forall i \in I_g, \\
& \nabla h_i(x^*)^T d = 0,\ \forall i \in I_h, \\
& \nabla H_i(x^*)^T d \ge 0,\ \forall i \in I_0, \\
& \nabla \theta_i(x^*)^T d \le 0,\ \forall i \in I_0 \cup I_{+0}\}.
\end{aligned}$$

Now, using the expression of $\nabla \theta_i(x^*)$ for $i \in I_0 \cup I_{+0}$ from (6.14), and
rearranging the terms involved, we get the required representation (6.17) for the
linearizing cone to $Q^k$, $k = 1, \ldots, m$, at $x^* \in Q^k$. □
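The representation (6.17) translates directly into a membership test for a given direction $d$. The sketch below is only illustrative: the function `in_linearizing_cone`, its dictionary-based arguments and the sample gradients are assumptions introduced here, and `grad_g` is assumed to contain the gradients of the active inequality constraints only.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def in_linearizing_cone(d, k, grad_f, grad_g, grad_h, grad_H, grad_G, sets, tol=1e-9):
    """Test whether d belongs to L(Q^k; x*) as described in (6.17).

    grad_f, grad_g, grad_h, grad_H, grad_G map an index i to the gradient
    at x*; sets maps "I0+", "I00", "I0-", "I+0" to lists of indices.
    """
    ok = all(dot(g, d) <= tol for i, g in grad_f.items() if i != k)
    ok = ok and all(dot(g, d) <= tol for g in grad_g.values())
    ok = ok and all(abs(dot(g, d)) <= tol for g in grad_h.values())
    ok = ok and all(abs(dot(grad_H[i], d)) <= tol for i in sets["I0+"])
    ok = ok and all(dot(grad_H[i], d) >= -tol for i in sets["I0-"] + sets["I00"])
    ok = ok and all(dot(grad_G[i], d) <= tol for i in sets["I+0"])
    return ok


# Illustrative data: two objectives, one vanishing constraint with I00 = {1}.
grad_f = {1: (1.0, 1.0), 2: (-1.0, 0.0)}
grad_H = {1: (0.0, 1.0)}
grad_G = {1: (0.0, 1.0)}
sets = {"I0+": [], "I00": [1], "I0-": [], "I+0": []}

print(in_linearizing_cone((1.0, 0.0), 1, grad_f, {}, {}, grad_H, grad_G, sets))
print(in_linearizing_cone((1.0, 0.0), 2, grad_f, {}, {}, grad_H, grad_G, sets))
```

Note that for $i \in I_{00}$ only the gradient of $H_i$ enters the test, which reflects the fact that $\nabla \theta_i(x^*) = 0$ there.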
Remark 6.1. The linearizing cone to $Q^k$, $k = 1, \ldots, m$, at $x^* \in Q^k$ is a nonempty,
closed and convex cone in $\mathbb{R}^n$.
Moreover, for the scalar case it reduces to the linearized cone of the MPVC
given in Achtziger and Kanzow (2008, Lemma 4). Also, it is clear from the expressions (6.15)–(6.17) that $L(Q; x^*) = \bigcap_{k=1}^{m} L(Q^k; x^*)$. Alternatively, the expression
for $L(Q; x^*)$ follows immediately from Achtziger and Kanzow (2008, Lemma 4),
by viewing it as a linearized cone to an MPVC with the constraints

$$\begin{aligned}
g_i(x) &\le 0, && \forall i = 1, 2, \ldots, p, \\
f_i(x) - f_i(x^*) &\le 0, && \forall i = 1, 2, \ldots, m, \\
h_i(x) &= 0, && \forall i = 1, 2, \ldots, q, \\
H_i(x) &\ge 0, && \forall i = 1, 2, \ldots, r, \\
G_i(x) H_i(x) &\le 0, && \forall i = 1, 2, \ldots, r.
\end{aligned}$$

The following result gives the KKT type necessary optimality conditions for efficiency, when the standard GGCQ holds at an efficient solution of the MOPVC (6.8).
Theorem 6.2. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If
the standard GGCQ holds at $x^*$, then there exist Lagrange multipliers $\lambda_i \in \mathbb{R}$ $(i \in I_f)$, $\mu_i \in \mathbb{R}$ $(i = 1, \ldots, p)$, $\eta_i \in \mathbb{R}$ $(i \in I_h)$, $\eta_i^H, \eta_i^G \in \mathbb{R}$ $(i = 1, \ldots, r)$, such
that

$$\sum_{i=1}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i=1}^{p} \mu_i \nabla g_i(x^*) + \sum_{i=1}^{q} \eta_i \nabla h_i(x^*) - \sum_{i=1}^{r} \eta_i^H \nabla H_i(x^*) + \sum_{i=1}^{r} \eta_i^G \nabla G_i(x^*) = 0, \tag{6.18}$$

and

$$\begin{aligned}
& \lambda_i > 0,\ \forall i = 1, 2, \ldots, m, \\
& g_i(x^*) \le 0,\ \mu_i \ge 0,\ \mu_i g_i(x^*) = 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x^*) = 0,\ \forall i = 1, 2, \ldots, q, \\
& \eta_i^H = 0\ (i \in I_+),\ \eta_i^H \ge 0\ (i \in I_{00} \cup I_{0-}),\ \eta_i^H \text{ free } (i \in I_{0+}),\ \eta_i^H H_i(x^*) = 0,\ \forall i = 1, 2, \ldots, r, \\
& \eta_i^G = 0\ (i \in I_0 \cup I_{+-}),\ \eta_i^G \ge 0\ (i \in I_{+0}),\ \eta_i^G G_i(x^*) = 0,\ \forall i = 1, 2, \ldots, r.
\end{aligned} \tag{6.19}$$

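Proof. Suppose that $x^* \in X$ is an efficient solution of the MOPVC (6.8) such
that GGCQ holds at $x^*$. Then, by Theorem 6.1, there exist Lagrange multipliers
$\lambda_i \in \mathbb{R}$ $(i = 1, \ldots, m)$, $\mu_i \in \mathbb{R}$ $(i = 1, \ldots, p)$, $\eta_i^+, \eta_i^- \in \mathbb{R}$ $(i = 1, \ldots, q)$, $\alpha_i \in \mathbb{R}$ $(i = 1, \ldots, r)$, $\beta_i \in \mathbb{R}$ $(i = 1, \ldots, r)$,
such that the following conditions hold:

$$\sum_{i=1}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i=1}^{p} \mu_i \nabla g_i(x^*) + \sum_{i=1}^{q} \eta_i^+ \nabla h_i(x^*) - \sum_{i=1}^{q} \eta_i^- \nabla h_i(x^*) - \sum_{i=1}^{r} \alpha_i \nabla H_i(x^*) + \sum_{i=1}^{r} \beta_i \nabla \theta_i(x^*) = 0, \tag{6.20}$$

and

$$\begin{aligned}
& \lambda_i > 0,\ \forall i = 1, 2, \ldots, m, \\
& g_i(x^*) \le 0,\ \mu_i \ge 0,\ \mu_i g_i(x^*) = 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x^*) \le 0,\ \eta_i^+ \ge 0,\ \eta_i^+ h_i(x^*) = 0,\ \forall i = 1, 2, \ldots, q, \\
& -h_i(x^*) \le 0,\ \eta_i^- \ge 0,\ \eta_i^- h_i(x^*) = 0,\ \forall i = 1, 2, \ldots, q, \\
& -H_i(x^*) \le 0,\ \alpha_i \ge 0,\ \alpha_i H_i(x^*) = 0,\ \forall i = 1, 2, \ldots, r, \\
& \theta_i(x^*) \le 0,\ \beta_i \ge 0,\ \beta_i \theta_i(x^*) = 0,\ \forall i = 1, 2, \ldots, r,
\end{aligned} \tag{6.21}$$

where $\theta_i$, $i = 1, 2, \ldots, r$, denotes the function from (6.12). Now, using the
representation (6.13) of the gradient of $\theta_i$, and setting

$$\begin{aligned}
\eta_i &:= \eta_i^+ - \eta_i^-, && \forall i = 1, 2, \ldots, q, \\
\eta_i^H &:= \alpha_i - \beta_i G_i(x^*), && \forall i = 1, 2, \ldots, r, \\
\eta_i^G &:= \beta_i H_i(x^*), && \forall i = 1, 2, \ldots, r,
\end{aligned} \tag{6.22}$$

we get the required KKT type necessary optimality conditions (6.18) and (6.19). □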
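The substitution used at the end of the proof rests on the identity $-\alpha_i \nabla H_i + \beta_i \nabla \theta_i = -(\alpha_i - \beta_i G_i) \nabla H_i + \beta_i H_i \nabla G_i$, which follows from the product rule for $\theta_i = G_i H_i$. A quick numerical check is sketched below; the helper names and all numbers are arbitrary illustrative choices, not data from the chapter.

```python
def theta_grad(G, H, grad_H, grad_G):
    """Gradient of theta = G * H by the product rule."""
    return [G * a + H * b for a, b in zip(grad_H, grad_G)]


def vc_multipliers(alpha, beta, G, H):
    """Multipliers eta^H and eta^G obtained from alpha and beta."""
    return alpha - beta * G, beta * H


# Arbitrary illustrative values for one index i.
alpha, beta = 0.7, 1.3
G, H = -2.0, 0.5
grad_H, grad_G = [1.0, -1.0], [0.5, 2.0]

# Left-hand side: contribution -alpha * grad H + beta * grad theta.
g_theta = theta_grad(G, H, grad_H, grad_G)
lhs = [-alpha * a + beta * t for a, t in zip(grad_H, g_theta)]

# Right-hand side: contribution -eta^H * grad H + eta^G * grad G.
eta_H, eta_G = vc_multipliers(alpha, beta, G, H)
rhs = [-eta_H * a + eta_G * b for a, b in zip(grad_H, grad_G)]

print(lhs, rhs)
```

Both vectors agree up to rounding, so the two ways of writing the stationarity condition describe the same linear combination of gradients.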
Fig. 6.2 The feasible region of Example 6.1

For the scalar case, the above KKT type necessary optimality conditions for the
MOPVC (6.8) under GGCQ reduce to the KKT conditions for the MPVC under
the standard Abadie constraint qualification given in Achtziger and Kanzow (2008,
Theorem 1). The next corollary is a direct consequence of the fact that the tangent
cones $T(Q^k; x^*)$, $k = 1, 2, \ldots, m$, contain the origin $0 \in \mathbb{R}^n$.
Corollary 6.1. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) such
that $L(Q; x^*) = \{0\}$. Then, there exist Lagrange multipliers satisfying (6.18)
and (6.19).
Now, we give an example which verifies Corollary 6.1 with $I_{00} \neq \emptyset$.
Example 6.1. Consider the following MOPVC given by

$$\begin{aligned}
\min\ & f(x_1, x_2) := (x_1 + x_2,\ -x_1 + x_2^2) \\
\text{s.t.}\ & H_1(x) := x_1^3 + x_2 \ge 0, \\
& G_1(x) H_1(x) := x_2 (x_1^3 + x_2) \le 0,
\end{aligned}$$

which is a MOPVC of the form (6.8) with $n = 2$, $m = 2$, $p = q = 0$ and $r = 1$. It
is easy to see that the origin $x^* := (0, 0) \in \mathbb{R}^2$ is a feasible solution of the MOPVC
and $I_{00} = \{1\}$. Also, $x^* := (0, 0) \in \mathbb{R}^2$ is an efficient solution of the MOPVC over
the feasible region given by Fig. 6.2. Using Lemma 6.1, one has

$$L(Q; x^*) = \{(0, 0) \in \mathbb{R}^2\}.$$

Now, for any Lagrange multipliers $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ (not all zero), $\eta_1^H \ge 0$ and $\eta_1^G$,
one has

$$(0, 0) = \lambda_1 \nabla f_1(x^*) + \lambda_2 \nabla f_2(x^*) - \eta_1^H \nabla H_1(x^*) + \eta_1^G \nabla G_1(x^*) = (\lambda_1 - \lambda_2,\ \lambda_1 - \eta_1^H + \eta_1^G),$$

and

$$\eta_1^H \ge 0, \qquad \eta_1^H H_1(x^*) = 0.$$

Fig. 6.3 The objective functions of Example 6.1

Fig. 6.4 The feasible region of Example 6.2

Thus $\lambda_1 = 0$ implies $\lambda_2 = 0$, and vice versa. Hence, we have $\lambda_1 > 0$ and $\lambda_2 > 0$,
and Corollary 6.1 is satisfied (Fig. 6.3).
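The multiplier system of Example 6.1 can be checked numerically. The sketch below assumes that the objective reads $f(x_1, x_2) = (x_1 + x_2,\ -x_1 + x_2^2)$, which is consistent with the first component $\lambda_1 - \lambda_2$ of the stationarity equation, and that $G_1(x) = x_2$; the function `kkt_residual` and the particular multiplier values are illustrative choices.

```python
def kkt_residual(lams, eta_H, eta_G, grads_f, grad_H, grad_G):
    """Left-hand side of the stationarity condition for one vanishing constraint."""
    n = len(grad_H)
    return [
        sum(lam * g[j] for lam, g in zip(lams, grads_f))
        - eta_H * grad_H[j]
        + eta_G * grad_G[j]
        for j in range(n)
    ]


# Gradients at the origin under the stated assumptions.
grads_f = [[1.0, 1.0], [-1.0, 0.0]]   # grad f_1, grad f_2
grad_H = [0.0, 1.0]                    # grad of H_1 = x_1^3 + x_2
grad_G = [0.0, 1.0]                    # grad of G_1 = x_2

# Candidate multipliers: lambda = (1, 1), eta^H = 1, eta^G = 0 (index 1 is in I00).
residual = kkt_residual([1.0, 1.0], 1.0, 0.0, grads_f, grad_H, grad_G)
print(residual)
```

With these values both objective multipliers are strictly positive and the residual vanishes, in line with the conclusion of the example.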
Now, we give an example in which GGCQ does not hold for the MOPVC (6.8) with
$I_{00} \neq \emptyset$.
Example 6.2. Consider the following MOPVC given by

$$\begin{aligned}
\min\ & f(x_1, x_2) := (x_1,\ -x_1 + x_2^2) \\
\text{s.t.}\ & H_1(x) := 1 - x_1^2 - x_2^2 \ge 0, \\
& G_1(x) H_1(x) := x_2 (1 - x_1^2 - x_2^2) \le 0,
\end{aligned}$$

which is a MOPVC (6.8) with $n = 2$, $m = 2$, $p = q = 0$, and $r = 1$. It is easy to
see that the point $x^* := (-1, 0)$ is a feasible solution of the MOPVC and $I_{00} = \{1\}$.
Also, $x^*$ is an efficient solution of the MOPVC over the feasible region given by
Fig. 6.4. Using Lemma 6.1 and the definitions of the tangent cones $T(Q^k; x^*)$, $k = 1, 2$,
one has

$$\begin{aligned}
L(Q; x^*) &= \{d \in \mathbb{R}^2 \mid d_1 = 0\}, \\
T(Q^1; x^*) &= \{d \in \mathbb{R}^2 \mid 2d_1 - d_1^2 - d_2^2 \ge 0,\ d_2 (2d_1 - d_1^2 - d_2^2) \le 0,\ -d_1 + d_2^2 \le 0\}, \\
T(Q^2; x^*) &= \{d \in \mathbb{R}^2 \mid 2d_1 - d_1^2 - d_2^2 \ge 0,\ d_2 (2d_1 - d_1^2 - d_2^2) \le 0,\ d_1 \le 0\}.
\end{aligned}$$

Observe that

$$L(Q; x^*) \not\subseteq \bigcap_{k=1}^{2} \operatorname{cl}\operatorname{co} T(Q^k; x^*),$$

and hence the standard GGCQ does not hold at $x^*$. Now, for any Lagrange
multipliers $\lambda_1 \ge 0$, $\lambda_2 \ge 0$, not both zero, and $\eta_1^H \ge 0$, $\eta_1^G \ge 0$, one has

$$(0, 0) = \lambda_1 \nabla f_1(x^*) + \lambda_2 \nabla f_2(x^*) - \eta_1^H \nabla H_1(x^*) + \eta_1^G \nabla G_1(x^*) = (\lambda_1 - \lambda_2 - 2\eta_1^H,\ \eta_1^G),$$

which does not satisfy the necessary optimality conditions (6.18) and (6.19).

Fig. 6.5 The objective functions of Example 6.2

Examples 6.1 and 6.2 show that GGCQ is not always violated when $I_{00} \neq \emptyset$, but
it may fail to hold when $I_{00} \neq \emptyset$. The following example shows that GGCQ is
not a sufficient condition for the existence of positive Lagrange multipliers for the
MOPVC (6.8) (Fig. 6.5).
Example 6.3. Consider the following MOPVC given by

$$\begin{aligned}
\min\ & f(x_1, x_2) := (x_1,\ x_1^2 + x_2) \\
\text{s.t.}\ & H_1(x) := x_1 + x_2 \ge 0, \\
& G_1(x) H_1(x) := x_2 (x_1 + x_2) \le 0,
\end{aligned}$$

which is a MOPVC (6.8) with $n = 2$, $m = 2$, $p = q = 0$ and $r = 1$. It is easy to
see that the origin $x^* := (0, 0)$ is a feasible solution of the MOPVC and $I_{00} = \{1\}$. Also,
$x^*$ is an efficient solution of the MOPVC over the feasible region given by Fig. 6.6.
Using Lemma 6.1, one has $L(Q; x^*) = \{0\}$.

Fig. 6.6 Feasible region of Example 6.3

Fig. 6.7 Objective functions of Example 6.3

Now, for any Lagrange multipliers $\lambda_1 \ge 0$, $\lambda_2 \ge 0$, not both zero, and $\eta_1^H \ge 0$, $\eta_1^G \ge 0$, one has

$$(0, 0) = \lambda_1 \nabla f_1(x^*) + \lambda_2 \nabla f_2(x^*) - \eta_1^H \nabla H_1(x^*) + \eta_1^G \nabla G_1(x^*) = (\lambda_1 - \eta_1^H,\ \lambda_2 - \eta_1^H + \eta_1^G).$$

Observe that $\eta_1^H = \lambda_1 \ge 0$ and $\eta_1^G = 0$ imply
$\lambda_2 \ge 0$, which violates the existence
of positive Lagrange multipliers (Fig. 6.7).
6.4 Sufficient Conditions for the Generalized Guignard Constraint Qualification

It was shown in Achtziger and Kanzow (2008) that some constraint qualifications
like LICQ and MFCQ do not hold under fairly mild assumptions, whereas some
constraint qualifications like ACQ may not hold sometimes at a local minimum
of the MPVC. In this section, we investigate some more constraint qualifications
like CCQ, SCQ, LOCQ and LCQ for the MOPVC (6.8) and modify them where
necessary to use them as sufficient conditions for the GGCQ to hold at an efficient
solution of the MOPVC (6.8). The next result shows that under fairly reasonable
assumptions CCQ is not satisfied at an efficient solution of the MOPVC (6.8).
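Lemma 6.2. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) with $I_{00} \cup I_{0+} \neq \emptyset$. Then, the standard CCQ is not satisfied at $x^*$.
Proof. Suppose that CCQ is satisfied at $x^*$. Then, for each $k = 1, \ldots, m$, the system

$$\begin{aligned}
\nabla f_i(x^*)^T d &< 0, && \forall i \in I_f,\ i \neq k, \\
\nabla g_i(x^*)^T d &< 0, && \forall i \in I_g, \\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h, \\
\nabla H_i(x^*)^T d &> 0, && \forall i \in I_{0+} \cup I_{00} \cup I_{0-}, \\
\nabla \theta_i(x^*)^T d &< 0, && \forall i \in I_{0+} \cup I_{00} \cup I_{0-} \cup I_{+0},
\end{aligned} \tag{6.23}$$

has a solution $d \in \mathbb{R}^n$. Using the gradient of $\theta_i$ from (6.14) in (6.23), one has

$$0 = \nabla \theta_i(x^*)^T d < 0, \quad \forall i \in I_{00},$$

and

$$\nabla H_i(x^*)^T d = \frac{1}{G_i(x^*)} \nabla \theta_i(x^*)^T d < 0, \quad \forall i \in I_{0+},$$

which is a contradiction in either case, since the first relation is absurd and the second conflicts with $\nabla H_i(x^*)^T d > 0$. Hence CCQ is not satisfied at $x^*$. □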
As a direct consequence of Lemma 6.2, and in view of Maeda (1994, Lemma 4.3),
we obtain the following result, which is a multiobjective analog of Achtziger and
Kanzow (2008, Lemma 3).
Corollary 6.2. Let $x^*$ be an efficient solution of the MOPVC (6.8) with $I_{00} \cup I_{0+} \neq \emptyset$. Then, the standard MFCQ is not satisfied at $x^*$.
We also obtain the following corollary as a direct consequence of Lemma 6.2 in
view of Maeda (1994, Lemma 4.4).
Corollary 6.3. Let $x^*$ be an efficient solution of the MOPVC (6.8) with $I_{00} \cup I_{0+} \neq \emptyset$. Then, the standard SCQ is not satisfied at $x^*$.
Since LICQ implies MFCQ, the following result is a direct consequence of
Corollary 6.2, and is a multiobjective analog of Achtziger and Kanzow (2008,
Lemma 2).
Corollary 6.4. Let $x^*$ be an efficient solution of the MOPVC (6.8) with $I_{00} \cup I_{0+} \neq \emptyset$. Then, the standard LICQ is not satisfied at $x^*$. Moreover, if $I_{0-} \neq \emptyset$, then the
standard LICQ is also not satisfied at $x^*$.
The proof of the following result is similar to the proof of Lemma 6.2, and it
shows that LOCQ is also not satisfied at an efficient solution of the MOPVC (6.8)
when $I_{00} \cup I_{0+} \neq \emptyset$.
Lemma 6.3. Let $x^*$ be an efficient solution of the MOPVC (6.8) with $I_{00} \cup I_{0+} \neq \emptyset$.
Then, the standard LOCQ is not satisfied at $x^*$.
The above results show that under fairly mild assumptions most of the constraint
qualifications are violated at an efficient solution of the MOPVC (6.8), and hence
we introduce some constraint qualifications as modifications of the standard CCQ,
MFCQ, SCQ, LICQ and LOCQ for the MOPVC (6.8).
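Definition 6.7. Let $x^*$ be an efficient solution of the MOPVC (6.8). Then the
Cottle-type constraint qualification for the MOPVC (6.8), denoted by CCQ-MOPVC, holds at $x^*$ iff for each $k = 1, \ldots, m$, the system

$$\begin{aligned}
\nabla f_i(x^*)^T d &< 0, && \forall i \in I_f,\ i \neq k, \\
\nabla g_i(x^*)^T d &< 0, && \forall i \in I_g, \\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h, \\
\nabla H_i(x^*)^T d &= 0, && \forall i \in I_{0+} \cup I_{00}, \\
\nabla H_i(x^*)^T d &> 0, && \forall i \in I_{0-}, \\
\nabla G_i(x^*)^T d &< 0, && \forall i \in I_{+0},
\end{aligned} \tag{6.24}$$

has a solution $d \in \mathbb{R}^n$.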
It is clear that CCQ-MOPVC is different from the standard CCQ for the
MOPVC (6.8), and is a fair assumption. We now show that CCQ-MOPVC is a
sufficient condition for the GGCQ provided that the critical index set $I_{00} = \emptyset$.
Lemma 6.4. Let $x^*$ be an efficient solution of the MOPVC (6.8) with $I_{00} = \emptyset$. If
CCQ-MOPVC is satisfied at $x^*$, then the standard GGCQ is also satisfied at $x^*$.
Proof. Suppose that $x^*$ is an efficient solution of the MOPVC (6.8) with $I_{00} = \emptyset$.
Then, the MOPVC (6.8) is locally equivalent to the following MOP:

$$\begin{aligned}
\min\ & f(x) := (f_1(x), \ldots, f_m(x)) \\
\text{s.t.}\ & g_i(x) \le 0, && \forall i \in I_g, \\
& h_i(x) = 0, && \forall i \in I_h, \\
& H_i(x) = 0, && \forall i \in I_{0+}, \\
& H_i(x) \ge 0, && \forall i \in I_{0-}, \\
& G_i(x) \le 0, && \forall i \in I_{+0}.
\end{aligned} \tag{6.25}$$

Now, when CCQ-MOPVC is satisfied at $x^*$ with $I_{00} = \emptyset$, the standard CCQ
for the MOP (6.25) also holds at $x^*$, and hence the standard GGCQ for the
MOP (6.25) is also satisfied at $x^*$, that is,

$$L(\hat{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\hat{Q}^k; x^*),$$

where $\hat{Q}^k$, $k = 1, \ldots, m$, and $\hat{Q}$ are defined as

$$\begin{aligned}
\hat{Q}^k := \{x \in \mathbb{R}^n \mid\ & f_i(x) \le f_i(x^*),\ \forall i \in I_f,\ i \neq k, \\
& g_i(x) \le 0,\ \forall i \in I_g, \\
& h_i(x) = 0,\ \forall i \in I_h, \\
& H_i(x) = 0,\ \forall i \in I_{0+}, \\
& H_i(x) \ge 0,\ \forall i \in I_{0-}, \\
& G_i(x) \le 0,\ \forall i \in I_{+0}\},
\end{aligned} \tag{6.26}$$

and

$$\begin{aligned}
\hat{Q} := \{x \in \mathbb{R}^n \mid\ & f_i(x) \le f_i(x^*),\ \forall i \in I_f, \\
& g_i(x) \le 0,\ \forall i \in I_g, \\
& h_i(x) = 0,\ \forall i \in I_h, \\
& H_i(x) = 0,\ \forall i \in I_{0+}, \\
& H_i(x) \ge 0,\ \forall i \in I_{0-}, \\
& G_i(x) \le 0,\ \forall i \in I_{+0}\}.
\end{aligned} \tag{6.27}$$

Also, the linearizing cone $L(\hat{Q}; x^*)$ is given by

$$\begin{aligned}
L(\hat{Q}; x^*) = \{d \in \mathbb{R}^n :\ & \nabla f_i(x^*)^T d \le 0,\ \forall i \in I_f, \\
& \nabla g_i(x^*)^T d \le 0,\ \forall i \in I_g, \\
& \nabla h_i(x^*)^T d = 0,\ \forall i \in I_h, \\
& \nabla H_i(x^*)^T d = 0,\ \forall i \in I_{0+}, \\
& \nabla H_i(x^*)^T d \ge 0,\ \forall i \in I_{0-}, \\
& \nabla G_i(x^*)^T d \le 0,\ \forall i \in I_{+0}\},
\end{aligned} \tag{6.28}$$

which in view of Lemma 6.1 is nothing but the linearizing cone $L(Q; x^*)$ with
$I_{00} = \emptyset$. Now, since $\hat{Q}^k \subseteq Q^k$, $k = 1, 2, \ldots, m$, it follows that $T(\hat{Q}^k; x^*) \subseteq T(Q^k; x^*)$, $k = 1, 2, \ldots, m$, which implies that

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\hat{Q}^k; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*),$$

and hence the standard GGCQ holds at $x^*$. □
Now, we give an example which verifies that CCQ-MOPVC may not imply GGCQ
if $I_{00} \neq \emptyset$.
Example 6.4. Consider the following MOPVC given by

$$\begin{aligned}
\min\ & f(x_1, x_2) := (x_1 + x_2^2,\ x_1^2 + x_2) \\
\text{s.t.}\ & H_1(x) := x_1 + x_2 \ge 0, \\
& G_1(x) H_1(x) := x_1 (x_1 + x_2) \le 0,
\end{aligned}$$

which is a MOPVC (6.8) with $n = 2$, $m = 2$, $p = q = 0$ and $r = 1$. It is easy to see
that the origin $x^* := (0, 0)$ is an efficient solution of the MOPVC over the feasible
region given by Fig. 6.8 and $I_{00} = \{1\}$. Using Lemma 6.1, and the definition of the
tangent cones $T(Q^k; x^*)$, $k = 1, 2$, one has

$$\begin{aligned}
L(Q; x^*) &= \{d \in \mathbb{R}^2 \mid d_1 \le 0,\ d_2 \le 0,\ d_1 + d_2 \ge 0\}; \\
T(Q^1; x^*) &= \{d \in \mathbb{R}^2 \mid d_1 + d_2 \ge 0,\ d_1 (d_1 + d_2) \le 0,\ d_1^2 + d_2 \le 0\}; \\
T(Q^2; x^*) &= \{d \in \mathbb{R}^2 \mid d_1 + d_2 \ge 0,\ d_1 (d_1 + d_2) \le 0,\ d_1 + d_2^2 \le 0\}.
\end{aligned}$$

Observe that

$$L(Q; x^*) \not\subseteq \bigcap_{k=1}^{2} \operatorname{cl}\operatorname{co} T(Q^k; x^*),$$

Fig. 6.8 The feasible region of Example 6.4

Fig. 6.9 The objective functions of Example 6.4

and hence the standard GGCQ is not satisfied at $x^*$ for the given MOPVC. But the
system given by (6.24) is solvable at $x^*$, and hence CCQ-MOPVC holds at $x^*$ for
the given MOPVC (Fig. 6.9).
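The solvability of the system (6.24) in Example 6.4 can be confirmed by exhibiting explicit directions. The sketch below assumes the data as printed, that is $f = (x_1 + x_2^2,\ x_1^2 + x_2)$ and $H_1 = x_1 + x_2$ with $I_{00} = \{1\}$ and all other index sets empty; the function name `solves_ccq_system` and the chosen directions are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


# Gradients at the origin for the data of Example 6.4 (as read above).
grad_f = {1: (1.0, 0.0), 2: (0.0, 1.0)}
grad_H1 = (1.0, 1.0)


def solves_ccq_system(d, k, tol=1e-12):
    """Check the CCQ-MOPVC system for index k when only I00 = {1} is nonempty."""
    strict_descent = all(dot(g, d) < 0 for i, g in grad_f.items() if i != k)
    on_kernel = abs(dot(grad_H1, d)) <= tol   # grad H_1(x*)^T d = 0 on I00
    return strict_descent and on_kernel


print(solves_ccq_system((1.0, -1.0), k=1))
print(solves_ccq_system((-1.0, 1.0), k=2))
```

A different direction is needed for each $k$, which is permitted because the system only has to be solvable for each $k$ separately.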
Now, we give a constraint qualification for the MOPVC (6.8), which is a modified
version of the standard MFCQ, and is a multiobjective analog of VC-MFCQ
introduced in Achtziger and Kanzow (2008).
Definition 6.8. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then,
the Mangasarian-Fromovitz constraint qualification (MFCQ) for the MOPVC (6.8),
denoted by MFCQ-MOPVC, holds at $x^*$ iff the gradients

$$\nabla f_i(x^*)\ (i \in I_f), \qquad \nabla h_i(x^*)\ (i \in I_h), \qquad \nabla H_i(x^*)\ (i \in I_{00} \cup I_{0+}),$$

are linearly independent, and the system

$$\begin{aligned}
\nabla f_i(x^*)^T d &= 0, && \forall i \in I_f, \\
\nabla g_i(x^*)^T d &< 0, && \forall i \in I_g, \\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h, \\
\nabla H_i(x^*)^T d &= 0, && \forall i \in I_{0+} \cup I_{00}, \\
\nabla H_i(x^*)^T d &> 0, && \forall i \in I_{0-}, \\
\nabla G_i(x^*)^T d &< 0, && \forall i \in I_{+0},
\end{aligned} \tag{6.29}$$

has a solution $d \in \mathbb{R}^n$.
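The following result gives the relationship between the CCQ-MOPVC and the
MFCQ-MOPVC.
Lemma 6.5. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If MFCQ-MOPVC holds at $x^* \in X$, then CCQ-MOPVC also holds at $x^*$.
Proof. Suppose that the MFCQ-MOPVC holds at $x^*$, but the CCQ-MOPVC does
not hold at $x^*$. Then, there exists $k \in \{1, \ldots, m\}$ such that the system (6.24) has no
solution $d \in \mathbb{R}^n$. By Motzkin's theorem of the alternative (Mangasarian 1969), there
exist real numbers $\lambda_i \ge 0$ $(i \in I_f,\ i \neq k)$, $\mu_i \ge 0$ $(i \in I_g)$, $\eta_i^H \ge 0$ $(i \in I_{0-})$, $\eta_i^G \ge 0$ $(i \in I_{+0})$, not all zero, and $\eta_i \in \mathbb{R}$ $(i \in I_h)$, $\tilde{\eta}_i^H \in \mathbb{R}$ $(i \in I_{0+} \cup I_{00})$, such that

$$\sum_{\substack{i=1 \\ i \neq k}}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i \in I_g} \mu_i \nabla g_i(x^*) + \sum_{i \in I_h} \eta_i \nabla h_i(x^*) - \sum_{i \in I_{00} \cup I_{0+}} \tilde{\eta}_i^H \nabla H_i(x^*) - \sum_{i \in I_{0-}} \eta_i^H \nabla H_i(x^*) + \sum_{i \in I_{+0}} \eta_i^G \nabla G_i(x^*) = 0. \tag{6.30}$$

Suppose that $d \in \mathbb{R}^n$ solves the system (6.29); then from (6.30), one has

$$\sum_{i \in I_g} \mu_i \nabla g_i(x^*)^T d - \sum_{i \in I_{0-}} \eta_i^H \nabla H_i(x^*)^T d + \sum_{i \in I_{+0}} \eta_i^G \nabla G_i(x^*)^T d = 0.$$

Using (6.29), the above equation implies that

$$\begin{aligned}
\mu_i &= 0, && \forall i \in I_g, \\
\eta_i^H &= 0, && \forall i \in I_{0-}, \\
\eta_i^G &= 0, && \forall i \in I_{+0}.
\end{aligned}$$

Substituting these values in (6.30), one has

$$\sum_{\substack{i=1 \\ i \neq k}}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i \in I_h} \eta_i \nabla h_i(x^*) - \sum_{i \in I_{00} \cup I_{0+}} \tilde{\eta}_i^H \nabla H_i(x^*) = 0.$$

Since

$$\nabla f_i(x^*)\ (i \in I_f), \qquad \nabla h_i(x^*)\ (i \in I_h), \qquad \nabla H_i(x^*)\ (i \in I_{00} \cup I_{0+})$$

are all linearly independent, one has

$$\begin{aligned}
\lambda_i &= 0, && \forall i \in I_f,\ i \neq k, \\
\eta_i &= 0, && \forall i \in I_h, \\
\tilde{\eta}_i^H &= 0, && \forall i \in I_{00} \cup I_{0+},
\end{aligned}$$

a contradiction to the fact that the multipliers are not all zero, and hence the
result. □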
The following constraint qualification is a modification of the standard SCQ for
the MOPVC (6.8) and serves as a sufficient condition for the CCQ-MOPVC to hold.
Definition 6.9. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then,
the Slater-type constraint qualification for the MOPVC (6.8), denoted by SCQ-MOPVC, holds at $x^* \in X$ iff the functions

$$f_i\ (i \in I_f), \qquad g_i\ (i \in I_g), \qquad G_i\ (i \in I_{+0}),$$

are all convex on $\mathbb{R}^n$,

$$H_i\ (i \in I_{0-})$$

are all concave on $\mathbb{R}^n$, and

$$H_i\ (i \in I_{00} \cup I_{0+}), \qquad h_i\ (i \in I_h)$$

are all affine on $\mathbb{R}^n$, and for each $k = 1, 2, \ldots, m$ the system

$$\begin{aligned}
f_i(x) &< f_i(x^*), && \forall i \in I_f,\ i \neq k, \\
g_i(x) &< 0, && \forall i \in I_g, \\
h_i(x) &= 0, && \forall i \in I_h, \\
H_i(x) &= 0, && \forall i \in I_{0+} \cup I_{00}, \\
H_i(x) &> 0, && \forall i \in I_{0-}, \\
G_i(x) &< 0, && \forall i \in I_{+0},
\end{aligned} \tag{6.31}$$

has a solution $x \in \mathbb{R}^n$.
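The following result gives the relationship between the CCQ-MOPVC and the
SCQ-MOPVC.
Lemma 6.6. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If SCQ-MOPVC holds at $x^*$, then CCQ-MOPVC also holds at $x^*$.
Proof. Suppose that SCQ-MOPVC holds at $x^*$. Then, for each $k = 1, \ldots, m$, there
exists an $x^k \in \mathbb{R}^n$ such that

$$\begin{aligned}
f_i(x^k) &< f_i(x^*), && \forall i \in I_f,\ i \neq k, \\
g_i(x^k) &< 0, && \forall i \in I_g, \\
h_i(x^k) &= 0, && \forall i \in I_h, \\
H_i(x^k) &= 0, && \forall i \in I_{0+} \cup I_{00}, \\
H_i(x^k) &> 0, && \forall i \in I_{0-}, \\
G_i(x^k) &< 0, && \forall i \in I_{+0}.
\end{aligned} \tag{6.32}$$

Since the functions $f_i$ $(i \in I_f)$, $g_i$ $(i \in I_g)$, $G_i$ $(i \in I_{+0})$ are all convex on
$\mathbb{R}^n$, $H_i$ $(i \in I_{0-})$ are all concave on $\mathbb{R}^n$, and $H_i$ $(i \in I_{00} \cup I_{0+})$, $h_i$ $(i \in I_h)$ are all
affine on $\mathbb{R}^n$, one has

$$\begin{aligned}
\nabla f_i(x^*)^T (x^k - x^*) &\le f_i(x^k) - f_i(x^*) < 0, && \forall i \in I_f,\ i \neq k, \\
\nabla g_i(x^*)^T (x^k - x^*) &\le g_i(x^k) - g_i(x^*) < 0, && \forall i \in I_g, \\
\nabla h_i(x^*)^T (x^k - x^*) &= h_i(x^k) - h_i(x^*) = 0, && \forall i \in I_h, \\
\nabla H_i(x^*)^T (x^k - x^*) &= H_i(x^k) - H_i(x^*) = 0, && \forall i \in I_{0+} \cup I_{00}, \\
\nabla H_i(x^*)^T (x^k - x^*) &\ge H_i(x^k) - H_i(x^*) > 0, && \forall i \in I_{0-}, \\
\nabla G_i(x^*)^T (x^k - x^*) &\le G_i(x^k) - G_i(x^*) < 0, && \forall i \in I_{+0}.
\end{aligned} \tag{6.33}$$

Setting $d^k := x^k - x^*$, (6.33) implies that the CCQ-MOPVC holds at $x^*$. □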
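The key step in the proof is the first-order inequality $\nabla f(x^*)^T (x - x^*) \le f(x) - f(x^*)$ for a differentiable convex function, which turns a Slater point into a strict descent direction. A minimal numerical illustration follows; the function $f(x) = x_1^2 + x_2^2$ and the two points are hypothetical choices made only for this sketch.

```python
def f(x):
    """A convex sample function, f(x) = x_1^2 + x_2^2."""
    return x[0] ** 2 + x[1] ** 2


def grad_f(x):
    """Gradient of the sample function."""
    return [2.0 * x[0], 2.0 * x[1]]


def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


x_star = [1.0, 1.0]   # reference point
x_k = [0.5, 0.0]      # point with f(x_k) < f(x_star)

d_k = [a - b for a, b in zip(x_k, x_star)]   # direction d^k = x^k - x*
slope = dot(grad_f(x_star), d_k)             # directional derivative along d^k
decrease = f(x_k) - f(x_star)                # actual change in f

print(slope, decrease)
```

Because the slope is bounded above by a negative decrease, the direction $d^k$ satisfies the strict inequality required in (6.24).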
The following constraint qualifications are modifications of LCQ and LOCQ for
the MOPVC (6.8).
Definition 6.10. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, the
linear constraint qualification for the MOPVC (6.8), denoted by LCQ-MOPVC,
holds at $x^*$ iff the functions

$$f_i\ (i \in I_f), \qquad g_i\ (i \in I_g), \qquad h_i\ (i \in I_h), \qquad H_i\ (i \in I_0), \qquad G_i\ (i \in I_{+0}),$$

are all affine.

Definition 6.11. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, the
linear objective constraint qualification for the MOPVC (6.8), denoted by LOCQ-MOPVC, holds at $x^*$ iff the functions

$$f_i\ (i \in I_f)$$

are all affine, and the system

$$\begin{aligned}
\nabla f_i(x^*)^T d &\le 0, && \forall i \in I_f, \\
\nabla g_i(x^*)^T d &< 0, && \forall i \in I_g, \\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h, \\
\nabla H_i(x^*)^T d &= 0, && \forall i \in I_{0+} \cup I_{00}, \\
\nabla H_i(x^*)^T d &> 0, && \forall i \in I_{0-}, \\
\nabla G_i(x^*)^T d &< 0, && \forall i \in I_{+0},
\end{aligned} \tag{6.34}$$

has a solution $d \in \mathbb{R}^n$.

The following result gives the relationship between the LCQ-MOPVC and the
standard GGCQ.
Lemma 6.7. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) such that $I_{00} = \emptyset$. If
LCQ-MOPVC holds at $x^*$, then the standard GGCQ also holds at $x^*$.
Proof. Suppose that $x^* \in X$ is an efficient solution of the MOPVC (6.8) with $I_{00} = \emptyset$. Then, the MOPVC (6.8) is locally equivalent to the MOP (6.25). Hence, for
$I_{00} = \emptyset$, LCQ-MOPVC is identical to the standard LCQ of the MOP (6.25). Since
LCQ of the MOP (6.25) holds at $x^*$, it follows that GGCQ of the MOP (6.25) also
holds at $x^*$, and proceeding as in Lemma 6.4, we get the required result. □
The proof of the following result is similar to the proof of Lemma 6.7.
Lemma 6.8. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) such that
$I_{00} = \emptyset$. If LOCQ-MOPVC holds at $x^*$, then GGCQ also holds at $x^*$.
Now, we give a constraint qualification of the MOPVC (6.8), which serves as a
sufficient condition for the MFCQ-MOPVC to hold, and is a multiobjective analog
of VC-LICQ introduced in Achtziger and Kanzow (2008).
Definition 6.12. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then,
the linear independence constraint qualification of the MOPVC (6.8), denoted by
LICQ-MOPVC, holds at $x^*$ iff for each $k = 1, 2, \ldots, m$, the gradients

$$\nabla f_i(x^*)\ (i \in I_f,\ i \neq k), \qquad \nabla g_i(x^*)\ (i \in I_g), \qquad \nabla h_i(x^*)\ (i \in I_h), \qquad \nabla H_i(x^*)\ (i \in I_0), \qquad \nabla G_i(x^*)\ (i \in I_{+0}),$$

are linearly independent.

The next result is a direct consequence of Definitions 6.8 and 6.12.
Lemma 6.9. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If LICQ-MOPVC holds at $x^*$, then MFCQ-MOPVC also holds at $x^*$.
A summary of the above results is given in Fig. 6.10, and we have the
following theorem.
Theorem 6.3. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) with
$I_{00} = \emptyset$. If any of the constraint qualifications given by Definitions 6.7–6.12 holds
at $x^*$, then the standard GGCQ holds at $x^*$ and there exist Lagrange multipliers
satisfying (6.18) and (6.19).
[Figure: diagram with the nodes LOCQ-MOPVC, LCQ-MOPVC, ACQ, GACQ, MFCQ-MOPVC, CCQ-MOPVC, GGCQ, LICQ-MOPVC and SCQ-MOPVC]

Fig. 6.10 Relationships among modified constraint qualifications

6.5 A Modified Generalized Guignard Constraint Qualification

It was observed in Sect. 6.3 that the standard GGCQ may or may not hold at an
efficient solution of the MOPVC (6.8) when $I_{00} \neq \emptyset$. In this section, we introduce
a suitable modification of the GGCQ of the MOPVC (6.8), and use it to prove
necessary optimality conditions for efficiency in the MOPVC (6.8) that are different
from the standard KKT conditions given by Theorem 6.2. We also provide various
sufficient conditions for the modified GGCQ to hold.
In order to define a modified Guignard constraint qualification, we introduce a nonlinear multiobjective optimization problem (NLMOP) derived from the
MOPVC (6.8) depending on an efficient solution $x^* \in X$ as follows

$$\begin{aligned}
\min\ & f(x) := (f_1(x), \ldots, f_m(x)) \\
\text{s.t.}\ & g_i(x) \le 0, && \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0, && \forall i = 1, 2, \ldots, q, \\
& H_i(x) = 0,\ G_i(x) \ge 0, && \forall i \in I_{0+}, \\
& H_i(x) \ge 0,\ G_i(x) \le 0, && \forall i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-}.
\end{aligned} \tag{6.35}$$
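Also, define the sets $\bar{Q}^k$ and $\bar{Q}$ as follows

$$\begin{aligned}
\bar{Q}^k := \{x \in \mathbb{R}^n \mid\ & f_i(x) \le f_i(x^*),\ \forall i = 1, 2, \ldots, m,\ i \neq k, \\
& g_i(x) \le 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0,\ \forall i = 1, 2, \ldots, q, \\
& H_i(x) = 0,\ G_i(x) \ge 0,\ \forall i \in I_{0+}, \\
& H_i(x) \ge 0,\ G_i(x) \le 0,\ \forall i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-}\}
\end{aligned} \tag{6.36}$$

and

$$\begin{aligned}
\bar{Q} := \{x \in \mathbb{R}^n \mid\ & f_i(x) \le f_i(x^*),\ \forall i = 1, 2, \ldots, m, \\
& g_i(x) \le 0,\ \forall i = 1, 2, \ldots, p, \\
& h_i(x) = 0,\ \forall i = 1, 2, \ldots, q, \\
& H_i(x) = 0,\ G_i(x) \ge 0,\ \forall i \in I_{0+}, \\
& H_i(x) \ge 0,\ G_i(x) \le 0,\ \forall i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-}\}.
\end{aligned} \tag{6.37}$$

The linearizing cone to $\bar{Q}^k$ at $x^* \in X$ is given by

$$\begin{aligned}
L(\bar{Q}^k; x^*) = \{d \in \mathbb{R}^n \mid\ & \nabla f_i(x^*)^T d \le 0,\ \forall i = 1, \ldots, m,\ i \neq k, \\
& \nabla g_i(x^*)^T d \le 0,\ \forall i \in I_g, \\
& \nabla h_i(x^*)^T d = 0,\ \forall i \in I_h, \\
& \nabla H_i(x^*)^T d = 0,\ \forall i \in I_{0+}, \\
& \nabla H_i(x^*)^T d \ge 0,\ \forall i \in I_{00} \cup I_{0-}, \\
& \nabla G_i(x^*)^T d \le 0,\ \forall i \in I_{00} \cup I_{+0}\}.
\end{aligned} \tag{6.38}$$

The linearizing cone to $\bar{Q}$ at $x^* \in \bar{Q}$ is given by

$$L(\bar{Q}; x^*) = \bigcap_{k=1}^{m} L(\bar{Q}^k; x^*). \tag{6.39}$$

The following lemma gives the relationship between the tangent cones
$T(\bar{Q}^k; x^*)$, $k = 1, 2, \ldots, m$, and the linearizing cone $L(\bar{Q}; x^*)$.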
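Lemma 6.10. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, we have

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*) \subseteq L(\bar{Q}; x^*).$$

Proof. Suppose that $x^* \in X$ is an efficient solution of the MOPVC (6.8). By Maeda
(1994, Lemma 3.1), we always have

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*) \subseteq L(Q; x^*), \tag{6.40}$$

and

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*) \subseteq \bigcap_{k=1}^{m} L(\bar{Q}^k; x^*) = L(\bar{Q}; x^*). \tag{6.41}$$

Also, since $\bar{Q}^k \subseteq Q^k$, $\forall k = 1, \ldots, m$, one has

$$T(\bar{Q}^k; x^*) \subseteq T(Q^k; x^*), \quad \forall k = 1, \ldots, m, \tag{6.42}$$

and

$$L(\bar{Q}^k; x^*) \subseteq L(Q^k; x^*), \quad \forall k = 1, \ldots, m. \tag{6.43}$$

Combining (6.40)–(6.43), we have

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*) \subseteq L(\bar{Q}; x^*), \tag{6.44}$$

and hence the result. □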
In view of Lemma 6.10, we are now in a position to define some modified
constraint qualifications of the MOPVC (6.8).
Definition 6.13. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a
modified GGCQ of the NLMOP (6.35), denoted by GGCQ-NLMOP, is said to hold
at $x^*$ iff

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*).$$

Definition 6.14. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a
modified GGCQ of the MOPVC (6.8), denoted by GGCQ-MOPVC, is said to hold
at $x^*$ iff

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*).$$
The following result gives the relationship between the GGCQ-NLMOP and the
GGCQ-MOPVC.
Lemma 6.11. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If the
GGCQ-NLMOP holds at $x^*$, then the GGCQ-MOPVC also holds at $x^* \in X$.
Proof. Suppose that $x^* \in X$ is an efficient solution of the MOPVC (6.8) such that
GGCQ-NLMOP holds at $x^*$; then one has

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*). \tag{6.45}$$

Since $\bar{Q}^k \subseteq Q^k$, $\forall k = 1, 2, \ldots, m$, it follows that $T(\bar{Q}^k; x^*) \subseteq T(Q^k; x^*)$, $\forall k = 1, 2, \ldots, m$, and hence

$$\bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(\bar{Q}^k; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*). \tag{6.46}$$

Also, we always have $L(\bar{Q}^k; x^*) \subseteq L(Q^k; x^*)$, $\forall k = 1, \ldots, m$, from which it follows
that

$$\bigcap_{k=1}^{m} L(\bar{Q}^k; x^*) \subseteq \bigcap_{k=1}^{m} L(Q^k; x^*). \tag{6.47}$$

Combining (6.45)–(6.47), one has

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*),$$

which implies that the GGCQ-MOPVC holds at $x^*$. □
Remark 6.2. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) such that the
standard GGCQ holds at $x^*$. Then, GGCQ-MOPVC is also satisfied at $x^*$, since we
always have

$$L(\bar{Q}; x^*) \subseteq L(Q; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl}\operatorname{co} T(Q^k; x^*).$$
The following example says that GGCQ-MOPVC is strictly weaker than the
standard GGCQ.
6 Multiobjective Optimization Problems with Vanishing Constraints 125

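Example 6.5. Consider the following MOPVC given by

$$\begin{aligned}
\min\ & f(x_1, x_2) := (x_1 + x_2,\ x_2 - x_1),\\
\text{s.t.}\ & H_1(x) := x_1 + x_2 \ge 0,\\
& G_1(x) H_1(x) := x_1 (x_1 + x_2) \le 0,
\end{aligned}$$

which is a MOPVC (6.8) with $n = 2$, $m = 2$, $p = q = 0$ and $r = 1$. It is easy to see that the origin $x^* := (0, 0)$ is an efficient solution of the MOPVC over the feasible region $X$ given by

$$X = \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 0,\ x_1 (x_1 + x_2) \le 0\},$$

and $I_{00} = \{1\}$. Also, the sets $Q^1$, $Q^2$, $Q$ and $\bar{Q}$ are given by

$$\begin{aligned}
Q^1 &= \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 0,\ x_1 (x_1 + x_2) \le 0,\ x_2 - x_1 \le 0\},\\
Q^2 &= \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 0,\ x_1 (x_1 + x_2) \le 0,\ x_1 + x_2 \le 0\},\\
Q &= \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 0,\ x_1 (x_1 + x_2) \le 0,\ x_1 + x_2 \le 0,\ x_2 - x_1 \le 0\},\\
\bar{Q} &= \{x \in \mathbb{R}^2 : x_1 + x_2 \ge 0,\ x_1 \le 0,\ x_1 + x_2 \le 0,\ x_2 - x_1 \le 0\}.
\end{aligned}$$

It is clear that

$$\begin{aligned}
T(Q^1; x^*) &= \{d \in \mathbb{R}^2 : d_1 + d_2 \ge 0,\ d_1 (d_1 + d_2) \le 0,\ d_2 - d_1 \le 0\},\\
T(Q^2; x^*) &= \{d \in \mathbb{R}^2 : d_1 + d_2 \ge 0,\ d_1 (d_1 + d_2) \le 0,\ d_1 + d_2 \le 0\},\\
L(Q; x^*) &= \{d \in \mathbb{R}^2 : d_1 + d_2 \ge 0,\ d_2 - d_1 \le 0,\ d_1 + d_2 \le 0\},\\
L(\bar{Q}; x^*) &= \{d \in \mathbb{R}^2 : d_1 + d_2 \ge 0,\ d_2 - d_1 \le 0,\ d_1 + d_2 \le 0,\ d_1 \le 0\},
\end{aligned}$$

which implies that

$$L(Q; x^*) \not\subseteq \bigcap_{k=1}^{2} \operatorname{cl\,co} T(Q^k; x^*),$$

whereas

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{2} \operatorname{cl\,co} T(Q^k; x^*);$$

hence the GGCQ-MOPVC holds whereas the standard GGCQ is not satisfied at $x^*$.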
When the GGCQ-MOPVC holds at an efficient solution $x^* \in X$ of the MOPVC (6.8),
the KKT conditions of Theorem 6.2 may not hold, since the GGCQ-MOPVC is weaker
than the standard GGCQ. Hence, in the following result, we derive KKT type
necessary optimality conditions for efficiency under the GGCQ-MOPVC.

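Theorem 6.4. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8) such that the GGCQ-MOPVC holds at $x^*$. Then there exist Lagrange multipliers $\lambda_i \in \mathbb{R}$ $(i = 1, \dots, m)$, $\mu_i \in \mathbb{R}$ $(i = 1, \dots, p)$, $\nu_i \in \mathbb{R}$ $(i = 1, \dots, q)$ and $\eta_i^H, \eta_i^G \in \mathbb{R}$ $(i = 1, \dots, r)$ such that

$$\sum_{i=1}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i=1}^{p} \mu_i \nabla g_i(x^*) + \sum_{i=1}^{q} \nu_i \nabla h_i(x^*) - \sum_{i=1}^{r} \eta_i^H \nabla H_i(x^*) + \sum_{i=1}^{r} \eta_i^G \nabla G_i(x^*) = 0, \tag{6.48}$$

and

$$\begin{aligned}
&\lambda_i > 0, \quad \forall i = 1, 2, \dots, m;\\
&g_i(x^*) \le 0,\ \mu_i \ge 0,\ \mu_i g_i(x^*) = 0, \quad \forall i = 1, 2, \dots, p;\\
&h_i(x^*) = 0, \quad \forall i = 1, 2, \dots, q;\\
&\eta_i^H = 0\ (i \in I_{+}),\quad \eta_i^H \ge 0\ (i \in I_{00} \cup I_{0-}),\quad \eta_i^H \text{ free}\ (i \in I_{0+}),\quad \eta_i^H H_i(x^*) = 0, \quad \forall i = 1, 2, \dots, r;\\
&\eta_i^G = 0\ (i \in I_{0+} \cup I_{0-} \cup I_{+-}),\quad \eta_i^G \ge 0\ (i \in I_{00} \cup I_{+0}),\quad \eta_i^G G_i(x^*) = 0, \quad \forall i = 1, 2, \dots, r.
\end{aligned} \tag{6.49}$$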
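The sign pattern in (6.49) is driven entirely by which index set each vanishing constraint falls into at $x^*$. The following Python sketch is an illustration only: it assumes the usual index sets for vanishing constraints ($I_{+}$ where $H_i(x^*) > 0$, $I_{0}$ where $H_i(x^*) = 0$, each split further by the sign of $G_i(x^*)$), and the function names, the tolerance `tol` and the labels `"zero"`, `"free"`, `"nonnegative"` are hypothetical conveniences, not part of the chapter.

```python
def vc_index_sets(H, G, tol=1e-9):
    """Partition the indices 1..r by the signs of H_i(x*) and G_i(x*).

    H and G are sequences of the values H_i(x*) and G_i(x*).
    Indices are 1-based to match the notation of the text.
    """
    sets = {"I+0": [], "I+-": [], "I0+": [], "I00": [], "I0-": []}
    for i, (h, g) in enumerate(zip(H, G), start=1):
        # Feasibility requires H_i >= 0 and G_i * H_i <= 0.
        if h < -tol or (h > tol and g > tol):
            raise ValueError(f"constraint {i} is violated at this point")
        row = "+" if h > tol else "0"
        col = "+" if g > tol else ("-" if g < -tol else "0")
        sets[f"I{row}{col}"].append(i)
    return sets


def multiplier_pattern(sets):
    """Return the restriction on (eta_H, eta_G) for each index, following (6.49)."""
    pattern = {}
    for name, indices in sets.items():
        for i in indices:
            if name in ("I+0", "I+-"):      # H_i(x*) > 0, so eta_H must vanish
                eta_h = "zero"
            elif name == "I0+":
                eta_h = "free"
            else:                           # I00 or I0-
                eta_h = "nonnegative"
            eta_g = "nonnegative" if name in ("I+0", "I00") else "zero"
            pattern[i] = (eta_h, eta_g)
    return pattern


# Example 6.5 at the origin: H_1 = 0 and G_1 = 0, so the single index lies in I00.
print(vc_index_sets([0.0], [0.0]))
```

With these labels, an index in $I_{00}$ gets nonnegative multipliers for both $H_i$ and $G_i$, which is the case that distinguishes the vanishing-constraint setting from a standard nonlinear program.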

Proof. Suppose that $x^*$ is an efficient solution of the MOPVC (6.8). We will first show that the system

$$\begin{aligned}
\nabla f_i(x^*)^T d &\le 0, && \forall i = 1, 2, \dots, m,\\
\nabla f_i(x^*)^T d &< 0, && \text{for at least one } i,\\
\nabla g_i(x^*)^T d &\le 0, && \forall i \in I_g,\\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h,\\
\nabla H_i(x^*)^T d &= 0, && \forall i \in I_{0+},\\
\nabla H_i(x^*)^T d &\ge 0, && \forall i \in I_{00} \cup I_{0-},\\
\nabla G_i(x^*)^T d &\le 0, && \forall i \in I_{00} \cup I_{+0},
\end{aligned} \tag{6.50}$$

has no solution $d \in \mathbb{R}^n$. Suppose to the contrary that the system (6.50) has a solution $d \in \mathbb{R}^n$; then $d \in L(\bar{Q}; x^*)$. Without loss of generality, we may assume that

$$\begin{aligned}
\nabla f_k(x^*)^T d &< 0, && \text{for some } k,\ 1 \le k \le m,\\
\nabla f_i(x^*)^T d &\le 0, && \forall i = 1, \dots, m,\ i \ne k.
\end{aligned}$$

Since the GGCQ-MOPVC holds at $x^*$, we have $d \in \operatorname{cl\,co} T(Q^i; x^*)$ for all $i = 1, \dots, m$. In particular,

$$d \in \operatorname{cl\,co} T(Q^k; x^*).$$

Hence, there exists a sequence $\{d_l\} \subseteq \operatorname{co} T(Q^k; x^*)$ such that $d_l \to d$. Now, for each $d_l$, $l = 1, 2, \dots$, there exist a number $N_l$, scalars $\alpha_{lj} \ge 0$ and vectors $d_{lj} \in T(Q^k; x^*)$, $j = 1, 2, \dots, N_l$, such that

$$\sum_{j=1}^{N_l} \alpha_{lj} = 1, \qquad \sum_{j=1}^{N_l} \alpha_{lj} d_{lj} = d_l.$$

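Since $d_{lj} \in T(Q^k; x^*)$ for each $l = 1, 2, \dots$ and $j = 1, 2, \dots, N_l$, there exist sequences $\{x_{lj}^s\} \subseteq Q^k$ and $\{t_{lj}^s\} \downarrow 0$ such that $x_{lj}^s \to x^*$ and $(x_{lj}^s - x^*)/t_{lj}^s \to d_{lj}$. Setting $d_{lj}^s := (x_{lj}^s - x^*)/t_{lj}^s$ for all $s = 1, 2, \dots$, we have

$$\begin{aligned}
f_i(x_{lj}^s) &= f_i(x^* + t_{lj}^s d_{lj}^s) \le f_i(x^*), && \forall i = 1, \dots, m,\ i \ne k,\\
g_i(x_{lj}^s) &= g_i(x^* + t_{lj}^s d_{lj}^s) \le 0 = g_i(x^*), && \forall i \in I_g,\\
h_i(x_{lj}^s) &= h_i(x^* + t_{lj}^s d_{lj}^s) = 0 = h_i(x^*), && \forall i \in I_h,\\
H_i(x_{lj}^s) &= H_i(x^* + t_{lj}^s d_{lj}^s) \ge 0 = H_i(x^*), && \forall i \in I_{0-} \cup I_{00} \cup I_{0+},\\
\theta_i(x_{lj}^s) &= \theta_i(x^* + t_{lj}^s d_{lj}^s) \le 0 = \theta_i(x^*), && \forall i \in I_{0-} \cup I_{00} \cup I_{0+} \cup I_{+0}.
\end{aligned} \tag{6.51}$$

Also, since $x^*$ is an efficient solution of the MOPVC (6.8), for all $s = 1, 2, \dots$, we have

$$f_k(x_{lj}^s) = f_k(x^* + t_{lj}^s d_{lj}^s) \ge f_k(x^*). \tag{6.52}$$

From (6.51) and (6.52), we have

$$\begin{aligned}
\nabla f_k(x^*)^T d_{lj} &\ge 0,\\
\nabla f_i(x^*)^T d_{lj} &\le 0, && \forall i = 1, \dots, m,\ i \ne k,\\
\nabla g_i(x^*)^T d_{lj} &\le 0, && \forall i \in I_g,\\
\nabla h_i(x^*)^T d_{lj} &= 0, && \forall i \in I_h,\\
\nabla H_i(x^*)^T d_{lj} &\ge 0, && \forall i \in I_{0+} \cup I_{00} \cup I_{0-},\\
\nabla \theta_i(x^*)^T d_{lj} &\le 0, && \forall i \in I_{0+} \cup I_{00} \cup I_{0-} \cup I_{+0}.
\end{aligned}$$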
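By linearity and continuity of the inner product and using the expression of $\nabla \theta_i(x^*)$ from (6.14), it follows that

$$\begin{aligned}
\nabla f_k(x^*)^T d &\ge 0,\\
\nabla f_i(x^*)^T d &\le 0, && \forall i = 1, \dots, m,\ i \ne k,\\
\nabla g_i(x^*)^T d &\le 0, && \forall i \in I_g,\\
\nabla h_i(x^*)^T d &= 0, && \forall i \in I_h,\\
\nabla H_i(x^*)^T d &= 0, && \forall i \in I_{0+},\\
\nabla H_i(x^*)^T d &\ge 0, && \forall i \in I_{00} \cup I_{0-},\\
\nabla G_i(x^*)^T d &\le 0, && \forall i \in I_{+0},
\end{aligned}$$

which in view of Lemma 6.1 implies that $\nabla f_k(x^*)^T d = 0$, since $d \in L(\bar{Q}; x^*) \subseteq L(Q; x^*)$. This contradicts our assumption, and hence the system (6.50) has no solution $d \in \mathbb{R}^n$. Now, by Tucker's theorem of the alternative (Mangasarian 1969), there exist Lagrange multipliers

$$\lambda_i \in \mathbb{R}\ (i = 1, \dots, m),\quad \mu_i \in \mathbb{R}\ (i \in I_g),\quad \nu_i \in \mathbb{R}\ (i = 1, \dots, q),\quad \eta_i^H \in \mathbb{R}\ (i \in I_{0+} \cup I_{00} \cup I_{0-}),\quad \eta_i^G \in \mathbb{R}\ (i \in I_{00} \cup I_{+0}),$$

such that

$$\sum_{i=1}^{m} \lambda_i \nabla f_i(x^*) + \sum_{i \in I_g} \mu_i \nabla g_i(x^*) + \sum_{i=1}^{q} \nu_i \nabla h_i(x^*) - \sum_{i \in I_0} \eta_i^H \nabla H_i(x^*) + \sum_{i \in I_{00} \cup I_{+0}} \eta_i^G \nabla G_i(x^*) = 0,$$

and

$$\begin{aligned}
&\lambda_i > 0, \quad \forall i = 1, 2, \dots, m;\\
&\mu_i \ge 0, \quad \forall i \in I_g;\\
&\nu_i \in \mathbb{R}, \quad \forall i = 1, 2, \dots, q;\\
&\eta_i^H \in \mathbb{R}, \quad \forall i \in I_{0+}; \qquad \eta_i^H \ge 0, \quad \forall i \in I_{00} \cup I_{0-};\\
&\eta_i^G \ge 0, \quad \forall i \in I_{00} \cup I_{+0}.
\end{aligned}$$

Setting

$$\eta_i^H = 0, \quad \forall i \in I_{+}; \qquad \eta_i^G = 0, \quad \forall i \in I_{0+} \cup I_{0-} \cup I_{+-},$$

the required necessary optimality conditions (6.48) and (6.49) follow. □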
For $m = 1$, the above necessary optimality conditions reduce to the VC-KKT conditions of the MPVC (Achtziger and Kanzow 2008, Theorem 4); we call them the KKT-MOPVC conditions.
We now give some constraint qualifications which assure that the GGCQ-MOPVC holds.
Definition 6.15. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a modified GACQ of the MOPVC (6.8), denoted by GACQ-MOPVC, holds at $x^*$, iff

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} T(Q^k; x^*).$$

Remark 6.3. As we always have $L(\bar{Q}; x^*) \subseteq L(Q; x^*)$, the GACQ-MOPVC holds whenever the standard GACQ is satisfied, but the converse is not true in general. Hence, the standard GACQ serves as a sufficient condition for the GACQ-MOPVC to hold.
The following constraint qualification assures that the GACQ-MOPVC holds.
Definition 6.16. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a modified GACQ of the NLMOP (6.35), denoted by GACQ-NLMOP, is said to hold at $x^*$, iff

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} T(\bar{Q}^k; x^*).$$

The following result is a direct consequence of Definitions 6.14–6.16.
Lemma 6.12. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If the GACQ-MOPVC holds at $x^*$, then the GGCQ-MOPVC is satisfied. Moreover, if the GACQ-NLMOP holds at $x^*$, then the GACQ-MOPVC and the standard GACQ are both satisfied at $x^*$.
Now, we give some modifications of the standard ACQ which assure that the GACQ-MOPVC holds. The following constraint qualification can be treated as a multiobjective analog of the VC-Abadie constraint qualification introduced in Achtziger and Kanzow (2008).
Definition 6.17. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a modified ACQ of the MOPVC (6.8), denoted by ACQ-MOPVC, is said to hold at $x^*$, iff

$$L(\bar{Q}; x^*) \subseteq T(Q; x^*).$$

The ACQ-MOPVC is weaker than the standard ACQ, as we always have $L(\bar{Q}; x^*) \subseteq L(Q; x^*)$.
Definition 6.18. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). Then, a modified ACQ of the NLMOP (6.35), denoted by ACQ-NLMOP, holds at $x^*$, iff

$$L(\bar{Q}; x^*) \subseteq T(\bar{Q}; x^*).$$

The following result is a direct consequence of Definitions 6.15, 6.17 and 6.18.
Lemma 6.13. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If the ACQ-MOPVC holds at $x^*$, then the GACQ-MOPVC is also satisfied at $x^*$. Moreover, if the ACQ-NLMOP holds at $x^*$, then the ACQ-MOPVC and the standard ACQ are both satisfied at $x^*$.
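We now give some more sufficient conditions which assure that the GGCQ-MOPVC holds at an efficient solution of the MOPVC (6.8).
Theorem 6.5. Let $x^*$ be an efficient solution of the MOPVC (6.8), and consider the following conditions:
1. The standard GGCQ holds for the NLMOP (6.35) at $x^*$;
2. For each $k = 1, \dots, m$, there exists a vector $\hat{d} \in \mathbb{R}^n$ satisfying

$$\begin{aligned}
\nabla f_i(x^*)^T \hat{d} &< 0, && \forall i = 1, 2, \dots, m,\ i \ne k,\\
\nabla g_i(x^*)^T \hat{d} &< 0, && \forall i \in I_g,\\
\nabla h_i(x^*)^T \hat{d} &= 0, && \forall i \in I_h,\\
\nabla H_i(x^*)^T \hat{d} &= 0, && \forall i \in I_{0+},\\
\nabla H_i(x^*)^T \hat{d} &> 0, && \forall i \in I_{00} \cup I_{0-},\\
\nabla G_i(x^*)^T \hat{d} &< 0, && \forall i \in I_{00} \cup I_{+0};
\end{aligned} \tag{6.53}$$

3. The functions $f_i$ $(i = 1, \dots, m)$, $g_i$ $(i = 1, \dots, p)$, $G_i$ $(i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-})$ are all convex, $G_i$ $(i \in I_{0+})$, $H_i$ $(i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-})$ are all concave and $h_i$ $(i = 1, \dots, q)$, $H_i$ $(i \in I_{0+})$ are all affine, and for each $k = 1, \dots, m$, there exists a vector $x \in \mathbb{R}^n$ satisfying

$$\begin{aligned}
f_i(x) &< f_i(x^*), && \forall i = 1, 2, \dots, m,\ i \ne k,\\
g_i(x) &< 0, && \forall i = 1, 2, \dots, p,\\
h_i(x) &= 0, && \forall i = 1, 2, \dots, q,\\
H_i(x) &= 0,\ G_i(x) > 0, && \forall i \in I_{0+},\\
H_i(x) &> 0,\ G_i(x) < 0, && \forall i \in I_{0-} \cup I_{00} \cup I_{+0} \cup I_{+-};
\end{aligned} \tag{6.54}$$

4. The functions $f_i$, $g_i$, $h_i$, $G_i$, $H_i$ are all affine;
5. The functions $f_i$ $(i = 1, \dots, m)$ are all affine, and there exists a vector $\hat{d} \in \mathbb{R}^n$ satisfying

$$\begin{aligned}
\nabla f_i(x^*)^T \hat{d} &\le 0, && \forall i = 1, 2, \dots, m,\\
\nabla g_i(x^*)^T \hat{d} &< 0, && \forall i \in I_g,\\
\nabla h_i(x^*)^T \hat{d} &= 0, && \forall i \in I_h,\\
\nabla H_i(x^*)^T \hat{d} &= 0, && \forall i \in I_{0+},\\
\nabla H_i(x^*)^T \hat{d} &> 0, && \forall i \in I_{00} \cup I_{0-},\\
\nabla G_i(x^*)^T \hat{d} &< 0, && \forall i \in I_{00} \cup I_{+0};
\end{aligned} \tag{6.55}$$

6. The gradients

$$\nabla f_i(x^*)\ (i = 1, \dots, m), \quad \nabla h_i(x^*)\ (i = 1, \dots, q), \quad \nabla H_i(x^*)\ (i \in I_{0+}),$$

are linearly independent, and there exists a vector $\hat{d} \in \mathbb{R}^n$ satisfying

$$\begin{aligned}
\nabla f_i(x^*)^T \hat{d} &= 0, && \forall i = 1, 2, \dots, m,\\
\nabla h_i(x^*)^T \hat{d} &= 0, && \forall i = 1, 2, \dots, q,\\
\nabla H_i(x^*)^T \hat{d} &= 0, && \forall i \in I_{0+},\\
\nabla g_i(x^*)^T \hat{d} &< 0, && \forall i \in I_g,\\
\nabla H_i(x^*)^T \hat{d} &> 0, && \forall i \in I_{00} \cup I_{0-},\\
\nabla G_i(x^*)^T \hat{d} &< 0, && \forall i \in I_{00} \cup I_{+0};
\end{aligned} \tag{6.56}$$

7. The gradients

$$\nabla f_i(x^*)\ (i = 1, 2, \dots, m), \quad \nabla g_i(x^*)\ (i \in I_g), \quad \nabla h_i(x^*)\ (i = 1, \dots, q), \quad \nabla H_i(x^*)\ (i \in I_0), \quad \nabla G_i(x^*)\ (i \in I_{00} \cup I_{+0}),$$

are all linearly independent.

Then, the GGCQ-MOPVC holds at $x^*$ and there exist Lagrange multipliers satisfying (6.48) and (6.49).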
Proof. (1) Since the standard GGCQ holds for the NLMOP (6.35), we have

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl\,co} T(\bar{Q}^k; x^*).$$

Also, $\bar{Q}^k \subseteq Q^k$, $k = 1, 2, \dots, m$, which implies that $T(\bar{Q}^k; x^*) \subseteq T(Q^k; x^*)$, $k = 1, 2, \dots, m$, and hence

$$\bigcap_{k=1}^{m} \operatorname{cl\,co} T(\bar{Q}^k; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl\,co} T(Q^k; x^*).$$

Combining the above inclusions, we have

$$L(\bar{Q}; x^*) \subseteq \bigcap_{k=1}^{m} \operatorname{cl\,co} T(Q^k; x^*),$$

that is, the GGCQ-MOPVC is satisfied at $x^*$.
(2)–(7) The assumptions are the standard CCQ, SCQ, LCQ, LOCQ, MFCQ and LICQ of the NLMOP (6.35), respectively, and hence the standard GGCQ holds for the NLMOP (6.35) at $x^*$, which implies by (1) above that the GGCQ-MOPVC is satisfied at $x^*$.
Hence, by Theorem 6.4, there exist Lagrange multipliers satisfying (6.48) and (6.49). □
The results of this section can be summarized in Fig. 6.11, and we have the following
theorem.
Theorem 6.6. Let $x^* \in X$ be an efficient solution of the MOPVC (6.8). If any of the constraint qualifications given by Definitions 6.13–6.18 holds at $x^*$, then the GGCQ-MOPVC also holds at $x^*$, and there exist Lagrange multipliers satisfying (6.48) and (6.49).

ACQ-NLMOP ACQ GACQ

ACQ-MOPVC GACQ-MOPVC GACQ-NLMOP

GGCQ-MOPVC

GGCQ GGCQ-NLMOP

Fig. 6.11 Sufficient conditions for the GGCQ-MOPVC

6.6 Conclusions

In this chapter, we introduced a new class of multiobjective optimization problems
called the multiobjective optimization problems with vanishing constraints as an
extension of the mathematical programs with vanishing constraints from the scalar
case. We showed that under fairly mild assumptions some constraint qualifications
like the Cottle constraint qualification, Slater constraint qualification, Mangasarian-
Fromovitz constraint qualification, linear independence constraint qualification,
linear objective constraint qualification and linear constraint qualification do not
hold at an efficient solution, whereas the standard generalized Guignard constraint
qualification is sometimes satisfied. We gave various constraint qualifications,
as modifications of the standard constraint qualifications, which assure that the
generalized Guignard constraint qualification holds at an efficient solution. We
also introduced a suitable modification of the generalized Guignard constraint
qualification, gave sufficient conditions which assure that it holds and derived
Karush-Kuhn-Tucker type necessary optimality conditions for efficiency.

Acknowledgements The authors are thankful to the anonymous referees for their valuable
comments and suggestions which helped to improve this chapter in its present form. This work
was done when Vinay Singh was a Post Doctoral Fellow of National Board of Higher Mathematics
(NBHM), Department of Atomic Energy (DAE), Government of India and Vivek Laha was a
Senior Research Fellow of the Council of Scientific and Industrial Research (CSIR), New Delhi,
Ministry of Human Resources Development, Government of India at Department of Mathematics,
Banaras Hindu University.
Currently, Vivek Laha is supported by the Postdoctoral Fellowship of National Board of Higher
Mathematics, Department of Atomic Energy, Government of India (Ref. No. 2/40(47)/2014/R &
D-II/1170).

References

Abadie JM (1967) On the Kuhn-Tucker theorem. In: Abadie JM (ed) Nonlinear programming.
Wiley, New York, pp 21–36
Achtziger W, Kanzow C (2008) Mathematical programs with vanishing constraints: optimality
conditions and constraint qualifications. Math Progr 114:69–99
Aghezzaf B, Hachimi M (2004) Second order duality in multiobjective programming involving
generalized type I functions. Numer Funct Anal Optim 25(7–8):725–736
Aghezzaf B, Hachimi M (2001) Sufficiency and duality in multiobjective programming involving
generalized (F, ρ)-convexity. J Math Anal Appl 258:617–628
Bazaraa MS, Goode JJ, Nashed MZ (1974) On the cones of tangents with applications to
mathematical programming. J Optim Theory Appl 13:389–426
Bigi G, Pappalardo M (1999) Regularity conditions in vector optimization. J Optim Theory Appl
102(1):83–96
Bao TQ, Mordukhovich BS (2007) Existence of minimizers and necessary conditions in set-valued
optimization with equilibrium constraint. Appl Math 52:453–472
Bao TQ, Gupta P, Mordukhovich BS (2007) Necessary conditions in multiobjective optimization
with equilibrium constraints. J Optim Theory Appl 135:179–203
Bao TQ, Gupta P, Mordukhovich BS (2008) Suboptimality conditions for mathematical programs
with equilibrium constraints. Taiwan J Math 12(9):2569–2592
Chinchuluun A, Pardalos PM (2007) A survey of recent developments in multiobjective optimiza-
tion. Ann Oper Res 154:29–50
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley-Interscience, New York
Eschenauer H, Koski J, Osyczka A (eds) (1990) Multicriteria design optimization: procedures and
applications. Springer, Berlin
Facchinei F, Pang J-S (2003) Finite-dimensional variational inequalities and complementarity
problems. Springer, New York
Gould FJ, Tolle JW (1971) A necessary and sufficient qualification for constrained optimization.
SIAM J Appl Math 20:164–172
Guignard M (1969) Generalized Kuhn-Tucker conditions for mathematical programming problems
in a Banach space. SIAM J Contr 7:232–241
Hoheisel T, Kanzow C (2007) First and second order optimality conditions for
mathematical programs with vanishing constraints. Appl Math 52(6):495–514
Hoheisel T, Kanzow C (2008) Stationary conditions for mathematical programs with vanishing
constraints using weak constraint qualifications. J Math Anal Appl 337:292–310
Hoheisel T, Kanzow C (2009) On the Abadie and Guignard constraint qualifications for mathemat-
ical programmes with vanishing constraints. Optimization 58(4):431–448
Hoheisel T, Kanzow C, Outrata JV (2010) Exact penalty results for mathematical programs with
vanishing constraints. Nonlinear Anal 72:2514–2526
Izmailov AF, Solodov MV (2009) Mathematical programs with vanishing constraints: optimality
conditions, sensitivity, and a relaxation method. J Optim Theory Appl 142:501–532
Koski J (1993) Multicriteria optimization in structural design: state of the art. In: Proceedings of
the 19th design automation conferences, Albuquerque. ASME, pp 621–629
Li XF (2000) Constraint qualifications in nonsmooth multiobjective optimization. J Optim Theory
Appl 106(2):373–398
Liang Z-A, Huang H-X, Pardalos PM (2003) Efficiency conditions and duality for a class of
multiobjective fractional programming problems. J Global Optim 27:447–471
Lin P, Zhou P, Wu CW (2011) Multiobjective topology optimization of end plates of proton
exchange membrane fuel cell stacks. J Power Sources 196(3):1222–1228
Luo Z-Q, Pang J-S, Ralph D (1996) Mathematical programs with equilibrium constraints.
Cambridge University Press, Cambridge
Maeda T (1994) Constraint qualifications in multiobjective optimization problems: differentiable
case. J Optim Theory Appl 80(3):483–500

Maeda T (2004) Second order conditions for efficiency in nonsmooth multiobjective optimization
problems. J Optim Theory Appl 122(3):521–538
Mangasarian OL (1969) Nonlinear programming. McGraw-Hill, New York
Min S, Nishiwaki S, Kikuchi N (2000) Unified topology design of static and vibrating structures
using multiobjective optimization. Comput Struct 75:93–116
Mishra SK, Wang S-Y, Lai KK (2005) Optimality and duality for multiple-objective optimization
under generalized type I univexity. J Math Anal Appl 303:315–326
Mordukhovich BS (2004) Equilibrium problems with equilibrium constraints via multiobjective
optimization. Optim Methods Softw 19:479–492
Mordukhovich BS (2006) Variational analysis and generalized differentiation, II: applications.
Grundlehren series (fundamental principles of mathematical sciences), vol 331. Springer,
Berlin
Mordukhovich BS (2009) Multiobjective optimization problems with equilibrium constraints.
Math Progr Ser B 117:331–354
Outrata JV, Kočvara M, Zowe J (1998) Nonsmooth approach to optimization problems with
equilibrium constraints. Kluwer Academic, Dordrecht
Peterson DW (1973) A review of constraint qualifications in finite-dimensional spaces. SIAM Rev
15:639–654
Preda V, Chitescu I (1999) On constraint qualifications in multiobjective optimization problems:
semidifferentiable case. J Optim Theory Appl 100(2):417–433
Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
Stadler W (1984) Multicriteria optimization in mechanics. Appl Mech Rev 37:277–286
Chapter 7
A New Hybrid Optimization Algorithm for
the Estimation of Archie Parameters

Jianjun Liu, Honglei Xu, Guoning Wu, and Kok Lay Teo

Abstract The Archie formula, which contains three fundamental parameters (a, m, n),
is the basic equation for computing the water saturation in a clean or shaly formation.
These parameters are known as the Archie parameters. An accurate identification of the
water saturation for a given reservoir condition depends critically on accurate
estimates of the Archie parameters (a, m, n). These parameters are interdependent,
and hence they are difficult to identify accurately. We therefore present
a new hybrid global optimization technique, in which a gradient-based method with
BFGS update is combined with an intelligent algorithm called Artificial Bee Colony (ABC).
This new hybrid technique has both the fast convergence of a
gradient descent algorithm and the global convergence of a swarm algorithm. It is
used to identify the Archie parameters in carbonate reservoirs, and the results obtained
are highly satisfactory. To further test the effectiveness of the new hybrid global
optimization method, it is applied to ten non-convex benchmark problems. The
outcomes are encouraging.

Keywords Archie parameters • Hybrid global optimization • ABC algorithm • Gradient-based method

J. Liu (✉) • G. Wu
College of Science, China University of Petroleum, Beijing 102249, China
e-mail: liujj@cup.edu.cn
H. Xu
Department of Mathematics and Statistics, Curtin University, Perth, WA 6845, Australia
K.L. Teo
School of Mathematics and Statistics, Curtin University, Perth, WA 6845, Australia

© Springer-Verlag Berlin Heidelberg 2015
H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_7

7.1 Introduction

An accurate identification of the oil reserve in either an undeveloped or a developed
reservoir is a significant task for petro-physicists and reservoir engineers. To
calculate the hydrocarbon reserve in a formation, the water saturation must be known.
An inaccurate calculation of the water saturation
will lead to a large error in the estimate of the hydrocarbon reserve. The Archie
equation, which is the underlying foundation for analyzing water saturation in
potential oil and gas zones, is commonly used to calculate the
water saturation of a reservoir rock, and hence to provide an estimate of the initial
hydrocarbon reserve of the reservoir. The values of the input parameters of the Archie
water saturation model in a clean or shaly formation must be estimated as accurately
as possible. From field experiments, it is observed that the values of the
three input parameters, namely the cementation exponent m, the saturation exponent n
and the tortuosity factor a, depend critically on the petro-physical properties of a
given rock. Thus, these parameters take different values for different fields.
Furthermore, the values of a, m and n in the Archie formula are interdependent.
Therefore, it is critically important that the values of the
parameters a, m and n are estimated accurately (Mabrouk et al. 2013; Michael et al.
2013; Makar and Kamel 2012).
There are several techniques available in the literature for estimating the values
of the Archie parameters a, m and n. A conventional technique is to determine n
separately, independently of a and m. This approach may not be valid in real
situations, as it may lead to a large error in the estimate of the
water saturation (Archie 1942). In Maute et al. (1992), a data analysis approach
is proposed to determine the Archie parameters m, n and a based on standard
resistivity measurements on core samples. The simplex method is applied in
Chen et al. (1995) to identify the three parameters of the Archie equation. In Hamada
et al. (2002, 2013), the Archie parameters are estimated by an approach based
on a three-dimensional (3D) regression plot involving water saturation, formation
resistivity and porosity. In Godarzi et al. (2012), an intelligent algorithm, the genetic
algorithm (GA), is applied to the estimation of the Archie parameters.
In reality, the parameters a, m and n in the Archie equation are known to be closely
interdependent. In this paper, a new hybrid optimization technique is proposed to
estimate these Archie parameters. Several carbonate reservoirs are chosen to carry
out comparative experiments using the proposed hybrid optimization technique and
other existing techniques.
The rest of the paper is organized as follows. In Sect. 7.2, the model of the Archie
equation is described, and the three crucial parameters that must be
estimated accurately are identified. In Sect. 7.3, a new hybrid optimization
method is developed and its properties are discussed. In Sect. 7.4, the new hybrid
optimization technique is applied to estimate the values of the Archie parameters. To
further assess the effectiveness of the proposed hybrid optimization technique,
it is applied to some benchmark global optimization problems in Sect. 7.5.
Finally, some concluding remarks are made in Sect. 7.6.

7.2 Estimations of Archie Parameters

7.2.1 Archie Equation

The Archie equation, which relates the resistivity of the formation to the porosity, the
water saturation and the formation water resistivity, is expressed as:

$$S_w = \left( \frac{F R_w}{R_t} \right)^{1/n} \tag{7.1}$$

where $S_w$ is the water saturation (fraction or percentage), $n$ is the
saturation exponent, $R_w$ is the formation water resistivity (Ω·m), $R_t$ is the true
formation resistivity (Ω·m), and $F$ is the formation resistivity factor (dimensionless).
It is known that the formation resistivity factor $F$ is closely related to the porosity
of the formation. From the well-known Archie formula reported in Archie (1942),
$F$ can be expressed approximately as:

$$F = \frac{a}{\Phi^m} \tag{7.2}$$

where $\Phi$ is the porosity (dimensionless), $a$ is the tortuosity factor (dimensionless),
and $m$ is the cementation exponent (dimensionless).
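As a quick numerical check of Eqs. (7.1) and (7.2), the following Python sketch evaluates the formation factor and the water saturation. The function names and the sample values (porosity 0.2, $R_w$ = 0.05 Ω·m, $R_t$ = 5 Ω·m) are hypothetical and chosen only for illustration; the defaults a = 1 and m = n = 2 are the conventional assumptions, not fitted values.

```python
def formation_factor(phi, a=1.0, m=2.0):
    """Eq. (7.2): F = a / phi**m, with porosity phi given as a fraction."""
    return a / phi ** m


def water_saturation(phi, rw, rt, a=1.0, m=2.0, n=2.0):
    """Eq. (7.1): Sw = (F * Rw / Rt)**(1/n), returned as a fraction."""
    f = formation_factor(phi, a, m)
    return (f * rw / rt) ** (1.0 / n)


# Illustrative (made-up) inputs: phi = 0.2, Rw = 0.05 ohm-m, Rt = 5 ohm-m.
# Here F = 1 / 0.2**2 = 25 and Sw = sqrt(25 * 0.05 / 5) = sqrt(0.25) = 0.5.
print(formation_factor(0.2), water_saturation(0.2, 0.05, 5.0))
```

Changing only the exponents for the same inputs (for example `m=2.5`) shifts the computed saturation noticeably, which is why the estimation of a, m and n matters.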
The Archie Eq. (7.1), with $\Phi$ obtained from independent porosity logs, is
commonly used to estimate the water saturation $S_w$, provided that
the values of the parameters a, m and n are known. The Archie parameters a,
m and n are therefore the most important parameters to estimate accurately, as they
determine the accuracy of the estimated water saturation.
The cementation exponent, m, varies for different rock types. Its value ranges from
1.2 to 3. However, it is usually assumed to have a value of 2. The saturation
exponent, n, is a critical parameter in petro-physics, determining a quantitative
relationship between the electrical properties of a reservoir rock and the
water saturation in its formation. In Ara et al. (2001), the saturation exponent
n is reported to be less than 2 (for strongly water-wet rocks) and above 25 (for strongly
oil-wet rocks). Like the cementation exponent, it is conventionally assumed to be equal
to 2. The tortuosity factor, a, is usually assumed to be equal to unity.
Incorrect assumptions about the values of these parameters can lead to large
errors in the estimate of the water saturation, and consequently in the
estimate of the hydrocarbon reserve in a formation.

7.2.2 The Model of CAPE (Core Archie-Parameters Estimation)

Among the various methods available in the literature, Core Archie-Parameters
Estimation (CAPE) (Maute et al. 1992; Enikanselu and Olaitan 2013) is
a data analysis method that estimates the Archie parameters a, m and n by minimizing
the error between the computed water saturation and the measured water saturation.
It is observed that the regressions of $I_R = S_w^{-n}$ vs $S_w$ and $F$ vs $\Phi$ based on their
plots are not the optimum way to obtain the Archie parameters. In the conventional
method, the errors, expressed in least squares form, are minimized with respect to
quantities such as the formation factor, $F$, and the resistivity index, $I_R$. For
CAPE, the error function, expressed as the sum of squared differences between the measured and
the computed water saturations and given by

$$f(m, n, a) = \sum_{j=1}^{M} \sum_{i=1}^{N} \left[ S_{wij} - \left( \frac{a R_{wij}}{\Phi_j^{m} R_{tij}} \right)^{1/n} \right]^2 \tag{7.3}$$

is minimized subject to $1.2 \le m \le 3.0$, $1.0 \le n \le 3.0$ and $0.5 \le a \le 1.5$
to obtain the values of the Archie parameters, where $j$ is the core index, $i$ is the index of
the measurements on core $j$, $S_{wij}$ is the $i$th laboratory-measured water saturation for
core $j$ (fraction), $R_{tij}$ is the $i$th laboratory-measured resistivity for core $j$, and $\Phi_j$ is the
porosity of core $j$ (fraction). The function $f(m, n, a)$ may have multiple stationary points,
depending on the values of $S_{wij}$, $R_{tij}$ and $\Phi_j$, so it is likely to have several local optima.
The three parameters in the function $f(m, n, a)$ are interdependent. For a specific
carbonate core sample, three surfaces of $f(m, n, a)$, as a function of n and a with
m = 2, of m and a with n = 2, and of m and n with a = 1, are depicted in Fig. 7.1. From
the figure, we can see the correlation of the three parameters m, n and a in the Archie
equation.
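The CAPE objective (7.3) is straightforward to evaluate once the core data are organized by core. The sketch below is a minimal illustration, not the authors' implementation: the data layout (a list of `(porosity, measurements)` pairs, each measurement being a `(Sw, Rw, Rt)` triple), the function names and the synthetic numbers are all assumptions made for the example.

```python
# Box constraints on (m, n, a) stated with Eq. (7.3).
BOUNDS = {"m": (1.2, 3.0), "n": (1.0, 3.0), "a": (0.5, 1.5)}


def cape_error(m, n, a, cores):
    """Sum of squared differences between measured and computed water saturation.

    cores: list of (phi_j, [(sw_ij, rw_ij, rt_ij), ...]) pairs, one entry per core j.
    """
    total = 0.0
    for phi, measurements in cores:
        for sw, rw, rt in measurements:
            computed = (a * rw / (phi ** m * rt)) ** (1.0 / n)
            total += (sw - computed) ** 2
    return total


def within_bounds(m, n, a):
    """Check the box constraints of the CAPE minimization problem."""
    values = {"m": m, "n": n, "a": a}
    return all(lo <= values[key] <= hi for key, (lo, hi) in BOUNDS.items())


# Synthetic data generated from m = 2, n = 2, a = 1 (made-up values):
# the error vanishes at the generating parameters and is positive elsewhere.
cores = [
    (0.20, [(0.5, 0.05, 5.0), (1.0, 0.05, 1.25)]),
    (0.25, [(0.4, 0.05, 5.0)]),
]
print(cape_error(2.0, 2.0, 1.0, cores), cape_error(2.5, 2.0, 1.0, cores))
```

Any box-constrained minimizer can then be applied to `cape_error` over the three parameters.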
CAPE is valid under the following two assumptions: (i) the Archie formula
is valid for the carbonate core sample concerned; and (ii) the core sample used is a
valid representative of the zone under consideration. Based on both assumptions and
the model of CAPE, we shall develop a new hybrid optimization method to solve
the minimization problem (7.3) as a box-constrained optimization problem.

Fig. 7.1 The three surfaces of f(m, n, a) with m = 2 (axes n and a), n = 2 (axes a and m) and a = 1 (axes m and n)

7.3 A New Hybrid Algorithm

The methods for solving global optimization problems (GOPs) can be classified into
two main classes: deterministic methods and stochastic methods. The first class,
which makes use of deterministic information to solve GOPs, includes
the tunneling method (Levy and Gomez 1985) and filled function methods (Liang
et al. 2007; Liu and Xu 2004). The second class, which relies on probabilistic
techniques, includes the Ant Colony Optimization algorithm (Toksari 2006), the Genetic
Algorithm (Goldberg 1989; Michaelewicz 1996), the Simulated Annealing algorithm
(Kirpatrick et al. 1983), and Particle Swarm Optimization (Eberhart and Kennedy
1995; Kennedy and Eberhart 1995; Shi and Eberhart 1998). In particular, the Artificial
Bee Colony (ABC) algorithm has been reported to perform better than most of the stochastic
algorithms mentioned above (Karaboga and Basturk 2007, 2008). Deterministic algorithms
that are gradient-based converge rapidly. However, they can get stuck in local
minima of a multimodal function. On the other hand, stochastic optimization algorithms,
which tend to perform global optimization, are computationally expensive
because of their random searches. Therefore, an approach that combines the strengths of
stochastic and deterministic optimization schemes but avoids their weaknesses is
of much interest. For details on such an approach, see Noel (2012), Yiu et al. (2004) and
Garcia-Palomares et al. (2006).
In this section, a hybrid algorithm, which combines the strengths of a gradient-
based optimization technique and ABC algorithm, will be presented.
Let $f(x)$ be a twice continuously differentiable non-convex function defined on
the set $\Omega = \{x \in \mathbb{R}^n \mid a \le x \le b\}$, where $a, b \in \mathbb{R}^n$. We assume that all the
minima of $f(x)$ are isolated minima and that there is a finite number of them. We
consider the problem of finding the global minimum of $f(x)$ on the set $\Omega$.
The new hybrid descent method may be formally stated as follows:
Step 1. Initialization. Generate $x^0$ randomly and evaluate $f(x^0)$. Set $k := 0$.
Step 2. Local search. Search for a local minimum of $f(x)$ by using a gradient-based
algorithm with $x^k$ as the initial point. Let it be denoted by $x^{k+1}$.
For each component $j$, if $x_j^{k+1} < lb_j$, then set $x_j^{k+1} = lb_j$; else if $x_j^{k+1} > ub_j$, then set $x_j^{k+1} = ub_j$, where $lb_j$ and $ub_j$ are the lower and upper bounds on $x_j$ in $\Omega$.
Then set $x^{(k+1)} = x^{k+1}$.
If $f(x^{(k+1)}) < f(x^k)$, then $x^* = x^{(k+1)}$, else $x^* = x^k$. Go to Step 3.
Step 3. Find a better solution by carrying out the ABC algorithm.
Set $y^k := x^*$ as the current minimizer and then execute the ABC iterations
until the stopping criteria of the ABC algorithm are met. Output a new global
minimizer $y^*$.
Step 4. Set $k := k + 1$, $x^k = y^*$. Return to Step 2 until convergence.
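The control flow of Steps 1 to 4 can be sketched as an outer loop that alternates a local phase and a global phase. The Python sketch below is an illustration under stated assumptions, not the authors' code: `hybrid_minimize` and its parameters are hypothetical names, the stopping rule (objective change below `tol`) is an assumption since the text only says "until convergence", and the two phases are deliberately simple stand-ins, finite-difference gradient descent in place of quasi-Newton BFGS and uniform random sampling in place of ABC.

```python
import random


def clip(x, lb, ub):
    """Project x componentwise onto the box [lb, ub] (the bound check in Step 2)."""
    return [min(max(xj, l), u) for xj, l, u in zip(x, lb, ub)]


def gradient_descent(f, x, steps=200, h=1e-6):
    """Stand-in local search: forward-difference gradient descent with backtracking."""
    x = list(x)
    for _ in range(steps):
        fx = f(x)
        grad = []
        for j in range(len(x)):
            xp = x[:]
            xp[j] += h
            grad.append((f(xp) - fx) / h)
        step = 1.0
        while step > 1e-12:
            trial = [xj - step * gj for xj, gj in zip(x, grad)]
            if f(trial) < fx:
                x = trial
                break
            step *= 0.5
        else:
            break  # no improving step found: treat x as a local minimizer
    return x


def random_sampling(f, x_best, lb, ub, rng, trials=200):
    """Stand-in global phase: keep the best of x_best and uniform random samples."""
    best = list(x_best)
    for _ in range(trials):
        cand = [rng.uniform(l, u) for l, u in zip(lb, ub)]
        if f(cand) < f(best):
            best = cand
    return best


def hybrid_minimize(f, lb, ub, local_search=gradient_descent,
                    global_search=random_sampling, max_outer=20, tol=1e-10, seed=0):
    rng = random.Random(seed)
    x = [rng.uniform(l, u) for l, u in zip(lb, ub)]            # Step 1
    for _ in range(max_outer):
        cand = clip(local_search(f, x), lb, ub)                # Step 2
        x_star = cand if f(cand) < f(x) else x
        y_star = clip(global_search(f, x_star, lb, ub, rng), lb, ub)  # Step 3
        if f(y_star) > f(x_star):                              # never accept a worse point
            y_star = x_star
        converged = abs(f(x) - f(y_star)) < tol                # assumed stopping rule
        x = y_star                                             # Step 4
        if converged:
            break
    return x


# Toy usage on a convex quadratic with minimizer (1, -0.5) inside the box.
quadratic = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2
result = hybrid_minimize(quadratic, [-2.0, -2.0], [2.0, 2.0])
```

Passing the two phases as callables keeps the outer loop independent of the particular local and global solvers, so a BFGS routine and an ABC implementation could be substituted without changing it.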
In Step 2 of the algorithm, the quasi-Newton algorithm with BFGS update is
used to perform the local search. In Step 3, the ABC search consists of three
key steps (Karaboga and Basturk 2007, 2008). They are:
1. Sending the employed bees into the food sources and then measuring their nectar
amounts;
2. Selecting the food sources by the onlookers after sharing the information of
employed bees and determining the nectar amount of the foods;
3. Determining the scout bees and then sending them onto possible food sources.
In ABC algorithm, the artificial bee colony consists of three kinds of bees:
employed bees, onlookers and scouts. Half of the colony is made up of employed
bees, and the other half includes onlooker bees and scouts. Employed bees search
for the food around the food source in their memory; meanwhile they share their
food information with onlookers. Onlooker bees tend to select better food sources
from those found by employed bees, and then further search for the food around
the selected food source. Scouts abandon these food sources and search for new
ones. Whenever a scout or onlooker bee finds a food source, it becomes an
employed bee. Whenever a food source is exploited fully, all the employed bees
associated with it abandon it, and become scouts or onlookers. Scout bees perform
the job of exploration, whereas employed and onlooker bees perform the job of
exploitation.
7 A New Hybrid Optimization Algorithm for the Estimation of Archie Parameters 143

In the ABC algorithm, the position of a food source is a potential solution of the
optimization problem and the nectar amount of a food source corresponds to the
fitness of the associated solution. The number of employed bees (N) is equal to
the number of food sources (SN) because it is assumed that there is exactly one
employed bee for every food source. After a randomly distributed initial
population of SN solutions is generated, each employed and onlooker bee applies a
probabilistic modification to a solution (the position of a food source) to
find a new solution (a new food source position) and tests the fitness (nectar
amount) of this new solution.
Suppose each solution consists of D parameters and let
$Y_i^t = (y_{i1}^t, y_{i2}^t, \ldots, y_{iD}^t)$ denote the i-th solution generated in
cycle t. In the ABC algorithm, every employed bee produces a new solution
$V_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t)$ in a D-dimensional search space from the
old one $Y_i^t$ according to the following equation

$$v_{ij}^t = y_{ij}^t + \varphi_{ij}^t \left( y_{ij}^t - y_{kj}^t \right) \tag{7.4}$$

where $j \in \{1, 2, \ldots, D\}$, and $k$ is selected randomly from $\{1, 2, \ldots, N\}$ such
that $k \neq i$; $\varphi_{ij}^t$ is a random scaling factor. When all employed bees have finished
their searching process, they share the fitness (nectar) information of their solution
(food sources) with the onlookers. Then each of these onlookers selects a solution
according to a probability proportional to the fitness value of that solution. Equation
(7.4) is applied again to generate a new solution by an onlooker bee based on the
old solution in her memory and the selected one. If the fitness amount of the new
solution is better than the old one, the bee memorizes the new position and forgets
the old one. The probability value, $p_i$, by which an onlooker bee chooses a food
source is calculated according to the following equation:

$$p_i = \frac{fit_i}{\sum_{j=1}^{SN} fit_j} \tag{7.5}$$

where $fit_i$ is the fitness value of solution $i$ and SN is the number of food sources.
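As a small illustration of Eq. (7.5), the sketch below turns objective values into fitness values, normalizes them into selection probabilities, and draws one food source by roulette-wheel sampling. The cost-to-fitness mapping `fitness` is an assumption (the text does not state one here; the form used is the one commonly paired with ABC for minimization), and the function names and the three cost values are hypothetical.

```python
import random


def fitness(cost):
    # Assumed mapping, not given in the text: lower cost -> higher fitness.
    return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)


def selection_probabilities(fit):
    # Eq. (7.5): p_i = fit_i / sum_j fit_j
    total = sum(fit)
    return [value / total for value in fit]


def roulette_pick(probs, rng):
    # Return index i with probability probs[i].
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


costs = [0.0, 1.0, 3.0]                 # hypothetical objective values
fit = [fitness(c) for c in costs]       # 1.0, 0.5, 0.25
probs = selection_probabilities(fit)    # 4/7, 2/7, 1/7
chosen = roulette_pick(probs, random.Random(1))
```

With these numbers the best food source is selected about four times as often as the worst one, which is the bias toward better solutions that the onlooker phase relies on.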
When a food source is abandoned by the employed bees, the scout
bees replace it with a new one. In the ABC algorithm, if the quality of a solution
cannot be improved after a predetermined number of cycles called "limit", the scout
bee replaces the abandoned solution with a new one chosen randomly. In that
case, the new solution is constructed according to the following equation

$$y_{ij}^t = y_j^{\min} + \mathrm{rand}(0, 1) \left( y_j^{\max} - y_j^{\min} \right) \tag{7.6}$$

where $y_j^{\min}$ and $y_j^{\max}$ are, respectively, the lower and upper bounds on the value of the
j-th parameter.
In this paper, the search mechanism proposed in (Karaboga and Basturk 2007)
is chosen to escape from a local solution in the new hybrid method. The search
strategy is determined by three parameters: SN, the number of food sources, which
is equal to the number of employed or onlooker bees; limit, a predetermined number
of cycles; and MCN, the maximum cycle number. The ABC algorithm can be
implemented as follows:
Initialization. Set SN, limit, MCN, and cycle = 1. Generate an initial population of
potential solutions Y_i, i = 1, 2, ..., SN, based on x*, and evaluate f(Y_i).
repeat
1. Produce new solutions V_i for the employed bees by using Eq. (7.4)
2. Apply the greedy selection process for the employed bees
3. Calculate the probability values p_i for the solutions Y_i by using Eq. (7.5)
4. With the probability values p_i, produce the new solutions V_i for the onlookers
from the solutions Y_i
5. Apply the greedy selection process for the onlookers
6. Determine the abandoned solution for the scout, if one exists, and replace it with
a new randomly produced solution Y_i by using Eq. (7.6)
7. Memorize the best solution achieved so far
8. cycle = cycle + 1
until cycle = MCN
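The loop above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the function name `abc_minimize` is hypothetical, the scaling factor in Eq. (7.4) is drawn uniformly from [-1, 1] and applied to one randomly chosen coordinate (the usual ABC choice, not spelled out in the text), new points are clipped to the bounds, the cost-to-fitness mapping is assumed, and the initial population is drawn uniformly from the bounds rather than generated around x*.

```python
import random


def abc_minimize(f, lower, upper, sn=20, limit=50, mcn=200, seed=0):
    """Minimal ABC sketch following steps 1-8; returns (best_point, best_value)."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        # Eq. (7.6): uniform sample inside the bounds.
        return [lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(dim)]

    def fitness(value):
        # Assumed cost-to-fitness mapping (not stated in the text).
        return 1.0 / (1.0 + value) if value >= 0 else 1.0 + abs(value)

    sources = [random_source() for _ in range(sn)]
    values = [f(y) for y in sources]
    trials = [0] * sn                      # cycles without improvement
    i_best = min(range(sn), key=values.__getitem__)
    best, best_value = list(sources[i_best]), values[i_best]

    def try_neighbour(i):
        # Eq. (7.4) on one random coordinate j, with a partner k != i.
        k = rng.choice([s for s in range(sn) if s != i])
        j = rng.randrange(dim)
        phi = rng.uniform(-1.0, 1.0)
        v = list(sources[i])
        v[j] = min(max(v[j] + phi * (v[j] - sources[k][j]), lower[j]), upper[j])
        fv = f(v)
        if fv < values[i]:                 # greedy selection
            sources[i], values[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(mcn):
        for i in range(sn):                # steps 1-2: employed bees
            try_neighbour(i)
        fits = [fitness(v) for v in values]
        total = sum(fits)
        probs = [ft / total for ft in fits]          # step 3: Eq. (7.5)
        for _ in range(sn):                # steps 4-5: onlookers
            try_neighbour(rng.choices(range(sn), weights=probs)[0])
        worst = max(range(sn), key=trials.__getitem__)
        if trials[worst] > limit:          # step 6: scout
            sources[worst] = random_source()
            values[worst] = f(sources[worst])
            trials[worst] = 0
        i_best = min(range(sn), key=values.__getitem__)
        if values[i_best] < best_value:    # step 7: memorize the best
            best, best_value = list(sources[i_best]), values[i_best]
    return best, best_value


# Usage on a 2-D sphere function with bounds [-5, 5]^2.
best, best_value = abc_minimize(lambda x: sum(v * v for v in x),
                                [-5.0, -5.0], [5.0, 5.0])
```

Keeping a per-source trial counter is what lets the scout step detect a stagnated food source and replace it; at most one source is replaced per cycle in this sketch.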
In the proposed hybrid algorithm, the BFGS algorithm is executed from the initial point
x_0 to find a local minimum of f(x) with high-speed descent. Then the ABC algorithm
is used to escape from the local solution to find a better solution, y*, which will be
taken as the new starting point for the BFGS algorithm in the next cycle. A better
minimum x* found by the BFGS algorithm is taken as the best solution (food source)
in the ABC algorithm and is memorized. If the best solution y* found by the ABC
algorithm has a better value than the previously memorized solution x*, then y* is used
as the starting point for the BFGS algorithm. This guarantees that the local search
operates in the neighborhood of the best solution found by the proposed algorithm
in all previous iterations.
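The alternation between a local search and a global escape step can be sketched as follows. To keep the example short and dependency-free, both components are replaced by simple stand-ins, which is an assumption of this sketch: `local_descent` is a finite-difference gradient descent with backtracking (not BFGS), and `random_escape` is uniform random sampling (not ABC). All function names are hypothetical. What the sketch does preserve is the control flow: a local search from the current point, an attempt to find a strictly better point, and a restart of the local search from that point, so the recorded values never increase.

```python
import math
import random


def numeric_gradient(f, x, h=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for j in range(len(x)):
        forward, backward = list(x), list(x)
        forward[j] += h
        backward[j] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def local_descent(f, x, step=0.1, tol=1e-12, max_iter=5000):
    """Stand-in for BFGS: gradient descent with backtracking.
    Never returns a point worse than the starting point."""
    x = list(x)
    fx = f(x)
    for _ in range(max_iter):
        grad = numeric_gradient(f, x)
        t = step
        while t > 1e-14:
            cand = [xi - t * gi for xi, gi in zip(x, grad)]
            fc = f(cand)
            if fc < fx:
                break
            t *= 0.5
        else:
            return x, fx                  # no descent step found
        improvement = fx - fc
        x, fx = cand, fc
        if improvement < tol:
            break
    return x, fx


def random_escape(f, fx, lower, upper, rng, samples=2000):
    """Stand-in for ABC: best uniform sample that improves on fx, else None."""
    best, best_value = None, fx
    for _ in range(samples):
        y = [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        fy = f(y)
        if fy < best_value:
            best, best_value = y, fy
    return best


def hybrid_descent(f, x0, lower, upper, max_cycles=20, seed=0):
    rng = random.Random(seed)
    x, fx = local_descent(f, x0)          # local search from x0
    history = [fx]
    for _ in range(max_cycles):
        y = random_escape(f, fx, lower, upper, rng)   # try to leave the basin
        if y is None:                     # no better point found: stop
            break
        x, fx = local_descent(f, y)       # restart local search from y*
        history.append(fx)
    return x, fx, history


def rastrigin(x):
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


x0 = [3.3, -2.7]
x_best, f_best, history = hybrid_descent(rastrigin, x0, [-5.12] * 2, [5.12] * 2)
```

Because the escape step only returns points that are strictly better than the current minimum and the local search never moves uphill, `history` is non-increasing by construction, which mirrors the monotonic behaviour described for the hybrid method.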
Incidentally, the proposed algorithm can be generalized by using more powerful
local search techniques to refine the best solution found by ABC. Derivative-free
techniques such as the Nelder–Mead simplex method or the Hooke–Jeeves method
(Fei Kang et al. 2011) can be used for the local search when the objective function
is not continuously differentiable.

7.4 Estimation of the Archie Parameters

In this section, the focus is on the estimation of the Archie parameters for carbonate
reservoirs. Carbonate reservoirs are of great importance because they contain
almost 60 % of the world's oil reserves. The accuracy of Eq. (7.1) depends on the
accuracy of the estimates of the input parameters R_w, R_t, and F.

Table 7.1 The values of Archie parameters obtained from three techniques
Method                                  a       m       n
3D method (Hamada et al. 2013)          0.28    2.34    2.12
CAPE (a, m, n) (Hamada et al. 2013)     0.23    2.15    2.87
BFGSABC                                 1.14    1.62    2.02

Table 7.2 Error analysis on the determined Archie parameters by three techniques
                                        Absolute error
Method                                  Ea      Emin    Emax    Erms    S
3D method (Hamada et al. 2013)          0.102   0.002   0.51    0.14    0.10
CAPE (a, m, n) (Hamada et al. 2013)     0.095   0.001   0.33    0.12    0.08
BFGSABC                                 0.035   0.004   0.026   0.08    0.07
Note: Ea the average absolute relative error, Emin/Emax the minimum/maximum absolute error,
Erms the root mean square error, S the standard deviation

In what follows, 29 carbonate core samples taken from (Hamada et al. 2013) are
used to simulate certain wells, that is, N = 29. For each core sample, the
electrical resistivities R_w and R_t are measured at room temperature at different
water saturation percentages, that is, M = 30. Setting x = (m, n, a), the BFGSABC
algorithm can be applied to solve the model given by Eq. (7.3).
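To make the fitting problem concrete, the sketch below evaluates a water saturation model and a least-squares misfit for a parameter vector x = (m, n, a). Equations (7.1) and (7.3) are not reproduced in this section, so two assumptions are made and should be checked against them: the saturation follows the standard Archie relation Sw^n = a Rw / (phi^m Rt), and the objective is a plain sum of squared differences between measured and computed Sw. The function names and the three synthetic samples are hypothetical and are not the core data of Hamada et al. (2013).

```python
def archie_sw(a, m, n, porosity, rw, rt):
    """Water saturation from the standard Archie relation
    Sw^n = a * Rw / (porosity^m * Rt)  (assumed form of Eq. (7.1))."""
    return (a * rw / (porosity ** m * rt)) ** (1.0 / n)


def misfit(params, samples):
    """Sum of squared differences between measured and computed Sw.
    params = (m, n, a), matching x = (m, n, a) in the text;
    samples = [(porosity, Rw, Rt, measured Sw), ...]."""
    m, n, a = params
    return sum((sw - archie_sw(a, m, n, phi, rw, rt)) ** 2
               for phi, rw, rt, sw in samples)


# Synthetic data generated from known parameters (m, n, a) = (2.0, 2.0, 1.0).
true_params = (2.0, 2.0, 1.0)
samples = []
for phi, rw, sw in [(0.15, 0.05, 0.3), (0.20, 0.05, 0.5), (0.25, 0.05, 0.8)]:
    rt = 1.0 * rw / (phi ** 2.0 * sw ** 2.0)   # invert Archie for Rt
    samples.append((phi, rw, rt, sw))

at_truth = misfit(true_params, samples)          # essentially zero
elsewhere = misfit((1.62, 2.02, 1.14), samples)  # positive for other parameters
```

A function such as `misfit` is what a global optimizer would minimize over (m, n, a); on real core data the minimum is not zero, and the objective may have several local minima, which is the motivation for a global search.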
Based on 30 independent core-sample measurements, the results obtained by the
proposed hybrid method are compared with those computed by the 3D method and
CAPE.
Typical values of the Archie parameters obtained using the CAPE method,
the 3D method (Hamada et al. 2002, 2013) and the proposed hybrid method are
shown in Table 7.1. Table 7.2 lists the average error, the root mean
square error and the standard deviation of the water saturation computed by the three
techniques.
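The error measures reported in Table 7.2 can be computed along the following lines. The exact definitions are not given in the text, so this sketch assumes absolute errors |computed - measured|, a population (divide-by-n) standard deviation of those errors, and no normalization by the measured value; the function name and the three data points are hypothetical.

```python
import math


def error_summary(measured, computed):
    """Average, minimum, maximum, RMS and standard deviation of absolute errors."""
    errors = [abs(c - m) for m, c in zip(measured, computed)]
    n = len(errors)
    mean = sum(errors) / n
    rms = math.sqrt(sum(e * e for e in errors) / n)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    return {"Ea": mean, "Emin": min(errors), "Emax": max(errors),
            "Erms": rms, "S": std}


measured_sw = [0.30, 0.45, 0.60]   # hypothetical measured saturations
computed_sw = [0.32, 0.44, 0.65]   # hypothetical model outputs
summary = error_summary(measured_sw, computed_sw)
```

Under these definitions the quantities are linked by Erms^2 = Ea^2 + S^2, and Ea always lies between Emin and Emax, which gives a quick consistency check on any reported set of values.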
The average error, the root mean square error and the standard deviation are shown
in Fig. 7.2. Figure 7.3 displays the linear regression estimation for the three techniques.
Typical measured and estimated water saturation profiles for the different Archie
parameters obtained by using CAPE, 3D and the proposed hybrid method (BFGSABC)
are illustrated in Fig. 7.4a. Figure 7.4b depicts the relative error profiles of the
water saturation calculated by the three methods over the selected interval of
core samples.
Figures 7.2, 7.3, and 7.4 compare the measured water saturation with the water
saturation profiles estimated by the different methods. These profiles support the
accuracy analysis of how well the different techniques recover the Archie
parameters. Note that the water saturation computed by the proposed technique
matches the measured water saturation better than that of the other two methods.

Fig. 7.2 The average error, RMS error and standard deviation for the three techniques (bars: 3D Method, CAPE (a, m, n), BFGSABC)

Fig. 7.3 The linear regression estimation for three techniques (computed Sw versus measured Sw; series: BFGSABC, 3D Method, CAPE (a, m, n) Method)

7.5 The Performance of the Proposed Algorithm on Some Test Functions

Fig. 7.4 (a) Comparison between the measured water saturation and the calculated water
saturation for three techniques (Sw-Measured, Sw-CAPE, Sw-3D, Sw-BFGSABC, plotted against
the core sample number), and (b) relative error between three techniques

In order to test the performance of the proposed algorithm, it is applied to several
representative benchmark functions chosen from (Andrei 2008; Jamil and Yang
2013). These test functions fall into two classes. (1) Unimodal functions, which
have a single global minimum and no other local minima: the bowl-shaped Sphere
function (f1), the valley-shaped Rosenbrock function (f2), the Easom function with
steep drops (f7), the Goldstein-Price function (f8), and the Branin function (f9).
(2) Multimodal functions, which have many local minima: the Rastrigin
function (f3), the Ackley function (f4), the Griewank function (f5), the Schaffer
function (f6) and the Levy function (f10). The ten test functions, their dimensions
and modalities are listed in detail in the table of the Appendix.
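For reference, five of the n-dimensional benchmark functions named above can be written as short Python functions. These follow the standard textbook definitions of the Sphere, Rosenbrock, Rastrigin, Ackley and Griewank functions; they are given as a sketch and should be checked against the Appendix before being used to reproduce the reported numbers.

```python
import math


def sphere(x):
    return sum(v * v for v in x)


def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))


def rastrigin(x):
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


def ackley(x):
    n = len(x)
    mean_square = sum(v * v for v in x) / n
    mean_cosine = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (20.0 + math.e
            - 20.0 * math.exp(-0.2 * math.sqrt(mean_square))
            - math.exp(mean_cosine))


def griewank(x):
    quadratic = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return quadratic - product + 1.0


origin = [0.0] * 10
ones = [1.0] * 10
values_at_optimum = [sphere(origin), rosenbrock(ones), rastrigin(origin),
                     ackley(origin), griewank(origin)]
```

All five functions have a global minimum value of 0; it is attained at the origin for every function except Rosenbrock, whose minimizer is (1, ..., 1).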
In the experiments with BFGSABC on the test problems, the maximum numbers of
generations are 50, 100 and 500 for the dimensions 2, 10 and 50, and the
population sizes are 20, 20 and 200, respectively. Because BFGS is a
deterministic algorithm, only ABC and BFGSABC are run 50 independent times
on the 10 functions. The computations are carried out in the Matlab 7.70 environment
on a PC with a dual-core 2.5 GHz CPU running Windows 7.

Table 7.3 The global minima of test functions in 2D found by BFGS, ABC and BFGSABC
              Best value                             Mean
Fun.   Dim.   BFGS         ABC          BFGSABC      ABC          BFGSABC
f1     2      1.256E-16    1.12E-12     5.59E-20     1.05E-09     2.29E-20
f2     2      4.30E-14     3.32E-04     1.94E-18     0.1437       1.32E-18
f3     2      9.949        8.84E-11     0.0          0.0017       0.0
f4     2      17.612       5.98E-07     8.88E-16     5.43E-05     3.23E-15
f5     2      97.449       1.05E-05     0.0          0.0101       0.0
f6     2      0.494        9.91E-05     0.0          0.0127       0.0
f7     2      0.0          0.9886       1.0          0.1835       1.0
f8     2      84.000       3.0000       2.99         3.5450       2.99
f9     2      0.397        0.3979       0.39         0.3983       0.39
f10    2      1.844E-15    1.35E-17     4.822E-20    2.47E-12     7.22E-20

Table 7.4 The global minima found by BFGS, ABC and BFGSABC for 10 dimensions
              Best value                             Mean
Fun.   Dim.   BFGS         ABC          BFGSABC      ABC          BFGSABC
f1     10     1.12E-16     3.5834E-04   4.55E-17     0.0883       8.51E-17
f2     10     1010.67      1.0431       5.91E-11     21.9949      6.91E-11
f3     10     97.50        1.0878       1.13E-9      5.1409       5.12E-9
f4     10     19.20        0.2565       3.60E-5      1.9434       6.89E-5
f5     10     0.147        0.0436       2.65E-14     0.2768       4.46E-14
f10    10     7.85E-11     2.7850E-05   4.79E-14     0.0011       9.62E-14

Table 7.5 The global minima found by BFGS, ABC and BFGSABC for 50 dimensions
              Best value                             Mean
Fun.   Dim.   BFGS         ABC          BFGSABC      ABC          BFGSABC
f1     50     7.94E-16     4.13E-06     3.58E-16     4.0423E-05   5.45E-16
f2     50     6.47E+6      52.3644      7.6E-3       225.8868     8.6E-3
f3     50     4.64E+2      28.6034      5.96E-1      40.8226      9.96E-1
f4     50     1.93E+2      1.8572       3.233E-8     3.3657       6.73E-7
f5     50     1.60E-2      0.0230       2.90E-13     0.4623       2.90E-12
f10    50     8.46E-7      5.92E-04     1.39E-11     0.0670       1.39E-11

All ten functions are tested in 2 dimensions, and some of them are also tested
in higher dimensions. The results of the experiments are listed in Tables 7.3, 7.4, and 7.5.
Bold fonts in Tables 7.3, 7.4, and 7.5 indicate that the BFGS algorithm fails to solve
the problem because it is trapped in a local minimum.
Comparing the data in Table 7.3 for the 10 test functions in 2 dimensions shows
that the global minima of f3 to f8 cannot be found by BFGS. Although
ABC finds approximate global minima, the accuracy of its best and mean
function values is lower than that of BFGSABC for all 10 functions.

Fig. 7.5 Typical convergence history (logarithm of function value versus iteration; markers:
initial point, after BFGS search, after ABC search) for the new algorithm for multimodal
functions in 2 dimensions. (a) f3 Rastrigin function; (b) f4 Ackley function; (c) f5 Griewank
function; (d) f6 Schaffer function

From Tables 7.4 and 7.5, it is observed that the hybrid method has better
performance than both BFGS and ABC in terms of the best values found for higher
dimensions.
Monotonic convergence, which is a very desirable property, is observed for
the proposed hybrid method. See, for example, the typical convergence histories
of the algorithm on the test functions f3, f4, f5 and f6 in 2 dimensions, which
are displayed in Fig. 7.5. Since the ABC method is mainly used to leave
the previously found local minimum and locate a new descent point, the
decrease in function value after each ABC search may be small.
In addition, we study the two multimodal functions f5 and f10 in 1,000 dimensions.
The maximum number of generations is 2,000 and the population size is 200.
These two functions have many regularly distributed local minima.
Table 7.6 shows the best and mean values, the CPU time and the number of function
evaluations.

Table 7.6 The optimal information on two 1,000-dimensional test functions by the proposed
method
Function   Best value   Mean       CPU time (s)   Number of function evaluations
f5         1.33E-8      7.90E-15   1,755          6.3E+7
f10        1.01E-5      5.22E-2    2,713          6.6E+7

From Tables 7.4, 7.5, and 7.6, it can be observed that, for the ten test functions,
the proposed hybrid method finds better "global" minima than the BFGS and ABC
methods. Furthermore, the success rate of finding the
"global" minima is 100 % for the proposed hybrid method.
It can be concluded that the hybrid method proposed in this paper performs
better in solving global optimization problems, especially in
convergence speed, reliability and the quality of the solution obtained.

7.6 Conclusions

For an accurate estimation of the Archie parameters, a new hybrid global search
method is proposed, which combines the well-known quasi-Newton algorithm (BFGS)
and the population-based global search algorithm (ABC). The ABC technique plays
the role of escaping from a local minimum to a better descent point from which
the local search can restart to find a better minimum. The hybrid method inherits
both the convergence rate and accuracy of BFGS and the capability of ABC to escape
from local minima. Numerical results on ten benchmark problems show that
global minima, especially for multimodal continuous functions, can
be found using this hybrid descent method with a monotonic convergence
history. The simulation experiments on carbonate core samples show that the
proposed method estimates the Archie parameters more accurately than the other
methods considered: the water saturation computed by the proposed method
matches the measured water saturation better than that computed by the other
two methods.

Acknowledgements The authors gratefully acknowledge the financial support from the National
Natural Science Foundation of China (Grant No. 11371371, No. 11171079) and the Foundations
of China University of Petroleum (No. KYJJ2012-06-03, KYJJ2012-12).
Appendix: Expressions and Properties of Ten Test Problems

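Each entry lists the function, its expression, the range, the optimal solution, the
optimal value and the modality.

f1   $f_{Sph} = \sum_{i=1}^{n} x_i^2$
     Range: $-100 \le x_i \le 100$; optimal solution: $(0, \ldots, 0)$; optimal value: 0; modality: Uni

f2   $f_{Ros} = \sum_{i=1}^{n-1} \left[ 100 \left( x_{i+1} - x_i^2 \right)^2 + (x_i - 1)^2 \right]$
     Range: $-30 \le x_i \le 30$; optimal solution: $(1, \ldots, 1)$; optimal value: 0; modality: Uni

f3   $f_{Ras} = 10n + \sum_{i=1}^{n} \left[ x_i^2 - 10 \cos(2\pi x_i) \right]$
     Range: $-5.12 \le x_i \le 5.12$; optimal solution: $(0, \ldots, 0)$; optimal value: 0; modality: Multi

f4   $f_{Ack} = 20 + e - 20 \exp\left( -\frac{1}{5} \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} \right) - \exp\left( \frac{1}{n} \sum_{i=1}^{n} \cos(2\pi x_i) \right)$
     Range: $-32.768 \le x_i \le 32.768$; optimal solution: $(0, \ldots, 0)$; optimal value: 0; modality: Multi

f5   $f_{Gri} = \frac{1}{4000} \sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n} \cos\left( \frac{x_i}{\sqrt{i}} \right) + 1$
     Range: $-600 \le x_i \le 600$; optimal solution: $(0, \ldots, 0)$; optimal value: 0; modality: Multi

f6   $f_{Sch} = 0.5 + \frac{\sin^2 \sqrt{x_1^2 + x_2^2} - 0.5}{\left[ 1 + 0.001 \left( x_1^2 + x_2^2 \right) \right]^2}$
     Range: $-100 \le x_i \le 100$; optimal solution: $(0, 0)$; optimal value: 0; modality: Multi

f7   $f_{Eas} = -\cos x_1 \cos x_2 \exp\left( -(x_1 - \pi)^2 - (x_2 - \pi)^2 \right)$
     Range: $-100 \le x_i \le 100$; optimal solution: $(\pi, \pi)$; optimal value: $-1$; modality: Uni

f8   $f_{GP} = \left[ 1 + (x_1 + x_2 + 1)^2 \left( 19 - 14x_1 + 3x_1^2 - 14x_2 + 6x_1 x_2 + 3x_2^2 \right) \right] \times \left[ 30 + (2x_1 - 3x_2)^2 \left( 18 - 32x_1 + 12x_1^2 + 48x_2 - 36x_1 x_2 + 27x_2^2 \right) \right]$
     Range: $-2 \le x_i \le 2$; optimal solution: $(0, -1)$; optimal value: 3; modality: Uni

f9   $f_{Bra} = a \left( x_2 - b x_1^2 + c x_1 - r \right)^2 + s (1 - t) \cos(x_1) + s$, with
     $a = 1$, $b = \frac{5.1}{4\pi^2}$, $c = \frac{5}{\pi}$, $r = 6$, $s = 10$, $t = \frac{1}{8\pi}$
     Range: $-5 \le x_1 \le 10$, $0 \le x_2 \le 15$; optimal solutions: $(-\pi, 12.275)$, $(\pi, 2.275)$, $(9.42478, 2.475)$; optimal value: 0.397887; modality: Uni

f10  $f_{Lve} = \sin^2(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2 \left[ 1 + 10 \sin^2(\pi y_i + 1) \right] + (y_n - 1)^2 \left[ 1 + 10 \sin^2(2\pi y_n) \right]$, where $y_i = 1 + \frac{x_i - 1}{4}$
     Range: $-5 \le x_i \le 10$; optimal solution: $(1, 1, \ldots, 1)$; optimal value: 0; modality: Multi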

References

Andrei N (2008) An unconstrained optimization test functions collection. Adv Model Optim
10(1):147–161
Ara TS, Talabani S, Atlas B, Vaziri HH, Islam MR (2001) In-depth investigation of the validity of
the Archie Equation in carbonate rocks, SPE 67204, pp 1–10
Archie GE (1942) The electrical resistivity log as an aid in determining some reservoir
characteristics. Trans AIME 146:54–62
Chen DS et al (1995) Novel approaches to the determination of Archie parameters I: simplex
method. SPE Adv Technol Ser 3(1):39–43
Eberhart RC, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of
the 6th symposium on micro machine and human science, Nagoya, Japan, pp 39–43
Enikanselu PA, Olaitan OO (2013) Determination of Archie parameters and the effect on water
saturation over “honey” field, Niger-delta. Can J Comput Math Nat Sci Eng Med 4(4):306–314
Fei Kang, Junjie Li, Zhenyue Ma, Haojin Li (2011) Artificial bee colony algorithm with local
search (HJ) for numerical optimization. J Softw 6(3):490–497
Garcia-Palomares UM, Gonzalez-Castan FJ, Burguillo-Rial JC (2006) A Combined Global &
Local Search (CGLS) approach to global optimization. J Glob Optim 34:409–426
Godarzi AA et al (2012) The simultaneous determination of Archie’s parameters by application
of modified genetic algorithm and HDP methods: a comparison with current methods via two
case studies. Pet Sci Technol 30(1):54–63
Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning.
Massachusetts: Addison-Wesley
Hamada GM, Al-Awad MNJ, Alsughayer AA (2002) Water saturation computation from
laboratory, 3D Regression. Oil Gas Sci Technol Rev. IFP 57(6):637–651
Hamada GM, Almajed AA, Okasha TM et al (2013) Uncertainty analysis of Archie’s parameters
determination techniques in carbonate reservoirs. J Pet Explor Prod Technol 3(1):1–10
Karaboga D, Basturk B (2007) A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm. J Glob Optim 39(3):459–471
Karaboga D, Basturk B (2008) On the performance of artificial bee colony (ABC) algorithm. Appl
Soft Comput 8(1):687–697
Kennedy J, Eberhart RC (1995) Particle swarm optimization. In: Process of IEEE international
conference on neural networks, Piscataway, pp 1942–1948
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science, New
Series 220(4598):671–680
Levy AV, Montalvo A (1985) The tunneling algorithm for the global minimization of functions.
SIAM J Sci Stat Comput 6(1):15–29
Liang YM, Zhang LS, Li MM, Han BS (2007) A filled function method for global optimization. J
Comput Appl Math 205:16–31
Liu X, Xu W (2004) A new filled function applied to global optimization. Comput Oper Res
31:61–80
Mabrouk WM, Soliman KS, Anas SS (2013) New method to calculate the formation water
resistivity (Rw). J Pet Sci Eng 10:49–52
Makar KH, Kamel MH (2012) An approach for velocity determination from merging Archie and
Raymer–Hunt–Gardner transform in reservoir of clean nature. J Pet Sci Eng 86–87:297–301
Maute RE, Lyle WD, Sprunt E (1992) Improved data-analysis method determines Archie
parameters from core data. J Pet Technol 44(1):103–107
Michael R, Collett TS et al (2013) Large-scale depositional characteristics of the Ulleung Basin and
its impact on electrical resistivity and Archie-parameters for gas hydrate saturation estimates.
Mar Pet Geol 47:222–235
Michalewicz Z (1996) Genetic algorithms + Data structures = Evolution programs. Springer,
Berlin
Momin Jamil, Xin-She Yang (2013) A literature survey of benchmark functions for global
optimization problems. Int J Math Model Numer Optim 4(2):150–194
Noel MM (2012) A new gradient based particle swarm optimization algorithm for accurate
computation of global minimum. Appl Soft Comput 12:353–359
Shi Y, Eberhart RC (1998) A modified particle swarm optimizer. In: Proceedings of the IEEE
Congress on Evolutionary Computation (CEC 1998), Piscataway, pp 69–73
Toksari MD (2006) Ant colony optimization for finding the global minimum. Appl Math Comput
176(1):308–316
Yiu KFC, Liu Y, Teo KL (2004) A hybrid descent method for global optimization. J Glob Optim
28:229–238
Chapter 8
Optimization of Multivariate Inverse Mixing
Problems with Application to Neural Metabolite
Analysis

A. Tamura-Sato, M. Chyba, L. Chang, and T. Ernst

Abstract A mathematical methodology is presented that optimally solves an
inverse mixing problem when both the composition of the source components
and the amount of each source component are unknown. The model is useful
for situations when the determination of the source compositions is unreliable
or infeasible. We apply the model to longitudinal proton magnetic resonance
spectroscopy (1H MRS) data gathered from the brains of newborn infants. 1H
MRS was used to study changes in five metabolite concentrations in two brain
regions of nine healthy term neonates. Measurements were performed three times
in each infant over a period of 3 months, starting from birth, for a total of 27
scans. The methodology was then used to translate the metabolite concentration
data into measures of relative density for two major brain cell type populations
by fitting a matrix of metabolite concentration per unit density to the data. One
cell type, reflecting neuronal density, increased over time in both regions studied,
but especially in the frontal regions of the brain. The second type, characterized
primarily by myoinositol, reflecting glial cell content, was found to decrease in
both regions over time. Our new method can provide more specific and accurate
assessments of the brain cell types during early brain development in neonates. The
methodology is applicable to a wide range of physical systems that involve mixing
of unknown source components.

A. Tamura-Sato • M. Chyba
Department of Mathematics, University of Hawai‘i at Mānoa, 2565 McCarthy Mall,
96822 Honolulu, Hawai‘i
e-mail: aaronts@hawaii.edu
L. Chang • T. Ernst
Department of Medicine, University of Hawai‘i at Mānoa, Honolulu, Hawai‘i

© Springer-Verlag Berlin Heidelberg 2015 155


H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_8

8.1 Introduction

Many scientific problems can be described as combining multiple sources to form a
final mixture; this is denoted herein as a multivariate mixing problem. Each source
may contain an amount of certain components we are interested in tracking. For
example, if the salt content of two different source solutions of salt water is known,
then one can determine the salt content of mixtures made from the two sources, as
long as the amount of each source added to the mixtures is known. Similarly, given
several bronze alloy sources made of different amounts of copper, lead, and zinc,
one could determine the copper, lead, and zinc content of a final product obtained
by mixing the sources. This can be modeled by the matrix equation $AX = C$, where
the composition matrix $A$ contains information on the amount of each component
per unit of source, and the population matrix $X$ contains the contribution of each
source to each final product. Then $C$ is naturally a matrix which gives the amount
of each component in each final product.
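A small numerical instance of the forward problem may help fix the shapes of the three matrices. In the sketch below the alloy compositions and mixing amounts are invented for illustration: A has one row per component (copper, lead, zinc) and one column per source alloy, X has one row per source and one column per final product, and C = AX then has one row per component and one column per final product.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical composition matrix: fraction of each component per unit of source.
A = [[0.80, 0.60],   # copper in alloy 1, alloy 2
     [0.10, 0.30],   # lead
     [0.10, 0.10]]   # zinc

# Hypothetical population matrix: units of each source used in each product.
X = [[2.0, 1.0],     # alloy 1 in product 1, product 2
     [1.0, 3.0]]     # alloy 2

C = matmul(A, X)     # amount of each component in each product
```

Here product 1 contains 2.2 units of copper, 0.5 of lead and 0.3 of zinc. Because each column of this A sums to 1, each column of C sums to the total amount of source material in that product (3 and 4 units), which is a convenient sanity check.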
The goal of the inverse mixing problem is to reverse this calculation. Now, we
know the amount of each of the components in the final products, but wish to
determine how much of each source was used to create them. In other words, we
know C, but need to solve for X. In many applications, the composition matrix A
is known or can be determined by secondary experiments, but this is not always
feasible. In this paper, we will analyze the situation where the composition matrix
is unknown.
When calculating X in an inverse mixing problem, it is frequently the case that
the system is overdetermined and no $X$ can be found that perfectly fits the equation
$AX = C$. Instead, $X$ is calculated using a least squares method that minimizes
$\|E\| = \|C - AX\|$. Such optimization techniques have been used and refined for
a variety of scientific applications, including chemistry and geophysics (Cantrell
2008; Snieder and Trampert 1999). In these applications, the composition matrix
A is known. However, in some situations, A cannot be determined or the data
collected for A is unreliable or ultimately insufficient for the analysis. For instance,
an attempted analysis on streamwater was inconclusive due to insufficient data on
measured source compositions (Christopherson et al. 1990). Therefore in our model,
we attempt to find a best fit solution for A and X simultaneously since we lack
information about the matrix A. To guide our fit, we use certain constraints on the
matrices A and X that reflect the physical reality of the system.
While principal component analysis (PCA) is sometimes used as a method
to extract suitable parameter values for A from data, it is not sufficient for this
task. PCA can provide the rank of A that should be used, but provides little
information useful in determining what the composition of the sources should be
(Christopherson and Hooper 1992). In addition, PCA principal components must be
orthogonal to each other and the first principal component is chosen to explain as
much of the variance as possible. Such restrictions may not be realistic or desired.
Our approach allows us to explain most of the variance without forcing the columns
of the composition matrix to be orthogonal, and without assuming that one particular
source causes most of the variance in the data.
It is our goal to apply an optimization procedure that minimizes residual values
using an interior point algorithm to a situation where the composition matrix is
unknown. Of importance to our methodology is the enforcing of constraints on the
individual components of A and X. This differentiates our approach from PCA or
sparse PCA techniques, which can suppose some constraints, but not on individual
components (Hunter and Takane 2002; Takane and Hunter 2001; Zou et al. 2006).
Thus, our methodology optimizes A and X to deliver the best fit to known data,
subject to certain constraints.
Our approach is motivated by an application where composition data are impossi-
ble to determine. Specifically, we will apply our methodology to the analysis of data
gathered by proton magnetic resonance spectroscopy (1H MRS) from the neonatal
brain in order to determine the relative level of density of multiple populations
of cells in different regions of the brain. To our knowledge, this is the first time
such an analysis has been performed. 1H MRS is a non-invasive spectroscopic
technique that allows the measurement of several brain metabolites, and has been
used to evaluate the early developing brain (Kreis et al. 2002; Pouwels et al. 1999).
One of the strengths of MRS is its promise to identify and characterize various
cellular compartments in the developing brain, such as neurons, glial cells (Brand
et al. 1993; Guimaraes et al. 1995), and possibly neural stem cells (Manganas et al.
2007). In the clinical setting, 1H MRS has found application in the evaluation and
diagnosis of hypoxia (Ancora et al. 2010; Cheong et al. 2006; van Doormaal et al.
2012), shaken baby syndrome (Haseler et al. 1997), metabolic diseases (Befroy and
Shulman 2011), the effects of preterm delivery on the brain (Wang et al. 2008), and
many other brain disorders. 1H MRS has also been used to study brain biochemical
changes and maturation in healthy adults and neonates (Kirov et al. 2008). However,
the interpretation of MRS findings in terms of anatomical and physiological aspects
of the brain can be difficult since the data reflect a heterogeneous distribution of
cells within a specific region of interest.
In this particular application, we cannot determine the composition matrix A
since pure populations of specific neural cell types are not found in vivo, and
invasive measurements are impossible. We will assess for two major cell types,
characterized by tentative metabolic markers for glia and neurons, and attempt to
determine the density level of each type in two separate regions of the neonatal brain.
Further work is ongoing to examine several other regions and disease conditions.
Our methodology therefore seeks to simultaneously determine A, the matrix of
metabolic concentration per unit of brain cell density, and X, the matrix of brain
cell density in each MRS experiment, given C, a matrix of metabolic concentrations
measured by MRS. The goal is to optimize the norm of a percentage error matrix
calculated from $E = C - AX$, subject to certain known constraints, such as
non-negativity for elements of $A$ and $X$.
In this paper, we will first introduce our methodology in general terms.
We will then briefly describe the experimental procedure for the MRS study,
before applying the methodology to the specific data acquired and analyzing the
results.

8.2 Methodology

We introduce the mathematical model to translate measured component values
(such as total metabolite or chemical ion concentrations) into relative contributions
of sources (such as cell density or chemical solutions). We assume conservation
of all components in the mixture. For example, there are no chemical reactions
or precipitation. Let us consider a general case with m measurements taken of p
components, with n sources.
We introduce $\alpha_i = (\alpha_i^1, \alpha_i^2, \ldots, \alpha_i^p)$, the component spectrum per unit of source
$i$. Thus, $\alpha_1^2$ represents the amount of component 2 per unit of source 1. Given
measurement $k$, the contribution of source $i$ will be denoted $x_{ki}$.
Our model is based on the assumption that for a given measurement $k$, the
measured value for component $j$, $c_{k,j}$, is obtained as a linear superposition of
the component spectra $\alpha_i$ adjusted by the contribution of each source. In matrix
notation, $AX = C$ where $A$ is the matrix of component spectra and $X$ the matrix of
contributions by each source.
In practice, we usually have an overdetermined system. If the composition matrix
$A$ is known, a least squares approach is used, and it is well known that the best
solution to $AX = C$ is given by $X^* = (A^T A)^{-1} A^T C = A^+ C$, where $A^+$ is the
Moore-Penrose pseudo-inverse (i.e. if $c$ represents a column vector of the matrix $C$, then the
minimal residual of $c - Ax$ is attained at $x = A^+ c$).
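When A is known, the least squares solution can be computed directly from the normal equations. The sketch below does this in plain Python for small matrices; the helper names (`transpose`, `matmul`, `solve`, `least_squares_x`) and the numerical values are hypothetical, and forming A^T A explicitly is chosen here only because it mirrors the formula in the text (for ill-conditioned A, a QR- or SVD-based solver is the usual numerical choice).

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def solve(m, rhs):
    """Solve m @ z = rhs for square m by Gauss-Jordan elimination
    with partial pivoting; rhs may have several columns."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("A^T A is singular; det(A^T A) != 0 is required")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def least_squares_x(a, c):
    """X* = (A^T A)^{-1} A^T C, obtained by solving A^T A X = A^T C."""
    at = transpose(a)
    return solve(matmul(at, a), matmul(at, c))


# Hypothetical 3-component, 2-source system with an exact solution.
A = [[0.80, 0.60], [0.10, 0.30], [0.10, 0.10]]
C = [[2.2, 2.6], [0.5, 1.0], [0.3, 0.4]]
X_star = least_squares_x(A, C)   # close to [[2, 1], [1, 3]]
```

In this example the system is overdetermined (three equations per column, two unknowns) but consistent, so the least squares solution reproduces C exactly; with noisy measurements the same call returns the X that minimizes the residual norm.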
However, when A is unknown and we can impose the number of sources, we
must take a different approach. To account for the fact that some components may
have a significantly smaller value than others, we will minimize the percentage error
and not the absolute error. A normalization constraint is imposed on the columns of
A, to prevent scaling from becoming an issue in the optimization. Several different
constraints can be used, such as setting the sum of squares to 1 or requiring columns
of A to be orthogonal, similar to the PCA procedure. Then our problem becomes:
Given the measured component values for our set of m measurements, find the
best fit in terms of component spectra and source contribution that minimizes the
percentage residual.
To find a solution, X must solve the normal equations associated with the
equation $AX = C$:

$$A^T A X = A^T C \tag{8.1}$$

The matrix $A^T A$ is a symmetric n × n matrix, and is invertible provided that
$\det(A^T A) \neq 0$ (which is assumed in the sequel). Our problem can then be
reformulated as follows. For a given C, determine A such that we minimize the norm
of the percentage residual matrix $E_\%$ with respect to C, where $E_\%$ is calculated from
the absolute error matrix, E, given by:

$$A A^\dagger C - C = E \tag{8.2}$$

Then the entries of $E_\%$ are given by $e^{\%}_{k,j} = 100\,\% \times e_{k,j} / c_{k,j}$ for metabolite j and
measurement k.
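
A short Python sketch of the percentage residual follows. The numbers are hypothetical and only illustrate why the percentage error is preferred: two components with the same absolute error contribute very differently once each error is scaled by the measured value.

```python
def percentage_residuals(fitted, measured):
    """Entrywise percentage residuals 100 * (fitted - measured) / measured."""
    return [[100.0 * (f - c) / c for f, c in zip(f_row, c_row)]
            for f_row, c_row in zip(fitted, measured)]

def sum_of_squares(matrix):
    """Sum of the squared entries of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)

# Hypothetical measured and fitted values for one large and one small component.
measured = [[8.0, 2.0]]
fitted = [[8.4, 1.6]]

E_pct = percentage_residuals(fitted, measured)   # about 5 % and -20 %
objective = sum_of_squares(E_pct)
```

Both entries have an absolute error of 0.4, but the smaller component dominates the percentage objective.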
8 Optimization of Multivariate Inverse Mixing Problems with Application to. . . 159

Geometrically, this can be interpreted as follows. The operator $P = AA^\dagger$ is the
orthogonal projection (i.e. $P^2 = P$ and $P^T = P$) onto the space generated by the columns of A.
Our goal is therefore to identify a matrix A that minimizes the sum of the squares of
the percentage residuals (orthogonal to the span of A) over all the measurements
(i.e. when applied to all column vectors of the matrix C). Once a matrix A is
known we can determine the matrix X of source contributions using $X = A^\dagger C$.
This methodology is in contrast to the technique used in Christopherson and Hooper
(1992), which is limited to the space spanned by the most relevant axes given by a
PCA analysis. Conversely, our optimization evaluates the entire space.
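
The projection interpretation can be checked numerically. The sketch below uses a hypothetical 3 × 2 matrix and plain Python helpers (all names are illustrative) to confirm that $P = AA^\dagger$ is idempotent and that the residual $Pc - c$ is orthogonal to the columns of A.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def pinv_two_columns(A):
    """Pseudo-inverse (A^T A)^{-1} A^T of a p x 2 matrix with independent columns."""
    At = transpose(A)
    (g11, g12), (_, g22) = matmul(At, A)
    det = g11 * g22 - g12 * g12
    inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]
    return matmul(inv, At)

A = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]     # hypothetical composition matrix
P = matmul(A, pinv_two_columns(A))           # projector onto the span of A
P2 = matmul(P, P)
idempotent_gap = max(abs(P2[i][j] - P[i][j]) for i in range(3) for j in range(3))

c = [[1.0], [2.0], [0.0]]                    # hypothetical measurement (column)
residual = [[pc[0] - ci[0]] for pc, ci in zip(matmul(P, c), c)]
At_residual = matmul(transpose(A), residual) # should vanish up to rounding
```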
Depending on the situation, partial information may be known about the values
of A that can be incorporated into the model. For example, negative $\alpha$ or x values
frequently do not occur in nature, for instance when concentrations are involved.
It may also be known that certain components are absent from a particular source.
These can be introduced as constraints on the optimization problem.
To simplify our notations further, we introduce
$y = (\alpha_1^1, \alpha_1^2, \ldots, \alpha_2^1, \alpha_2^2, \ldots, \alpha_n^p) \in \mathbb{R}^{np}$.
Let $E_\%(y)$ denote the percentage error matrix obtained for A determined by y.
Then we wish to minimize $f(y) = \|E_\%(y)\|$, the $L^2$-norm of $E_\%$, subject to equality
and inequality constraints on A. Our optimization problem becomes:

$$\min_y f(y), \qquad f(y) = \|E_\%(y)\| \tag{8.3}$$

subject to the following constraints:

$$h_i = 0 \tag{8.4}$$

$$g_j \geq 0 \tag{8.5}$$

with equality constraints $h_i$ and inequality constraints $g_j$.


This constrained optimization problem can be solved with a variety of available
computer software. We use MATLAB R2010b and its optimization tool with
the fmincon solver and the interior point algorithm. The interior point algorithm
attempts to solve the constrained optimization problem by first replacing the
inequality constraints with equality constraints by introducing slack variables, s, and
introducing a new term to the cost function. For example, the inequality constraint
in Eq. 8.5 becomes $g_j - s_j = 0$ with $s_j \geq 0$. Then for each $\mu > 0$, Eq. 8.3 is approximated by

$$\min_{y,s} f_\mu(y, s) = \min_{y,s} \; f(y) - \mu \sum_q \ln(s_q) \tag{8.6}$$

and this is subject to only equality constraints, which is an easier class of problem
to solve. Note that as $\mu$ approaches 0, the minimum in Eq. 8.6 should approach
the minimum in Eq. 8.3. Solving the approximated problem is done by taking a
sequence of steps using one of two methods. The first, and default, method linearizes
the Lagrangian of the approximated problem and attempts to find a solution that
satisfies the Karush-Kuhn-Tucker (KKT) conditions (Byrd et al. 2000). If this
fails, then a conjugate gradient method is used, which attempts to minimize a
quadratic approximation to the problem within a trust region (a neighborhood of
radius R defined by the user and shrunk if no good solution can be found) (Byrd
et al. 1999; Waltz et al. 2006). This minimization is done subject to linearized
constraints. These two methods are repeated until a solution satisfying the stopping
criterion is determined. More specifics can be found in the MATLAB documentation
(Constrained Nonlinear Optimization Algorithms 2014).
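
The effect of the barrier term can be seen on a one-dimensional toy problem. The sketch below is not the fmincon implementation: it assumes the hypothetical problem of minimizing $(y+1)^2$ subject to $y \geq 0$, takes the slack to be $s = y$, and uses the closed-form minimizer of the barrier function to show the iterates approaching the constrained minimum $y = 0$ as $\mu$ decreases.

```python
import math

def barrier_minimizer(mu):
    """Minimizer over y > 0 of (y + 1)^2 - mu * ln(y) for a toy problem.

    Setting the derivative 2 (y + 1) - mu / y to zero gives
    2 y^2 + 2 y - mu = 0, whose positive root is returned.
    """
    return (-1.0 + math.sqrt(1.0 + 2.0 * mu)) / 2.0

def barrier_gradient(y, mu):
    """Derivative of the barrier function, used to check stationarity."""
    return 2.0 * (y + 1.0) - mu / y

# Decreasing barrier parameters and the corresponding minimizers.
mus = [1.0, 0.1, 0.01, 0.001]
path = [barrier_minimizer(mu) for mu in mus]
```

Each point on `path` is strictly feasible, and the sequence decreases monotonically toward the boundary point that solves the original constrained problem.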
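Verification of the local optimization can be done by checking sufficient
conditions for local minima. The Lagrangian function associated with our problem
(assuming no further constraints) is given by
$\mathcal{L}(y, \lambda) = f(y) - \sum_i \lambda_i h_i - \sum_j \lambda_j g_j$,
and the second order sufficient condition for a local minimum $y^*$ is that there exist
Lagrange multipliers $\lambda^*$ such that

$$D_y \mathcal{L}(y^*, \lambda^*) = 0 \quad \text{where} \quad \lambda_j^* \geq 0 \ \ \forall j, \quad \lambda_j^* g_j(y^*) = 0 \tag{8.7}$$

(the sign and complementarity conditions apply to the multipliers of the inequality
constraints $g_j$) and

$$z^T \nabla^2 \mathcal{L}(y^*, \lambda^*) z > 0 \quad \forall z \in T', \ z \neq 0$$

where $T' := \{ v : \nabla h_i(y^*) v = 0 \ \forall i \}$.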
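
As an illustration of these conditions, the following sketch checks them by hand on a hypothetical two-variable problem with a single equality constraint: minimize $f(y) = -y_1 - y_2$ subject to $h(y) = y_1^2 + y_2^2 - 1 = 0$. The candidate point and multiplier are assumptions of this example, not values from the study.

```python
import math

# Candidate minimizer and multiplier for the toy problem
#   minimize f(y) = -y1 - y2   subject to   h(y) = y1^2 + y2^2 - 1 = 0.
y = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
lam = -1.0 / math.sqrt(2.0)     # equality multipliers may have either sign

# First-order condition: grad f - lam * grad h = 0.
grad_f = (-1.0, -1.0)
grad_h = (2.0 * y[0], 2.0 * y[1])
grad_L = tuple(gf - lam * gh for gf, gh in zip(grad_f, grad_h))

# Hessian of the Lagrangian: f is linear, so it equals -lam * Hessian(h) = -2 lam I.
hess_L = [[-2.0 * lam, 0.0], [0.0, -2.0 * lam]]

# A direction z in the tangent space, i.e. orthogonal to grad h.
z = (-grad_h[1], grad_h[0])
tangency = sum(gh * zi for gh, zi in zip(grad_h, z))
curvature = sum(z[i] * hess_L[i][j] * z[j] for i in range(2) for j in range(2))
```

Here the gradient of the Lagrangian vanishes and the curvature along the tangent direction is positive, so the candidate satisfies the second order sufficient condition.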


The Lagrangian multipliers can be calculated numerically using MATLAB or
several other software programs.
When using software to numerically solve optimization problems, it is important
to recognize that most solvers determine only local solutions, not global ones. To
test if our local minima are in fact global minima, MATLAB’s GlobalSearch solver
was used. Simply put, the GlobalSearch algorithm generates a number of test points
to use as initial starting points for the fmincon solver. The algorithm assumes that
any local minima found by the fmincon solver have spherical basins of attraction,
with radius equal to the Euclidean distance from the local minimum to its associated
starting point. As the algorithm steps through the list of test points, it discards any
that are found to be in existing basins. At the end, it reports the local minimum
with the smallest cost function among the solutions calculated from the test points.
More details can be found in the MATLAB documentation (How GlobalSearch and
MultiStart Work 2014).
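
The basin-of-attraction idea can be sketched in a few lines. The code below is a deliberately simplified stand-in, not MATLAB's GlobalSearch: it assumes a hypothetical one-dimensional cost function with two local minima, uses plain gradient descent in place of fmincon, and takes an evenly spaced list of start points instead of the generated test points.

```python
def f(x):
    """Hypothetical cost function with two local minima."""
    return x**4 - 3.0 * x**2 + x

def df(x):
    return 4.0 * x**3 - 6.0 * x + 1.0

def local_minimize(x0, step=0.01, iters=5000):
    """Plain gradient descent, standing in for a local solver."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x

def multistart(starts):
    """Run the local solver from start points not covered by a known basin."""
    basins = []   # pairs (local minimizer, basin radius)
    for x0 in starts:
        # Discard start points that fall inside an existing basin.
        if any(abs(x0 - xm) < radius for xm, radius in basins):
            continue
        xm = local_minimize(x0)
        # Basin radius: distance from the start point to its local minimum.
        basins.append((xm, abs(x0 - xm)))
    # Report the local minimum with the smallest cost.
    return min((xm for xm, _ in basins), key=f)

starts = [-2.0 + 0.5 * k for k in range(9)]   # -2.0, -1.5, ..., 2.0
best = multistart(starts)
```

With these start points the search visits both local minima and reports the one with the lower cost, located at a negative value of x.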

8.3 Application to Analysis of Brain Metabolites

In this section we apply our methodology to the analysis of brain metabolite data.
Numerous studies have measured the concentrations of major brain metabolites
using 1H MRS, to evaluate brain biochemical changes and maturation in adults and
neonates (Kreis et al. 1993). However, the interpretation in terms of the health or
specific status of the brain cells is unclear since the data are obtained from a region
with a heterogeneous mixture of cells.

Fig. 8.1 Age division for each subject visit. Box and whisker plots represent the minimum, first
quartile, median, third quartile, and maximum value of all subject data

8.3.1 Human Subject Studies

Nine healthy newborns were studied, two boys and seven girls. Pregnant mothers
were recruited from the maternity ward at the Queen’s Medical Center in Honolulu
and through physician referrals. Each parent or legal guardian signed an informed
consent form approved by our Institutional Review Board, and completed detailed
interviews regarding their medical and drug use histories. Mothers were 18 years or
older at the time of giving birth, and had minimal or no drug use or any other prenatal
complications during pregnancy or perinatal problems during delivery. All babies
were at or near full term for gestational age (36 weeks or more), and were evaluated
thoroughly to ensure they were healthy. Each neonate was scanned three times:
within 1 week of birth, and at approximately 1 and 2 months thereafter. Although
many more neonates were studied, only nine infants with complete datasets for all
five metabolites and good quality data for both brain regions studied are presented.
The age distribution of these nine infants at each visit is shown in Fig. 8.1.

8.3.2 MRI and Localized 1H MRS

MRI studies were performed on a Siemens Trio 3.0 T scanner while the infants
slept (typically after nursing) and were unsedated. All babies had a sagittal
3D-magnetization prepared rapid acquisition by gradient echo (MP-RAGE)
sequence. Also, a T2-weighted 3D-SPACE sequence was acquired to ensure no
lesions were present. Based on anatomical landmarks in the MP-RAGE scan,
spectra were acquired in the right basal ganglia (BGR, 6.0 cm³) and frontal
white matter, right side (FWR, 5.0 cm³); see Fig. 8.2. A short echo-time Point
RESolved Spectroscopy (PRESS) acquisition sequence (repetition time/echo
time = 3,000/30 ms, 2.5 min acquisition) was used (Chang et al. 1996), and
metabolite concentrations for five major metabolites were determined as described
previously (Kreis et al. 1993, 2002).

Fig. 8.2 Neonate spectroscopic voxel locations shown on the MPRAGE images in all three
orientations (top left: coronal; bottom left: sagittal; top and bottom right: axial)

For each subject and each time point, a
complete MRS data set was available, which included the concentrations of:
• N-acetyl compounds (NAA): This metabolite is exclusively found in the nervous
system (peripheral and central) and is detected in both grey and white matter. It
is thought to be a marker of neuronal and axonal viability and density. Decreased
concentration of NAA is a sign of neuronal loss or degradation (Cheong et al.
2006; Urenjak et al. 1992, 1993).
• Total creatine (tCR): The role of tCR is to supply energy to all cells in the body,
including brain cells (Cheong et al. 2006).
• Choline-containing compounds (CHO): A marker of nerve signaling, myelin, and
cellular membrane turnover which can also reflect cellular proliferation. Reduced
CHO may be related to delayed myelination or apoptosis (Cheong et al. 2006).
• Myoinositol (MI): This sugar moiety is considered a glial marker because it is
primarily synthesized in glial cells, both in microglia and in astroglial cells.
Elevated MI occurs with proliferation of glial cells or with increased glial cell
size, as found in inflammation, and may reflect glial activation accompanying
neuronal dysfunction or loss. MI is also thought to play an important role with
its high concentration in normal fetal brain development (Blüml et al. 2013).
• Glutamate+glutamine (GLX): The GLX signal represents a combination of
glutamate (GLU) and glutamine (GLN), but is dominated by GLU, which is an
important excitatory neurotransmitter found throughout the brain (Mangia et al.
2012).

8.4 MRS Findings

Table 8.1 and Fig. 8.3 show the means and standard deviations of the
metabolite concentrations in the basal ganglia, right side (BGR); Table 8.1 also
provides the data for the frontal white matter, right side (FWR). We note that
with the exception of [MI] in visits II and III, the metabolite concentrations are
higher in the BGR compared to the FWR. We can compare the measurements from
subjects in Visit I to those obtained in Kreis et al. (2002) for full term neonates
(38 weeks < GA < 43 weeks). In Kreis et al. (2002), all metabolite concentrations
in the ROI placed in the centrum semiovale for developing white matter (based on 11
subjects) were higher compared to our values in the frontal white matter (based on
nine subjects). Indeed, in their paper they obtained [NAA]: 3.5 ± 0.5, [tCR]: 4.8 ± 0.6,
[CHO]: 2.3 ± 0.1, [MI]: 5.9 ± 0.7 and [GLX]: 6.3 ± 1.1.

Table 8.1 Mean and standard deviation of metabolic concentrations in the two regions of interest
by visit

Visit  Region  [NAA]        [tCR]        [CHO]        [MI]         [GLX]
I      BGR     4.23 ± 0.36  5.17 ± 0.58  2.19 ± 0.13  4.80 ± 0.39  7.80 ± 1.10
I      FWR     2.85 ± 0.33  3.10 ± 0.24  1.83 ± 0.21  4.54 ± 0.49  5.90 ± 0.85
II     BGR     4.85 ± 0.19  5.59 ± 0.28  1.90 ± 0.18  4.00 ± 0.64  7.96 ± 1.15
II     FWR     3.71 ± 0.52  3.40 ± 0.52  1.62 ± 0.21  4.06 ± 0.71  5.95 ± 0.78
III    BGR     5.06 ± 0.51  5.62 ± 0.55  1.73 ± 0.22  3.13 ± 0.60  8.01 ± 1.03
III    FWR     4.70 ± 0.49  3.76 ± 0.58  1.53 ± 0.17  3.28 ± 0.70  7.00 ± 0.77

Fig. 8.3 Metabolite concentrations in the BGR region. Nine subjects were each scanned three
times. Visit I scans took place between 39.6 and 41.1 weeks (postmenstrual age), Visit II scans
between 43.9 and 46.7 weeks, and Visit III scans between 48.0 and 52.7 weeks

8.4.1 Regional Variations and Age Dependence

Graphs for each of the five metabolite concentrations over time for the two ROIs are
displayed in Fig. 8.4. The average rate of growth for each metabolite on the graph
is shown in Table 8.2. There is a greater increase over time in [NAA], [tCR], and
[GLX] levels in the FWR compared to the BGR, and a greater decrease in [CHO]
and [MI] levels in the BGR compared to the FWR. We note the smallest difference in
growth between the two regions is for [CHO], and the greatest difference in growth
rate is for [GLX].
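
The comparison of growth rates between regions is simple arithmetic on Table 8.2, sketched below. The negative signs for CHO and MI are an assumption of this example, taken from the decreases described in the text.

```python
# Average growth rates in mM/week from Table 8.2, stored as (BGR, FWR).
# Negative signs for CHO and MI are assumed from the decreases described in the text.
rates = {
    "NAA": (0.0701, 0.1862),
    "tCR": (0.0378, 0.0715),
    "CHO": (-0.0403, -0.0263),
    "MI": (-0.1731, -0.1389),
    "GLX": (0.0116, 0.1486),
}

# Absolute difference in growth rate between the two regions.
gap = {name: abs(bgr - fwr) for name, (bgr, fwr) in rates.items()}
smallest = min(gap, key=gap.get)
largest = max(gap, key=gap.get)
```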
A two-factor (age and region) repeated measures ANOVA was conducted
on the metabolite data (Table 8.3). All metabolites except MI had a
significant regional dependence, and all metabolites except GLX had significant age
dependence. Only NAA showed a significant interaction effect between region and
age.

Fig. 8.4 Graphs for each of the five metabolite concentrations over time. Each of the five vertical
planes displays graphs of a specific metabolite concentration vs time for the two ROI. Since a
traditional line of best fit does not take into account the dynamics of the data within a subject
over time, we use a new fitting procedure. We calculate the slope between sequential data points
(subjects with three measurements in the same ROI have two pairs of sequential points). We set
an initial point using the average age and average metabolite concentration of Visit I scans. To
create the fit, we use a variation of Euler’s method: taking a small step along the x-axis (age), and
using the average of the calculated slopes at that age to find the change in the y-value (metabolite
concentration), then repeating the process to create a piecewise linear function that fits the data
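
The fitting procedure described in the caption of Fig. 8.4 can be sketched as follows. This is one possible reading, not the authors' code: it assumes that "the calculated slopes at that age" means the slopes of all subject segments whose age interval contains the current age, and the two subjects below are hypothetical.

```python
def euler_fit(subjects, t0, y0, t_end, dt=0.1):
    """Piecewise-linear fit built by stepping along the age axis.

    subjects: one list per subject of (age, value) pairs sorted by age.
    (t0, y0): initial point, e.g. the Visit I averages.
    """
    # One (start age, end age, slope) triple per pair of sequential points.
    segments = []
    for points in subjects:
        for (ta, ya), (tb, yb) in zip(points, points[1:]):
            segments.append((ta, tb, (yb - ya) / (tb - ta)))

    t, y = t0, y0
    curve = [(t, y)]
    while t < t_end - 1e-12:
        # Average the slopes of the segments that span the current age.
        active = [s for ta, tb, s in segments if ta <= t <= tb]
        slope = sum(active) / len(active) if active else 0.0
        step = min(dt, t_end - t)
        t, y = t + step, y + slope * step
        curve.append((t, y))
    return curve

# Hypothetical data: two subjects, three visits each (age in weeks, concentration).
subjects = [
    [(40.0, 4.0), (45.0, 5.0), (50.0, 5.5)],
    [(40.0, 4.4), (45.0, 4.9), (50.0, 5.9)],
]
curve = euler_fit(subjects, t0=40.0, y0=4.2, t_end=50.0)
```

In this example the averaged slope is 0.15 per week throughout, so the fitted curve rises from 4.2 at 40 weeks to about 5.7 at 50 weeks.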

Table 8.2 Metabolite concentration growth rates. Average rate of change for metabolite
concentration vs age in each region [mM/week]

Region  NAA     tCR     CHO      MI       GLX
BGR     0.0701  0.0378  -0.0403  -0.1731  0.0116
FWR     0.1862  0.0715  -0.0263  -0.1389  0.1486

Table 8.3 Repeated measures 2-Factor ANOVA, p values

Metabolite    [NAA]   [tCR]   [CHO]   [MI]    [GLX]
Region        <0.001  <0.001  <0.001  0.9098  <0.001
Age           <0.001  0.001   <0.001  <0.001  0.0538
Region × age  0.0042  0.5950  0.4671  0.5711  0.1393

8.5 Cell Types and Distribution in Each Brain Region

We apply our methodology to a simple model with two populations, differentiated
based on some of their metabolite characteristics, and will calculate their relative
cellular density in the BGR and FWR as a function of age. The term “relative cellular
density” is rather vague at this stage; a possible interpretation for our
model is discussed in the results section.
Since NAA and MI are theorized to be markers for neurons and glia, respectively (Brand
et al. 1993; Guimaraes et al. 1995), we adopt an approach with these two
populations of cells. We will call the glial population type I and the neuronal
population type II. Cells of type
I are characterized by a negligible concentration of NAA, a common assumption for
glia, and cells of type II are characterized by negligible MI concentration, a common
assumption for neurons.
We have a total of m = 27 measurements for the BGR and m = 27 measurements
for the FWR. In our model, each measurement is treated independently even though
it might represent a repeat measurement in a given subject. Based on our analysis
of metabolite concentrations in Sect. 8.4 and Table 8.3, we will consider region-
specific composition matrices. Thus, each region will have its own composition
matrix A. The methodology for setting up the model for each region is essentially
the same, however.
The matrix C is a 5 × m matrix as follows. Each column of C corresponds
to one measurement, and therefore contains five values: one for each metabolite
concentration. We order the concentrations as follows: [NAA], [tCR], [CHO], [MI],
[GLX]; therefore, $\alpha_1^1$ reflects the NAA concentration per unit of density in type I
cells. Because of the exclusivity of the NAA and MI markers, we set $\alpha_1^1 = 0$ and
$\alpha_2^4 = 0$, and can therefore remove them as variables.
Since n = 2, the composition matrix A is 5 × 2. Since negative values would
be unrealistic, we add the non-negative constraints that $\alpha_i^j \geq 0$ and $x_i^k \geq 0$ for
each i, j, k. With the columns of A normalized to unit length, as imposed by
Eqs. (8.10)–(8.11), we can compute explicitly the elements of the residual matrix
$E = (e_{i,j}(\alpha))_{1 \leq i \leq 5,\ 1 \leq j \leq m}$ defined by Eq. (8.2):

$$e_{i,j}(\alpha) = \sum_{k=1}^{5} \frac{c_{k,j}}{\det(A^T A)} \left[ \alpha_1^k \left( \alpha_1^i - \alpha_2^i \sum_{l=1}^{5} \alpha_1^l \alpha_2^l \right) + \alpha_2^k \left( -\alpha_1^i \sum_{l=1}^{5} \alpha_1^l \alpha_2^l + \alpha_2^i \right) \right] - c_{i,j} \tag{8.8}$$
As in Eq. (8.3), our optimization problem becomes:

$$\min_y f(y), \qquad f(y) = \sum_{i,j} \left( e^{\%}_{i,j}(y) \right)^2 \tag{8.9}$$

subject to the constraints:

$$h_1(y) = \sum_{i=1}^{4} (y_i)^2 - 1 = 0 \tag{8.10}$$

$$h_2(y) = \sum_{i=5}^{8} (y_i)^2 - 1 = 0 \tag{8.11}$$

$$g_i(y) = y_i \geq 0, \quad i = 1, \ldots, 8 \tag{8.12}$$

$$G_{1,j}(y) = \frac{1}{\det(A^T A)} \sum_{k=1}^{5} c_{k,j} \left( a_{1k} - a_{2k} \sum_{m=1}^{5} a_{1m} a_{2m} \right) \geq 0 \tag{8.13}$$

$$G_{2,j}(y) = \frac{1}{\det(A^T A)} \sum_{k=1}^{5} c_{k,j} \left( a_{2k} - a_{1k} \sum_{m=1}^{5} a_{1m} a_{2m} \right) \geq 0 \tag{8.14}$$

where Eqs. (8.10)–(8.11) are the $L^2$-normalization for each column of A, Eq. (8.12)
is the non-negativity constraint on elements of A, and Eqs. (8.13)–(8.14) are the
non-negativity constraints on cell density (here $a_{ik}$ denotes the entry of A for
source i and component k).
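
The constraints (8.10)–(8.14) can be evaluated for a given parameter vector as in the sketch below. The ordering of the eight free parameters in `y` is an assumption of this example (type I entries first, then type II, skipping the entries fixed at zero). The test values are the BGR column entries of Table 8.4 and the Visit I BGR mean concentrations of Table 8.1.

```python
def build_columns(y):
    """Map 8 free parameters to the two columns of A (rows: NAA, tCR, CHO, MI, GLX)."""
    type1 = [0.0, y[0], y[1], y[2], y[3]]   # type I: NAA entry fixed at 0
    type2 = [y[4], y[5], y[6], 0.0, y[7]]   # type II: MI entry fixed at 0
    return type1, type2

def constraints(y, c):
    """Return h1, h2 (Eqs. 8.10-8.11) and G1, G2 (Eqs. 8.13-8.14) for one measurement c."""
    t1, t2 = build_columns(y)
    h1 = sum(v * v for v in y[:4]) - 1.0
    h2 = sum(v * v for v in y[4:]) - 1.0
    s = sum(u * v for u, v in zip(t1, t2))          # inner product of the two columns
    det = (h1 + 1.0) * (h2 + 1.0) - s * s           # det(A^T A)
    # The numerators below assume unit-norm columns, as in Eqs. (8.13)-(8.14).
    G1 = sum(ck * (u - v * s) for ck, u, v in zip(c, t1, t2)) / det
    G2 = sum(ck * (v - u * s) for ck, u, v in zip(c, t1, t2)) / det
    return h1, h2, G1, G2

# BGR entries from Table 8.4 and Visit I BGR means from Table 8.1.
y = [0.12945, 0.23189, 0.95397, 0.13936, 0.46589, 0.48874, 0.0925, 0.7318]
c = [4.23, 5.17, 2.19, 4.80, 7.80]
h1, h2, G1, G2 = constraints(y, c)
```

For these inputs both normalization constraints are satisfied to within rounding of the tabulated values, and both density constraints are positive.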
We solve the above optimization problem numerically using MATLAB R2010b.
The results of our numerical calculations can be found in Table 8.4.
The Lagrange multipliers and the Hessian of the Lagrangian were calculated
numerically with MATLAB’s fmincon solver for all metabolite spectra matrices
in Table 8.4. The Hessians were found to be strictly positive definite, therefore
satisfying the sufficient condition for local minima. We then checked for global
optima with MATLAB’s GlobalSearch. Across all regions and all starting points
chosen, the GlobalSearch solver reported the same minima and these minima were
consistent with the local minima found by the fmincon solver.
Inspection of optimized regional A values in Table 8.4 (see also Fig. 8.5) shows
that the metabolite concentration per unit of cellular density is similar for [CHO] in
both type I and type II cells for both regions examined. [tCR] is higher in the FWR
than the BGR in type I cells, but in type II cells [tCR] levels are higher in BGR than
FWR. The neuronal marker [NAA] has a higher value per unit of cellular density in
the frontal region compared to the deep grey matter, while the glial marker [MI] is
higher in the BGR compared to the FWR, suggesting a greater dependence on each
marker for their respective cell type. [GLX] has a substantially higher concentration
per unit density in the FWR compared to BGR for type I cells, whereas in type II
cells BGR has a slightly greater [GLX] value than FWR.

Table 8.4 Optimized values for the region-specific composition matrices A. For each region,
the row for metabolite X lists the entries $\alpha_1^{X}$ (type I) and $\alpha_2^{X}$ (type II) of $A_{\text{region}}$

            A_BGR               A_FWR
Metabolite  Type I   Type II    Type I   Type II
NAA         0        0.46589    0        0.61583
tCR         0.12945  0.48874    0.25013  0.3628
CHO         0.23189  0.0925     0.22591  0.08927
MI          0.95397  0          0.83338  0
GLX         0.13936  0.7318     0.43803  0.69366

Fig. 8.5 Metabolite concentration per unit cell density for each region. The left graph shows the
concentrations per unit type I density, and the right graph shows the concentrations per unit type
II density. Note the NAA concentrations for type I cells and the MI concentrations for type II are
zero across both regions due to constraints

8.5.1 Cellular Density

The matrix of relative cellular density X can now be computed from the equality
$X = A^\dagger C$, where $A^\dagger$ represents the Moore-Penrose pseudo-inverse regional matrices
(see Table 8.5). From this table it is clear that for the cells of type I, almost all the
information is embedded in the MI concentration. For the cells of type II, however,
the interplay between the various metabolites is more pronounced but dominated by
the concentrations of NAA and GLX.
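
As a consistency check, the pseudo-inverse for the basal ganglia can be recomputed from the published composition matrix. The sketch below takes the BGR columns from Table 8.4, forms $(A^T A)^{-1} A^T$ explicitly for the two-column case, and compares the result with the BGR entries of Table 8.5; the helper name `pinv_rows` is illustrative.

```python
# Columns of A_BGR from Table 8.4 (rows: NAA, tCR, CHO, MI, GLX).
type1 = [0.0, 0.12945, 0.23189, 0.95397, 0.13936]
type2 = [0.46589, 0.48874, 0.0925, 0.0, 0.7318]

def pinv_rows(a1, a2):
    """Rows of (A^T A)^{-1} A^T for the 5 x 2 matrix A with columns a1 and a2."""
    g11 = sum(u * u for u in a1)
    g22 = sum(v * v for v in a2)
    g12 = sum(u * v for u, v in zip(a1, a2))
    det = g11 * g22 - g12 * g12
    row1 = [(g22 * u - g12 * v) / det for u, v in zip(a1, a2)]
    row2 = [(g11 * v - g12 * u) / det for u, v in zip(a1, a2)]
    return row1, row2

row1, row2 = pinv_rows(type1, type2)

# BGR pseudo-inverse as reported in Table 8.5.
table_8_5_bgr = [
    [-0.09012, 0.03959, 0.22237, 0.98842, 0.00283],
    [0.48271, 0.48135, 0.05099, -0.18454, 0.73127],
]
max_gap = max(abs(a - b)
              for calc, ref in zip((row1, row2), table_8_5_bgr)
              for a, b in zip(calc, ref))
```

The recomputed entries agree with the tabulated ones to within the rounding of the published five-decimal values.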
The results show a general decrease in density of type I cells over time, and an
increase in density of type II cells. The frontal region shows a greater increase in
density for type II cells than the basal ganglia; see Fig. 8.6 and Table 8.6. Repeated
measures 2-factor ANOVA on the density levels shows both populations have a
significant dependence on region and age, but only type II cells have a significant
interaction effect between region and age; see Table 8.7.

Table 8.5 Pseudo-inverse regional matrices, where $A^\dagger$ represents the Moore-Penrose
pseudo-inverse matrix of matrix A (columns: NAA, tCR, CHO, MI, GLX)

$A^\dagger_{BGR}$
-0.09012  0.03959  0.22237   0.98842  0.00283
 0.48271  0.48135  0.05099  -0.18454  0.73127

$A^\dagger_{FWR}$
-0.30849  0.12036   0.22813   1.00653  0.18156
 0.74378  0.31288  -0.0053   -0.41747  0.61835

Fig. 8.6 Graphs of density levels for type I and type II vs age

Table 8.6 Average rate of growth for cell types I and II

         BGR    FWR
Type I   0.195  0.172
Type II  0.084  0.283

Table 8.7 p values of repeated measure 2-Factor ANOVA of density levels

              Type I          Type II
Region        0.00240         2.678 × 10^-7
Age           1.045 × 10^-6   8.683 × 10^-6
Region × age  0.8840          0.0134

Table 8.8 Mean and standard deviation of control residual values, in percent

Region  [NAA]        [tCR]        [CHO]     [MI]        [GLX]
BGR     0.11 ± 9.6   0.37 ± 7.8   1.1 ± 11  0.19 ± 1.6  1.0 ± 7.0
FWR     0.033 ± 7.7  0.096 ± 7.7  1.5 ± 13  0.37 ± 3.2  0.75 ± 5.4

8.5.2 Residuals

Finally, to validate our model with the specific A and X matrices obtained, we
can recover $\hat{C} = AX$ and compare it with the measured data C. Recall that our
optimization process seeks to minimize the sum of the squares of the percentage
differences between our calculated $\hat{C}$ values and the metabolite data C from the MRS
experiments. Table 8.8 shows the mean and standard deviation of the residuals.
The measured data C are subject to error due to noise in the MRS experiments, so
we expect some error in our results. The standard deviations of residual values are
approximately 10 % or less, which is consistent with typical errors associated with
in vivo MRS metabolite levels.
When the methodology is applied to all of the data together, rather than separately
for each region, the residuals are larger. Thus, the region-specific A matrices give a
more accurate representation.

8.5.3 Interpretation of Results

Our study demonstrates that alterations in brain metabolite profiles on 1H MRS
during early brain maturation can be represented as a multivariate mixing problem
with only two sources (cell types). Type I cells are designed to have no NAA,
and have strong contributions of MI. Therefore, type I cells most likely represent
the glial population. Interestingly, GLX had a substantial contribution to type
I (glial) compartment in the frontal white matter, but not in the basal ganglia.
Conversely, type II cells are designed to have no MI, and were found to have similar
contributions of NAA, tCR, and GLX. Therefore, type II cells most likely represent
the neuronal population.
Importantly, over the age range evaluated, a simple 2-source model is able to
explain most of the variance in the metabolite data, with residuals that approach
typical errors associated with in vivo MRS measurements (approximately 10 %).
Furthermore, a 3-source model did not result in substantial improvements in fitting
accuracy, suggesting that a glial and neuronal compartment are sufficient for
representing the measured 1H MRS data within experimental errors.
Assuming the type I compartment represents glial cells and the type II com-
partment represents the neuronal compartment, our findings are in agreement with
those of prior studies. Specifically, both regions examined showed a pronounced
increase in the neuronal compartment with age, probably representing neuronal
maturation during the first few months of life. Of note, the neuronal compartment
of the frontal white matter increased at over three times the rate compared to that
of the basal ganglia. This suggests that the basal ganglia are more mature at birth
compared to white matter, in agreement with the literature (Kostović and Jovanov-
Milošević 2006). In parallel, there is a substantial decrease in the glial (type I)
compartment over the first few months of life, perhaps due to replacement of the
glial cell compartment with maturing neurons or osmotic effects.

8.5.4 Comparison to PCA Analysis

We used MATLAB 2010b’s princomp tool to conduct a principal component
analysis (PCA) on the MRS data to compare with our results. The results are shown
in Table 8.9 and Fig. 8.7. The percentage of variance for the basal ganglia region
explained by each principal component, in decreasing order, is 47.4 %, 36.0 %,
10.6 %, 4.4 %, and 1.6 %. For the FWR region, it is 68.1 %, 20.5 %, 8.1 %, 2.6 %,
and 0.7 %. Approximately 80–90 % of the variance can be explained by the first two
components in both regions. Of note, most of the variance occurs in NAA, GLX,
and MI.
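
The statement about the first two components is a direct sum of the reported percentages, as the short check below shows; the dictionary layout is simply a convenient container for the values quoted above.

```python
# Percentage of variance explained by each principal component, as reported above.
explained = {
    "BGR": [47.4, 36.0, 10.6, 4.4, 1.6],
    "FWR": [68.1, 20.5, 8.1, 2.6, 0.7],
}

# Variance captured by the first two components in each region.
first_two = {region: sum(parts[:2]) for region, parts in explained.items()}
totals = {region: sum(parts) for region, parts in explained.items()}
```

The first two components account for roughly 83 % of the variance in the BGR and roughly 89 % in the FWR.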
There are some major differences between our model and PCA. First, we note
that the principal components cannot be used to represent the composition of our
sources (matrix A in the model), since negative composition values are unrealistic.
For instance, the potential glial principal component of the BGR region, having a
high loading of MI (Table 8.9), would also have a substantial negative contribution
($-0.33$) of NAA, which is physically impossible in a mixing situation. Conversely,
the calculated A matrices from our methodology produce results that are
non-negative, and the residual values using the model results are smaller than the PCA
results.

Table 8.9 Coefficients of first and second principal components in BGR and FWR

            BGR                 FWR
Metabolite  Comp. 1  Comp. 2    Comp. 1  Comp. 2
NAA         0.18166  -0.33287   0.62662  0.03408
Cr          0.2051   0.16135    0.30277  0.28121
CHO         0.01854  0.15223    0.02391  0.14161
MI          0.27759  0.84329    0.37771  0.84766
GLX         0.92062  0.35897    0.61028  0.42566

Fig. 8.7 Reduction of system to two PCA components for BGR and FWR. Contribution of five
measured metabolites to each component shown as vectors (two panels, BGR and FWR;
horizontal axis: Component 1, vertical axis: Component 2, both ranging from -1 to 1)

Table 8.10 Calculated values for A without non-negativity constraints and zero constraints

            BGR                 FWR
Metabolite  Type I   Type II    Type I   Type II
NAA         0.02263  0.46374    0.16187  0.61143
Cr          0.15187  0.48950    0.17478  0.36445
CHO         0.23401  0.09748    0.22052  0.09245
MI          0.94423  0.02221    0.90005  0.01410
GLX         0.17348  0.73167    0.29074  0.69612
The principal components in PCA are calculated to optimally explain variance in
the data. Our calculated A matrices are chosen to minimize residual values, but still
explain most of the variance in the data. In the BGR, type I and type II compositions
(the first and second columns of A) explain 30.1 % and 37.6 % of the variance,
respectively. In the FWR, type I and type II compositions explain 20.4 % and 60.4 %
of the variance, respectively.
If the non-negative constraints and zero constraints on NAA and MI are removed
from our analysis, we obtain the solutions in Table 8.10. Comparing the first and
second PCA components in Table 8.9 to the Type II and Type I results, respectively,
for the model with reduced constraints, the FWR results are similar, while the BGR
results are less comparable.

8.6 Conclusion

In this paper, we have applied a new methodology to solve an inverse mixing
problem when source composition is largely unknown or uncertain. The model
allows us to calculate an optimal solution in terms of residual values, while
incorporating known information or constraints on source composition.
In comparison to other techniques that are used to solve inverse mixing problems,
the model presented has advantages in specific situations. One major strength of this
model is that the composition of the sources does not need to be known. This is
useful in situations where such measurements are infeasible or when experimental
measurements are found to be insufficient or suspect. The model can therefore
be used to create an “ideal” composition matrix. This “ideal” matrix has the
further benefit of being reported in terms of the components of interest, unlike
PCA approaches which tend to determine composition in terms of the new basis
determined by the PCA analysis. This makes understanding the composition matrix
simpler, as the contribution of each source to the final product is known. Thus this
“ideal” matrix is easy to compare with known data, if any exist. The model also
has the advantage of allowing both equality and inequality constraints. Even if one
does not know the exact composition of the sources, the constraints allow one to
incorporate what information is known about the situation. They can be used to
ensure certain proportions of components, or the absence of certain components in
some sources, or upper or lower bounds of components. Finally, the optimization
approach used gives a best fit to data, even when a significant amount of random
error is present due to noise.
In ongoing work, we are expanding our analysis to MRS data gathered in
additional brain regions and in a much larger sample size of normally developing
neonates. We are also applying this novel method to study neonates whose mothers
smoked tobacco cigarettes or used illicit drugs during their pregnancies in order to
compare the brain development between drug-exposed and non-exposed neonates.
This new analysis method can be applied to all MRS studies, especially those
involving the same brain metabolites evaluated here.

Acknowledgements We are grateful to the research participants in this study. We also thank all
of our clinical and technical research staff who helped with the data collection (S. Buchthal, A.
Hernandez, E. Cunningham, H. Johansen, J. Skranes, R. Yamakawa).

Funding: This work was supported by the National Institute on Drug Abuse (K24-DA016170;
K02-DA016991; 1R01 DA021146), the National Institute on Minority Health and Health Dis-
parities (8G12-MD007601-27), the National Institute of Neurological Diseases and Stroke (U54-
NS056883) and the Office of National Drug Control Policy. M. Chyba and A. Tamura-Sato were
partially supported by the National Science Foundation (NSF) Division of Graduate Education,
award #0841223, and the NSF Division of Mathematical Sciences, award #1109937.

References

Ancora G, Soffritti S, Lodi R, Tonon C, Grandi S, Locatelli C, Nardi L, Bisacchi N, Testa C, Tani
G, Ambrosetto P, Faldella G (2010) A combined a-EEG and MR spectroscopy study in term
newborns with hypoxic-ischemic encephalopathy. Brain Dev 32:835–842
Befroy DE, Shulman GI (2011) Magnetic resonance spectroscopy studies of human metabolism.
Diabetes 60(5):1361–1369
Blüml S, Wisnowski JL, Nelson Jr. MD, Paquette L, Gilles FH, Kinney HC, Panigrahy A (2013)
Metabolic Maturation of the Human Brain from Birth Through Adolescence: Insights from In
Vivo Magnetic Resonance Spectroscopy. Cerebral Cortex 23(12):2944–2955
Brand A, Richter-Landsberg C, Leibfritz D (1993) Multinuclear NMR studies on the energy
metabolism of glial and neuronal cells. Dev Neurosci 15:289–298
Byrd RH, Hribar M, Nocedal J (1999) An interior point algorithm for large-scale nonlinear
programming. SIAM J Optim 9(4):877–900
Byrd RH, Gilbert JC, Nocedal J (2000) A trust region method based on interior point techniques
for nonlinear programming. Math Program 89(1):149–185
Cantrell CA (2008) Technical note: review of methods for linear least-squares fitting of data and
application to atmospheric chemistry problems. Atmos Chem Phys 8:5477–5487
Chang L, Ernst T, Poland RE, Jenden DJ (1996) In vivo proton magnetic resonance spectroscopy
of the normal aging human brain. Life Sci 58(22):2049–2056
Cheong JLY, Cady EB, Penrice J, Wyatt JS, Cox IJ, Robertson NJ (2006) Proton MR spectroscopy
in neonates with perinatal cerebral hypoxic-ischemic injury: metabolite peak-area ratios,
relaxation times, and absolute concentrations. Am J Neuroradiol 27:1546–1554
Christopherson N, Hooper RP (1992) Multivariate analysis of stream water chemical data: the
use of principal components analysis for the end-member mixing problem. Water Resour Res
28(1):99–107
Christopherson N, Neal C, Hooper RP, Vogt RD, Andersen S (1990) Modelling streamwater
chemistry as a mixture of soilwater end-members: a step toward second-generation acidification
models. J Hydrol 116:307–320
Constrained Nonlinear Optimization Algorithms. The MathWorks Inc. Web. Accessed 6 Aug. 2014.
http://www.mathworks.com/help/optim/ug/constrained-nonlinear-optimization-algorithms.html
Guimaraes AR, Schwartz P, Prakash MR, Carr CA, Berger UV, Jenkins BG, Coyle JT, González
RG (1995) Quantitative in vivo 1H nuclear magnetic resonance spectroscopic imaging of
neuronal loss in Rat brain. Neuroscience 69(4):1095–1101
Haseler LJ, Arcinue E, Danielsen ER, Bluml S, Ross BD (1997) Evidence from proton magnetic
resonance spectroscopy for a metabolic cascade of neuronal damage in shaken baby syndrome.
Pediatrics 99(1):4–14
How GlobalSearch and MultiStart Work. The MathWorks Inc. Web. Accessed 6 Aug. 2014.
http://www.mathworks.com/help/gads/how-globalsearch-and-multistart-work.html#bsc9eec
Hunter MA, Takane Y (2002) Constrained principal component analysis: various applications. J
Educ Behav Stat 27(2):105–145
Kirov I, Flaysher L, Fleysher R, Patil V, Liu S, Gonen O (2008) The age dependence of regional
proton metabolites T2 relaxation times in the human brain at 3 Tesla. Magn Reson Med
60(4):790–795
Kostović I, Jovanov-Milošević N (2006) The development of cerebral connections during the first
20–45 weeks’ gestation. Assess Brain Funct Perinat Period 11(6):415–422
Kreis R, Ernst T, Ross BD (1993) Development of the human brain: in vivo quantification of
metabolite and water content with proton magnetic resonance spectroscopy. Magn Reson Med
30(4):424–437
Kreis R, Hofmann L, Kuhlmann B, Boesch C, Bossi E, Hüppi PS (2002) Brain metabolite
composition during early human brain development as measured by quantitative in vivo 1H
magnetic resonance spectroscopy. Magn Reson Med 48(6):949–958
Manganas L, Zhang X, Li Y, Hazel RD, Smith SD, Wagshul ME, Henn F, Benveniste H, Djurić PM,
Enikolopov G, Maletić-Svatić M (2007) Magnetic resonance spectroscopy identifies neural
progenitor cells in the live human brain. Science 318(5852):980–985
Mangia S, Giove F, DiNuzzo M (2012) Metabolic pathways and activity-dependent modulation of
glutamate concentration in the human brain. Neurochem Res 37(11):2544–2561
Pouwels P, Brockmann K, Kruse B, Wilken B, Wick M, Hanefeld F, Frahm J (1999) Regional age
dependence of human brain metabolites from infancy to adulthood as detected by quantitative
localized proton MRS. Pediatr Res 46:474–474
Snieder R, Trampert J (1999) Inverse problems in geophysics. In: Wirgin A (ed) Wavefield inversion.
International centre for mechanical sciences, vol 398. Springer, New York, pp 119–190
Takane Y, Hunter MA (2001) Constrained principal component analysis: a comprehensive theory.
Appl Algebra Eng Commun Comput 12(5):391–419
Urenjak J, Williams SR, Gadian DG, Noble M (1992) Specific expression of N-acetylaspartate
in neurons, oligodendrocyte-type-2 astrocyte progenitors, and immature oligodendrocytes in
vitro. J Neurochem 59(1):55–61
Urenjak J, Williams SR, Gadian DG, Noble M (1993) Proton nuclear magnetic resonance
spectroscopy unambiguously identifies different neural cell types. J Neurosci 13(3):981–989
van Doormaal P, Meiners L, ter Horst H, van der Veere C, Sijens P (2012) The prognostic value
of multivoxel magnetic resonance spectroscopy determined metabolite levels in white and grey
matter brain tissue for adverse outcome in term newborns following perinatal asphyxia. Eur J
Radiol 22(4):772–778
Waltz RA, Morales JL, Nocedal J, Orban D (2006) An interior algorithm for nonlinear optimization
that combines line search and trust region steps. Math Program 107(3):391–408
Wang ZJ, Vigneron DB, Miller SP, Mukherjee P, Charlton NN, Lu Y, Barkovich AJ (2008) Brain
metabolite levels assessed by lactate-edited MR spectroscopy in premature neonates with and
without pentobarbital sedation. Am J Neuroradiol 29:798–801
Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Graph Stat
15(2):265–286
Chapter 9
Exact Regularization, and Its Connections to
Normal Cone Identity and Weak Sharp Minima
in Nonlinear Programming

S. Deng

Abstract The regularization of a nonlinear program is exact if all solutions of the
regularized problem are also solutions of the original problem for all values of
the regularization parameter below some positive threshold. In Deng (Pac J Optim
8(1):27–32, 2012), we show that, for a given nonlinear program, the regularization
is exact if and only if the Lagrangian function of a certain selection problem
has a saddle point, and the regularization parameter threshold is inversely related
to the Lagrange multiplier associated with the saddle point. The results in Deng
(Pac J Optim 8(1):27–32, 2012) not only provide a fresh perspective on exact
regularization but also extend the main results in Friedlander and Tseng (SIAM
J Optim 18:1326–1350, 2007) on a characterization of exact regularization of a
convex program to that of a nonlinear (not necessarily convex) program. In this
paper, we examine the interconnections among exact regularization, the normal cone
identity, and the existence of a weak sharp minimum for certain associated nonlinear
programs. Along the way, we illustrate by examples how to obtain new results
and how to reproduce many existing results from a fresh perspective.

Keywords Saddle point • Lagrangian function • Exact regularization • Normal cone
identity • Weak sharp minima

S. Deng
Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL, USA
e-mail: deng@math.niu.edu

© Springer-Verlag Berlin Heidelberg 2015
H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_9

9.1 Introduction

To understand the basic ideas of exact regularization, let us begin by considering the
following examples.
Example 9.1. Given an underdetermined system of linear equations

$$Ax = b,$$

where $A$ is an $m \times n$ matrix, $x \in \mathbb{R}^n$, and $b \in \mathbb{R}^m$ with $m < n$, we assume that
the system is consistent. In signal processing, the following linear programming
problem

$$\min \|x\|_1 \quad \text{subject to} \quad Ax = b \tag{9.1}$$

has been widely studied.


Why? The problem (9.1) is closely connected to the problem of finding a sparse
signal representation,

$$\min \|x\|_0 \quad \text{subject to} \quad Ax = b,$$

where $\|x\|_0$ denotes the number of nonzero components in $x$.
As a linear programming problem, (9.1) has multiple solutions in general. It
is desirable to select a solution with additional properties, e.g. a least two-norm
solution.
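The selection of a least two-norm solution can be illustrated on a toy instance. The sketch below assumes the hypothetical system $x_1 + x_2 = 1$ (that is, $A = [1\ 1]$, $b = 1$, chosen here for illustration and not taken from the text). Every point $(t, 1-t)$ with $0 \le t \le 1$ attains the minimal 1-norm, and a grid scan picks the one of least two-norm among them. All function names are ours.

```python
# Toy instance (an assumption for illustration): A = [1, 1], b = 1.
# Feasible points are x = (t, 1 - t); we scan t on a uniform grid.

def l1_norm(x):
    return sum(abs(xi) for xi in x)

def l2_norm_sq(x):
    return sum(xi * xi for xi in x)

def feasible_points(t_min=-1.0, t_max=2.0, steps=3000):
    """Grid of points on the line x1 + x2 = 1."""
    pts = []
    for i in range(steps + 1):
        t = t_min + (t_max - t_min) * i / steps
        pts.append((t, 1.0 - t))
    return pts

def l1_minimizers(points, tol=1e-9):
    """All grid points whose 1-norm is within tol of the smallest 1-norm."""
    best = min(l1_norm(p) for p in points)
    return [p for p in points if l1_norm(p) <= best + tol]

def least_two_norm(points):
    """Among the given points, return the one with the smallest 2-norm."""
    return min(points, key=l2_norm_sq)

if __name__ == "__main__":
    pts = feasible_points()
    sols = l1_minimizers(pts)
    print(len(sols), "grid points attain the minimal 1-norm")
    print("least two-norm solution:", least_two_norm(sols))
```

On this instance the 1-norm alone does not single out a solution, while the added two-norm criterion selects the midpoint of the solution segment.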
Example 9.2. The forward model in many data acquisition scenarios can be
formulated as follows:

$$y = Ax_0 + w,$$

where $y \in \mathbb{R}^m$ are the observations, $x_0 \in \mathbb{R}^n$ is the unknown signal to recover, $w$ is
the noise, and $A$ is a linear operator which maps the signal domain $\mathbb{R}^n$ into the
observation domain $\mathbb{R}^m$ with $m \le n$.
Even when $m = n$, $A$ is in general ill-conditioned or singular. This makes the
linear inverse problem of finding a good approximation of $x_0$ ill-posed. A useful
way to deal with this challenge is to solve the following optimization problem
parametrized by $\delta$:

$$\min_{x \in \mathbb{R}^n} \tfrac{1}{2}\|y - Ax\|_2^2 + \delta R(x), \tag{9.2}$$

where $R$ is an appropriate regularization term through which some regularity is
enforced on the recovered signal, and $\delta > 0$.
For noiseless observations, i.e. $w = 0$, by letting $\delta \downarrow 0$, we see that (9.2) is closely
related to the following constrained minimization problem

$$\min R(x) \quad \text{subject to} \quad Ax = y. \tag{9.3}$$

The role of problem (9.3) is to analyze problem (9.2) through a certain regularization
process (see Definition 9.1 for more details).
The technique of regularization is a common approach used to solve an ill-posed
nonlinear optimization problem with non-unique solutions by constructing
a related problem whose solution is well behaved and deviates only slightly from a
solution of the original problem. Deviations from solutions of the original problem
are generally accepted as a trade-off for obtaining solutions with other desirable
properties. However, it would be more desirable if solutions of the regularized
problem were also solutions of the original problem. Friedlander and Tseng (2007)
presented a systematic study of exact regularization of a convex program. The term
exact regularization was coined in Friedlander and Tseng (2007): the regularization
is exact if the solutions of the perturbed problems are also solutions of the original
problem for all values of the penalty parameter below some positive threshold value.
In Deng (2012), we demonstrate that the main results of Friedlander and Tseng (2007)
can be extended to non-convex programs, thereby significantly expanding the
application domain of this exact regularization technique. In this paper, for convex
programs, we examine the interconnections among exact regularization, the normal cone
identity, and the existence of a weak sharp minimum for certain associated nonlinear
programs. Specifically, we show that strongly exact regularization is equivalent to
the normal cone identity, and that the existence of weak sharp minima implies the
normal cone identity. Along the way, we illustrate by examples how to obtain new
results and how to reproduce many existing results from a fresh perspective.
The notation used in this note is standard. See e.g. Rockafellar and Wets (1998).

9.2 Review of Main Results in Deng (2012)

The examples in the Introduction suggest considering the following general
nonlinear program

$$(P) \qquad \min\ g(x) \quad \text{s.t.} \quad x \in C,$$

where $g : \mathbb{R}^n \to \mathbb{R}$ is a continuous function, and $C$ is a closed set in $\mathbb{R}^n$. Let $S$
be the set of all optimal solutions of (P). We suppose that $S$ is
nonempty, and denote the optimal value of (P) by $p^*$. When (P) has multiple solutions
or is very sensitive to data perturbations, a popular way to regularize the problem
is to modify the objective function by adding a new function $f$. This leads to the
following regularized problem

$$(P(\delta)) \qquad \min\ g(x) + \delta f(x) \quad \text{s.t.} \quad x \in C,$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuous function and $\delta$ is a nonnegative regularization
parameter. Let $S_\delta$ be the set of optimal solutions of $(P(\delta))$. For Example 9.1, we have
$g(x) = 0$, $f(x) = \|x\|_1$, and $\delta = 1$, and for Example 9.2, we have $g(x) = \tfrac{1}{2}\|y - Ax\|_2^2$
and $f(x) = R(x)$. The regularization function $f$ may be nonlinear, non-convex or
non-differentiable. A popular choice of $f$, commonly known as Tikhonov regularization,
is $\|x\|_2^2$, which can be used to select a least two-norm solution. Another popular
choice is $\ell_1$ regularization with $f(x) = \|x\|_1$. More examples of $f$ and applications of
exact regularization of convex programs can be found in Friedlander and Tseng
(2007).
Now we can introduce a key definition of the paper.
Definition 9.1. For (P) with a given function $f$, we say that regularization is exact
if the solutions of $(P(\delta))$ are also solutions of (P) for all values of $\delta$ below some
positive threshold value $\bar\delta$; that is, $S_\delta \subset S$ for all $\delta \le \bar\delta$.

As in Deng (2012) and Friedlander and Tseng (2007), a key to the analysis is a
related nonlinear program that selects solutions of (P) of the least $f$ value:

$$(Q) \qquad \min\ f(x) \quad \text{s.t.} \quad x \in C,\ g(x) \le p^*,$$

where $p^*$ denotes the optimal value of (P). Let $S_Q$ be the set of optimal solutions of
(Q), and suppose that $S_Q \ne \emptyset$. Let the Lagrangian function of (Q) be

$$L(x, y) = f(x) + y\,(g(x) - p^*)$$

for $x \in C$ and $y \ge 0$. We say that a pair $(\bar x, \bar y) \in C \times \mathbb{R}_+$ gives a saddle
point of the Lagrangian $L$ on $C \times \mathbb{R}_+$ if

$$L(\bar x, y) \le L(\bar x, \bar y) \le L(x, \bar y) \quad \forall x \in C \subset \mathbb{R}^n,\ \forall y \in \mathbb{R}_+.$$

Problem (Q) may not have a Lagrange multiplier even for the convex case, as
illustrated by the following example.
Example 9.3. Let $g(x) = x^2$, $C = \mathbb{R}$, and $f(x) = x$. Then $S = \operatorname{argmin}_{x \in C}\, g = \{0\}$.
For $L(x, y) = x + y x^2$, there are no saddle points for $L$ over $\mathbb{R} \times \mathbb{R}_+$. In fact, for
$y \ge 0$,

$$\inf_{x \in C} L(x, y) = \begin{cases} -\infty & \text{if } y = 0, \\ -\dfrac{1}{4y} & \text{if } y > 0. \end{cases}$$
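The closed form in Example 9.3 can be checked numerically. The following is a minimal sketch (function names and grid sizes are ours): it minimizes $x + yx^2$ on a grid for a few values of $y > 0$ and compares the result with $-1/(4y)$. It also records that the regularized objective $x^2 + \delta x$ has the minimizer $-\delta/2$, which differs from the unique solution $0$ of (P) for every $\delta > 0$, consistent with the absence of a saddle point.

```python
def lagrangian(x, y):
    """L(x, y) = f(x) + y * (g(x) - p*) with f(x) = x, g(x) = x**2, p* = 0."""
    return x + y * x * x

def grid_inf(y, radius=100.0, steps=200_000):
    """Smallest value of L(., y) on a uniform grid over [-radius, radius]."""
    best = float("inf")
    for i in range(steps + 1):
        x = -radius + 2.0 * radius * i / steps
        best = min(best, lagrangian(x, y))
    return best

def regularized_minimizer(delta):
    """Exact minimizer of g(x) + delta * f(x) = x**2 + delta * x over R."""
    return -delta / 2.0

if __name__ == "__main__":
    for y in (0.5, 1.0, 4.0):
        print(y, grid_inf(y), -1.0 / (4.0 * y))
    # With y = 0 the Lagrangian is just x, which is unbounded below;
    # on the grid its smallest value is simply -radius.
    print(grid_inf(0.0))
    print([regularized_minimizer(d) for d in (1.0, 0.1, 0.01)])
```

The grid search is only a sanity check on a bounded interval; the unboundedness for $y = 0$ shows up as a value that keeps decreasing as the radius grows.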

A saddle point characterization for (Q) is given in Deng (2012).
Note that this characterization is not true for standard nonlinear programs. A
characterization for standard nonlinear programs can be found in Rockafellar (1993).
Theorem 9.1 (Theorem 3 of Deng (2012)). For problem (Q), a pair $(\bar x, \bar y) \in
C \times \mathbb{R}_+$ is a saddle point of the Lagrangian $L$ if and only if the pair satisfies the
conditions:
(1) $\bar x \in S$;
(2) $\bar x$ is a minimizer of $L(\cdot, \bar y)$ over $C$.
In particular, $\bar x$ is an optimal solution of (Q).
If (Q) is a convex program, then the existence of Lagrange multipliers for (Q) is
equivalent to the existence of a saddle point for $L$, and the set of Lagrange multipliers
is the same for any solution of (Q). Hence Theorem 9.1 generalizes one of the main
results in Friedlander and Tseng (2007).
The following theorem generalizes the main results (Theorem 2.1) of Friedlander
and Tseng (2007) on exact regularization to the non-convex case.
Theorem 9.2 (Theorem 6 of Deng (2012)). Consider problems (P), (Q), and
$(P(\delta))$. Then the following statements are true.
(a) For any $\delta > 0$, $S \cap S_\delta \subset S_Q$.
(b) If there exists a saddle point $(\bar x, \bar y)$ of $L$ for (Q) with $\bar x \in S_Q$, then $S \cap S_\delta = S_Q$
for all $\delta \in (0, 1/\bar y]$. Here we use the convention $1/\bar y = +\infty$ when $\bar y = 0$.
(c) If there exists $\bar\delta > 0$ such that $S \cap S_{\bar\delta} \ne \emptyset$, then $(\bar x, 1/\bar\delta)$ is a saddle point of $L$
for (Q) with any $\bar x \in S \cap S_\delta = S_Q$ for all $\delta \in (0, \bar\delta]$.
(d) If there exists $\bar\delta > 0$ such that $S \cap S_{\bar\delta} \ne \emptyset$, then $S_\delta \subset S$ for all $\delta \in (0, \bar\delta)$.

A direct consequence is the following characterization of exact regularization in
terms of a saddle point of $L$.
Corollary 9.3. If there is some $\bar y > 0$ such that $(\bar x, \bar y)$ is a saddle point of $L$, then
$\bar x \in S_\delta$ where $\delta = 1/\bar y$. Conversely, if $S_\delta \cap S \ne \emptyset$, then for any $\bar x \in S_\delta \cap S$, $(\bar x, \bar y)$ is a
saddle point of $L$ where $\bar y = 1/\delta$.
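The inverse relation between the regularization threshold and the multiplier can be seen on a one-dimensional toy problem. The sketch below is an illustration under our own assumptions, not an example from the text: $g(x) = \max\{|x| - 1, 0\}$, $f(x) = (x-2)^2$ and $C = [-3, 3]$, so that $p^* = 0$, $S = [-1, 1]$ and $S_Q = \{1\}$. A short calculation gives the smallest saddle-point multiplier $\bar y = 2$, and the grid search is meant to confirm that the regularized solution stays at $x = 1$ for $\delta \le 1/2$ and leaves $S$ for larger $\delta$.

```python
def g(x):
    """Original objective: zero exactly on S = [-1, 1]."""
    return max(abs(x) - 1.0, 0.0)

def f(x):
    """Regularizer; over S it is minimized at x = 1, so S_Q = {1}."""
    return (x - 2.0) ** 2

def grid(lo=-3.0, hi=3.0, steps=6000):
    """Uniform grid on C = [lo, hi]."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]

def argmin_on_grid(objective, points):
    return min(points, key=objective)

def regularized_solution(delta, points):
    """Grid minimizer of g + delta * f over C."""
    return argmin_on_grid(lambda x: g(x) + delta * f(x), points)

def lagrangian_minimizer(y, points):
    """Grid minimizer of L(x, y) = f(x) + y * (g(x) - p*), with p* = 0."""
    return argmin_on_grid(lambda x: f(x) + y * g(x), points)

if __name__ == "__main__":
    pts = grid()
    for delta in (0.1, 0.5, 0.6, 1.0):
        x = regularized_solution(delta, pts)
        print("delta =", delta, "solution =", x, "in S:", g(x) == 0.0)
    for y in (1.0, 2.0, 5.0):
        print("y =", y, "minimizer of L(., y) =", lagrangian_minimizer(y, pts))
```

In this toy problem $x = 1$ minimizes $L(\cdot, y)$ exactly when $y \ge 2$, which matches the threshold $1/\bar y = 1/2$ in Theorem 9.2(b).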
Example 9.4. Let $g(x_1, x_2) = \max\{-x_1 + x_2, 0\}$, $C = \{(x_1, x_2) \mid x_1 \ge 0,\ x_2 \ge
0,\ x_1 + x_2 \le 4\}$, and $f(x_1, x_2) = -(x_1 - 4)^2 - (x_2 - 4)^2$. Then the solution set $S$ of (P)
is the convex hull of the points $(0, 0)$, $(4, 0)$ and $(2, 2)$, and $p^* = 0$. Since $f$ is
a concave function and $S$ is convex, the optimal value of (Q) is achieved at an
extreme point of $S$. An easy computation shows
that $S_Q = \{(0, 0)\}$. For any $\bar y \ge 0$, $(0, 0)$ is a minimizer of $L(\cdot, \bar y)$ over $C$, where
$L(x, \bar y) = f(x) + \bar y\,(g(x) - p^*)$. We conclude that $((0, 0), \bar y)$ is a saddle point of $L$ by
Theorem 9.1.
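The claims of Example 9.4 can be checked on a grid. The sketch below reads $g$ as $\max\{-x_1 + x_2, 0\}$ and $f$ as $-(x_1-4)^2 - (x_2-4)^2$, which is the reading under which $S$ is the stated triangle; this reading, the grid spacing and all function names are assumptions of the sketch.

```python
def g(x1, x2):
    return max(-x1 + x2, 0.0)

def f(x1, x2):
    return -((x1 - 4.0) ** 2) - (x2 - 4.0) ** 2

def grid_C(n=80):
    """Grid points of C = {x1 >= 0, x2 >= 0, x1 + x2 <= 4} with spacing 4/n."""
    pts = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            pts.append((4.0 * i / n, 4.0 * j / n))
    return pts

def solution_set(points, tol=1e-12):
    """Grid points of S: minimizers of g over C (here p* = 0)."""
    p_star = min(g(*p) for p in points)
    return [p for p in points if g(*p) <= p_star + tol]

def selected_solutions(points, tol=1e-12):
    """Grid points of S_Q: minimizers of f over S."""
    S = solution_set(points)
    best = min(f(*p) for p in S)
    return [p for p in S if f(*p) <= best + tol]

def lagrangian_minimizers(y, points, tol=1e-12):
    """Grid minimizers of L(x, y) = f(x) + y * (g(x) - p*) over C, p* = 0."""
    values = [(f(*p) + y * g(*p), p) for p in points]
    best = min(v for v, _ in values)
    return [p for v, p in values if v <= best + tol]

if __name__ == "__main__":
    pts = grid_C()
    print("S_Q on the grid:", selected_solutions(pts))
    for y in (0.0, 1.0, 10.0):
        print("y =", y, "minimizers of L(., y):", lagrangian_minimizers(y, pts))
```

Since $g \ge 0$ on $C$ and $f$ is smallest at the point of $C$ farthest from $(4, 4)$, the origin minimizes $L(\cdot, \bar y)$ for every $\bar y \ge 0$, which is what the grid search is meant to reflect.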

9.3 Strongly Exact Regularization, Normal Cone Identity,
and Weak Sharp Minima

In this section, we assume that $C$ is a closed convex set, and $f$, $g$ are finite convex
functions. We begin with two definitions.
Definition 9.2. We say that strongly exact regularization holds for (P) if for any
$f$ with $S_Q \ne \emptyset$, (P) is exactly regularized with respect to $f$ in the sense of
Definition 9.1.
Definition 9.3. We say that the normal cone identity holds for (P) if, for each $x \in S$,

$$N_S(x) = N_C(x) + \mathbb{R}_+ \partial g(x),$$

where $\partial g(x)$ is the subdifferential of $g$ at $x$, and $N_S(x)$, $N_C(x)$ are the normal cones of $S$
and $C$ at $x$, respectively.
The equivalence of strongly exact regularization and the normal cone identity
follows.
Theorem 9.4. For (P), strongly exact regularization holds if and only if the normal
cone identity holds for (P).
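Proof. ($\Rightarrow$) Let $\bar x \in S$ be given. For any $v \in N_S(\bar x)$, let $f(x) = \tfrac{1}{2}\|x - (v + \bar x)\|^2$.
Then $\bar x$ is the unique minimizer of $f$ over $S$, and $\nabla f(\bar x) = -v$. Since (P) is exactly
regularized with respect to $f$, by Theorem 9.2, there is some $\bar y \ge 0$ such that $(\bar x, \bar y)$ is
a saddle point of $L$. So

$$f(x) + \bar y\,(g(x) - p^*) \ge f(\bar x) \quad \forall x \in C.$$

Hence,

$$-\nabla f(\bar x) \in \bar y\, \partial g(\bar x) + N_C(\bar x);$$

that is, $v \in N_C(\bar x) + \mathbb{R}_+ \partial g(\bar x)$, which implies that $N_S(\bar x) \subset N_C(\bar x) + \mathbb{R}_+ \partial g(\bar x)$. On
the other hand, as $S = \{x \mid g(x) \le p^*\} \cap C$, one always has

$$N_S(\bar x) \supset N_{\{x \mid g(x) \le p^*\}}(\bar x) + N_C(\bar x) \supset \mathbb{R}_+ \partial g(\bar x) + N_C(\bar x).$$

Thus, the inclusion $N_S(\bar x) \supset N_C(\bar x) + \mathbb{R}_+ \partial g(\bar x)$ always holds. Therefore, the normal
cone identity holds.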
($\Leftarrow$) Let $f$ be a finite convex function. Suppose that $S_Q \ne \emptyset$. Then for $\bar x \in S_Q$,

$$0 \in \partial f(\bar x) + N_S(\bar x).$$

As $N_S(\bar x) = \mathbb{R}_+ \partial g(\bar x) + N_C(\bar x)$,

$$0 \in \partial f(\bar x) + \mathbb{R}_+ \partial g(\bar x) + N_C(\bar x).$$

So there are some $v \in \partial f(\bar x)$, $w \in \partial g(\bar x)$, and $\bar y \ge 0$ such that $0 \in v + \bar y w + N_C(\bar x)$.
Hence $\langle v + \bar y w, x - \bar x \rangle \ge 0$ for all $x \in C$. Since $f$ and $g$ are convex,

$$f(x) + \bar y g(x) - (f(\bar x) + \bar y g(\bar x)) \ge \langle v + \bar y w, x - \bar x \rangle \ge 0$$

for all $x \in C$. It follows that, for each $x \in C$,

$$L(x, \bar y) - L(\bar x, \bar y) = f(x) + \bar y\,(g(x) - p^*) - f(\bar x) - \bar y\,(g(\bar x) - p^*) \ge 0.$$

Since $g(\bar x) = p^*$, for any $y \in \mathbb{R}_+$,

$$L(\bar x, y) = f(\bar x) + y\,(g(\bar x) - p^*) = f(\bar x) = f(\bar x) + \bar y\,(g(\bar x) - p^*) = L(\bar x, \bar y).$$

This shows that $(\bar x, \bar y)$ is a saddle point of $L$. By Theorem 9.2, exact regularization
holds for (P) with respect to $f$. Since $f$ is any finite convex function, strongly exact
regularization holds for (P). This completes the proof.
For (P), the normal cone identity property is closely related to the notion
of weak sharp minima, which has found a number of important applications in
mathematical programming. See Burke and Ferris (1993), Burke and Deng
(2002) and the references therein. Recall from Burke and Ferris (1993) and Burke and Deng
(2002) that $S$ is said to be a set of weak sharp minima for $g$ over the set $C$ with
modulus $\alpha > 0$ if

$$\alpha\, \mathrm{dist}(x, S) \le g(x) - p^* \quad \forall x \in C,$$

where $\mathrm{dist}(x, S)$ is the Euclidean distance between $x$ and $S$.
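The weak sharp minima inequality can be probed numerically by looking at the ratio $(g(x) - p^*)/\mathrm{dist}(x, S)$ over sample points outside $S$. The sketch below uses two one-dimensional functions of our own choosing (not from the text), both with $S = \{0\}$ and $p^* = 0$: for $g(x) = |x|$ the ratio is constant, while for $g(x) = x^2$ it shrinks to zero near $S$, so no positive modulus can exist.

```python
def sharpness_ratio(g, dist_to_S, points, p_star=0.0):
    """Smallest value of (g(x) - p*) / dist(x, S) over points outside S."""
    ratios = []
    for x in points:
        d = dist_to_S(x)
        if d > 0.0:
            ratios.append((g(x) - p_star) / d)
    return min(ratios)

def grid(lo, hi, steps):
    """Uniform grid on [lo, hi]."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]

if __name__ == "__main__":
    pts = grid(-2.0, 2.0, 4000)
    # g(x) = |x|, S = {0}: the ratio is 1 everywhere, so alpha = 1 works.
    print(sharpness_ratio(abs, abs, pts))
    # g(x) = x**2, S = {0}: the ratio equals |x| and shrinks near S,
    # so the estimate keeps decreasing as the grid is refined.
    print(sharpness_ratio(lambda x: x * x, abs, pts))
    print(sharpness_ratio(lambda x: x * x, abs, grid(-2.0, 2.0, 40000)))
```

A grid estimate of this kind can only suggest, not prove, the presence or absence of a positive modulus.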


Theorem 9.5. If $S$ is a set of weak sharp minima for $g$ over $C$ with modulus $\alpha > 0$,
then the normal cone identity holds for (P).
Proof. Since $g$ is a finite convex function,

$$\partial (g + \delta_C)(x) = \partial g(x) + N_C(x),$$

where $\delta_C$ is the indicator function of the set $C$. By Part 2 of Theorem 2.3 in Burke
and Deng (2002), for any $x \in S$,

$$\alpha B \cap N_S(x) \subset \partial g(x) + N_C(x),$$

where $B$ is the Euclidean unit ball. Then

$$N_S(x) = \mathbb{R}_+ (\alpha B \cap N_S(x)) \subset \mathbb{R}_+ (\partial g(x) + N_C(x)) \subset \mathbb{R}_+ \partial g(x) + N_C(x).$$

On the other hand, as $S = \{z \mid g(z) \le p^*\} \cap C$, one always has

$$N_S(x) \supset N_{\{z \mid g(z) \le p^*\}}(x) + N_C(x) \supset \mathbb{R}_+ \partial g(x) + N_C(x).$$

This shows that $N_S(x) = \mathbb{R}_+ \partial g(x) + N_C(x)$; that is, the normal cone identity holds.
This completes the proof.
We conclude the paper with an example which shows that the normal cone
identity holding for (P) does not imply that $S$ is a set of weak sharp minima for
$g$ over $C$ in general.
Example 9.5. For (P), let $g : \mathbb{R}^3 \to \mathbb{R}$ be given by $g(x) = x_3$, and
$C = [0, 1]^3 \cap \left( \bigcap_{k=2}^{\infty} C^k \right)$, where $C^k = \{x \in \mathbb{R}^3 \mid x_1 - (k-1) x_2 - k^2 x_3 \le 1/k\}$. Then the
optimal value $p^* = 0$ and $S = \{x \in C \mid x_3 = 0\}$. It was shown in the Appendix of
Friedlander and Tseng (2007) that $S$ is not a set of weak sharp minima for $g$ over $C$,
and that for any given $z \in \mathbb{R}^3$ and $f(x) = \tfrac{1}{2}\|x - z\|^2$, exact regularization holds for
(P) with respect to $f$. The proof of the necessity part of Theorem 9.4 shows that this
implies that the normal cone identity holds; that is, $N_S(x) = N_C(x) + \mathbb{R}_+ \partial g(x)$ for each
$x \in S$.

Acknowledgements We wish to thank the referees for their useful comments, which helped us
improve the presentation of the paper.

References

Burke JV, Ferris M (1993) Weak sharp minima in mathematical programming. SIAM J Control
Optim 31:1340–1359
Burke JV, Deng S (2002) Weak sharp minima revisited, part I: basic theory. Control Cybern
31:439–469
Deng S (2012) A saddle point characterization of exact regularization of non-convex programs.
Pac J Optim 8(1):27–32
Friedlander M, Tseng P (2007) Exact regularization of convex programs. SIAM J Optim
18:1326–1350
Rockafellar RT (1993) Lagrange multipliers and optimality. SIAM Rev 35:183–238
Rockafellar RT, Wets R (1998) Variational analysis. Springer, Berlin/Heidelberg
Chapter 10
The Worst-Case DFT Filter Bank Design with
Subchannel Variations

Lin Jiang, Changzhi Wu, Xiangyu Wang, and Kok Lay Teo

Abstract In this paper, we consider an optimal design of a DFT filter bank
subject to subchannel variation constraints. The design problem is formulated as
a minimax optimization problem. By exploiting the properties of this minimax
optimization problem, we show that it is equivalent to a semi-infinite optimization
problem in which the continuous inequality constraints are only with respect to
frequency. Then, a computational scheme is developed to solve such a semi-infinite
optimization problem. Simulation results show that, for a fixed distortion level, the
aliasing level between different subbands is significantly reduced, in some cases up
to 28 dB, when compared with that obtained by the bi-iterative optimization method
without consideration of the subchannel variations.

L. Jiang
School of Mathematics, Anhui Normal University, Wuhu, 241000, China
C. Wu • X. Wang
Australasian Joint Research Centre for Building Information Modelling, School of Built
Environment, Curtin University, Perth, WA 6845, Australia
K.L. Teo
School of Mathematics and Statistics, Curtin University, Perth, WA, 6845, Australia

© Springer-Verlag Berlin Heidelberg 2015
H. Xu et al. (eds.), Optimization Methods, Theory and Applications,
DOI 10.1007/978-3-662-47044-2_10

10.1 Introduction

Filter banks play an important role in a wide range of signal processing applications
such as echo cancellation (Kellermann 1988), microphone arrays (de Haan et al.
2003), speech enhancement and equalization (Vaidyanathan 1993), as well as image
and speech processing. Owing to their wide range of applications, they have been
extensively studied in the past two decades (Dam et al. 2005; de Haan et al. 2001,
2003; Harteneck et al. 1999; Kellermann 1988; Kha et al. 2009; Mansour 2007;
Nguyen 1994; Sturm 1999; Vaidyanathan 1993; Wilbur et al. 2004; Wu and Teo
2010, 2011; Wu et al. 2008, 2013; Yiu et al. 2004; Zhang et al. 2008). In multirate
digital signal processing, an analysis filter is used to divide the signal to be processed
into subbands. The subband signals are then decimated according to the new bandwidth
of the subbands. The decimation process causes aliasing of the subband signals. It is
possible to cancel this aliasing through the design of a synthesis filter bank in such
a way that the whole multirate chain yields no distortion; the total transfer function
is reduced to a simple delay. This is often referred to as the perfect reconstruction
(PR) property (Harteneck et al. 1999).
However, any filtering operation in the subbands will cause phase and amplitude
changes, thereby altering this property. Thus, aliasing may be caused in the
reconstructed output of the subband adaptive filter. To overcome this problem,
optimization methods are often used in the design of filter banks, where both
the aliasing effect and the distortion levels in the filter bank are optimized. In
de Haan et al. (2003), the design of a uniform discrete Fourier transform (DFT)
filter bank is solved by a two-step optimization problem. In the first step, the
analysis filter bank is designed in such a way that the aliasing terms in each subband
are minimized individually, contributing to minimal aliasing at the output without
aliasing cancellation. In the second step, the synthesis filter bank is designed to
match the analysis filter bank, where the analysis-synthesis response is optimized
while all aliasing terms in the output signal are individually suppressed, rather than
aiming at aliasing cancellation. Since the analysis and synthesis filter banks are
designed separately, this method does not produce a good result. To improve this
method, a bi-iterative method is used in Dam et al. (2005), where the design of
this filter bank is formulated as a constrained fourth order polynomial optimization
problem with continuous constraints. This optimization problem is hard to solve.
Thus, a bi-iterative computational scheme is incorporated to solve the formulated
optimization problem. More specifically, the analysis filter bank is fixed when
solving for the synthesis filter bank, while the synthesis filter bank is fixed when
solving for the analysis filter bank. In this way, only quadratic optimization problems
need to be solved during the design process of this filter bank. Since the
original optimization problem is a fourth order polynomial optimization problem,
this bi-iterative scheme does not provide a global optimal solution. To overcome
this problem, a global optimization method based on the filled function method is
introduced to solve the corresponding optimization problem in Wu et al. (2008).
By using the global method, better results are obtained. In Yiu et al. (2004), the
design problem has been formulated as a multicriteria optimization problem. Then,
nonlinear programming methods can be used to solve such an optimization problem.
In Wilbur et al. (2004), the design of a generalized DFT filter bank is formulated as a
cone program with a combination of linear, second-order, and semi-definite cones. It
is solved by using an existing convex optimization software package (Sturm 1999).
Although the DFT filter bank has been extensively studied, the actual filtering
operation in each frequency band has only been taken into limited consideration.
In this paper, we propose a new formulation which includes a filtering operation in
each subband in addition to the optimization and control of each individual criterion.
The aliasing effects for the filter bank are minimized subject to the constraints on
the distortion level for all the frequencies. The formulation is such that it includes a
term measuring the deviation from the nominal value in each subband. Compared
with the earlier formulations, our formulation is better suited for use in adaptive filters
or speech enhancement applications since the filter operation with distortion in each
subband has been taken into consideration. In this problem formulation, the aim
is to provide simultaneous optimization of both the analysis and synthesis filter
banks, while maintaining robustness against deviation from unity in each subband.
The advantage of this problem formulation is that it provides less overall distortion
when the subchannels are subject to distortions. This filter bank design problem is
formulated as a constrained minimax optimization problem.
The second contribution of this paper is that we prove that the formulated
constrained minimax optimization problem is equivalent to a semi-infinite
optimization problem where the continuous constraints are only with respect to
frequency. Although a minimax optimization problem is actually a semi-infinite
optimization problem, we cannot use available methods for solving semi-infinite
optimization to solve it directly. This is due to the fact that the semi-infinite
optimization problem is known to suffer from the curse of dimensionality (Lopez
and Still 2007). In this paper, we will show that this minimax optimization problem
is equivalent to a standard semi-infinite optimization problem with non-smooth
constraints. Then, an iterative computational scheme is developed to solve this
optimization problem. Some simulation examples are presented to illustrate the
method proposed. Simulation results show that, for a fixed distortion level, the
aliasing between different subbands is significantly reduced, in some cases up to
28 dB, when compared with that obtained by using the bi-iterative optimization
method developed in Dam et al. (2005), which does not take the subband variations
into consideration.

10.2 Analysis and Synthesis Filter Banks

Uniformly modulated filter banks are employed, where the filter banks are formed
by modulated versions of the analysis and synthesis prototype filters. Denote
$h = [h(0), \ldots, h(L_a - 1)]^T$ as the prototype filter of length $L_a$ for the analysis filter
bank with the corresponding transfer function $H(z) = h^T \phi_a(z)$, where
$\phi_a(z) = [1, z^{-1}, \ldots, z^{-(L_a - 1)}]^T$. Similarly, denote $g = [g(0), \ldots, g(L_s - 1)]^T$ as the prototype
filter of length $L_s$ for the synthesis filter bank with the transfer function
$G(z) = g^T \phi_s(z)$, where $\phi_s(z) = [1, z^{-1}, \ldots, z^{-(L_s - 1)}]^T$. For a system with $M$ subbands, the
subband filters $H_m(z)$ and $G_m(z)$, $0 \le m \le M - 1$, are obtained from the prototype
filters $H(z)$ and $G(z)$, respectively, as follows:

$$H_m(z) = H(z W_M^m) \quad \text{and} \quad G_m(z) = G(z W_M^m), \tag{10.1}$$

where $W_M = e^{-j 2\pi / M}$. A possible realization of an analysis and synthesis filter bank
is given in Fig. 10.1. The input signal $X(z)$ is filtered by the analysis filter $H_m(z)$ and
decimated by a factor $D$, $D \le M$, according to

$$X_m(z) = \frac{1}{D} \sum_{d=0}^{D-1} H(z^{1/D} W_M^m W_D^d)\, X(z^{1/D} W_D^d), \tag{10.2}$$

where $W_D = e^{-j 2\pi / D}$. Denote $\xi_m(z)$ as an application dependent filtering operation
for the $m$th subband. The output of the filtering operation for the $m$th subband then
becomes

$$Y_m(z) = \xi_m(z) X_m(z). \tag{10.3}$$

Fig. 10.1 Analysis and synthesis filter banks. The input $X(z)$ is split by the analysis filters
$H_0(z), \ldots, H_{M-1}(z)$; each branch is decimated by $D$ to give $X_m(z)$, processed to give $Y_m(z)$,
interpolated by $D$ and filtered by $G_m(z)$; the branches are summed to form $Y(z)$.
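The modulation in (10.1) can be sketched in a few lines. The code below is a minimal illustration and not the authors' implementation: it assumes the common convention $W_M = e^{-j2\pi/M}$, uses a made-up five-tap prototype, and all function names are ours. Substituting $z W_M^m$ for $z$ in an FIR transfer function multiplies the $n$th coefficient by $e^{j2\pi mn/M}$, which shifts the frequency response by $2\pi m/M$.

```python
import cmath
import math

def freq_response(coeffs, omega):
    """H(e^{j omega}) = sum_n coeffs[n] * e^{-j omega n} for an FIR filter."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs))

def modulated_filter(prototype, m, M):
    """Coefficients of H_m(z) = H(z * W_M**m), assuming W_M = exp(-2j*pi/M).

    Substituting z -> z * W_M**m into sum_n h(n) z^{-n} multiplies the n-th
    coefficient by W_M**(-m*n) = exp(2j*pi*m*n/M).
    """
    return [c * cmath.exp(2j * math.pi * m * n / M)
            for n, c in enumerate(prototype)]

def analysis_bank(prototype, M):
    """All M subband filters obtained from one prototype."""
    return [modulated_filter(prototype, m, M) for m in range(M)]

if __name__ == "__main__":
    h = [0.1, 0.25, 0.3, 0.25, 0.1]   # hypothetical low-pass prototype
    M = 4
    omega = 0.7
    for m, hm in enumerate(analysis_bank(h, M)):
        shifted = freq_response(h, omega - 2.0 * math.pi * m / M)
        # The difference should be at rounding-error level.
        print(m, abs(freq_response(hm, omega) - shifted))
```

The same helper applies to the synthesis prototype, since $G_m(z)$ is obtained from $G(z)$ by the same substitution.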

In the synthesis filter bank, the subband signals $Y_m(z)$ are interpolated by a factor $D$
and then added to form the output signal $Y(z)$ given by

$$Y(z) = \frac{1}{D} \sum_{d=0}^{D-1} X(z W_D^d) \sum_{m=0}^{M-1} \xi_m(z^D)\, H(z W_M^m W_D^d)\, G(z W_M^m). \tag{10.4}$$

The term $\sum_{m=0}^{M-1} \xi_m(z^D) H(z W_M^m W_D^d) G(z W_M^m)$ can be viewed as the transfer function
which contributes to the aliasing terms in the output signal for $1 \le d \le D - 1$ and to
the desired output signal for $d = 0$. Normally, the analysis and synthesis filter banks
are designed for the case $\xi_m(z^D) = 1$, as the filtering operation is unknown. However,
this performs poorly in real situations when $\xi_m(z^D)$ can take on arbitrary values.
Thus, the optimization should be performed subject to an allowable variation
$\xi_m(z^D) = 1 + \delta_m$, where $\delta_m$ is a random variation, and the design of this filter bank
should include this variation constraint. Let $\delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T$. We suppose
that the random vector $\delta$ is restricted to the following box constrained set

$$\mathcal{U} = \left\{ \delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T \in \mathbb{R}^M : |\delta_i| \le \varepsilon_i \right\},$$

where $\varepsilon_i$, $i = 0, \ldots, M - 1$, are small constants.

10.3 Worst-Case Prototype Filter Design

The objective is to optimize the analysis and synthesis prototype filters with respect
to both the aliasing power and the distortion for the filter bank. The aliasing power
for all the aliasing terms in (10.4) for a frequency $\omega \in [-\pi, \pi]$ is given by

$$A(\omega) = \frac{1}{D} \sum_{d=1}^{D-1} \sum_{m=0}^{M-1} \left| \xi_m(z^D)\, H(e^{j\omega} W_M^m W_D^d)\, G(e^{j\omega} W_M^m) \right|^2. \tag{10.5}$$

This can be rewritten in terms of the analysis and synthesis prototype FIR filter
coefficients as follows:

$$A(\omega) = \frac{1}{D} \sum_{d=1}^{D-1} \sum_{m=0}^{M-1} \left| \xi_m(z^D)\, h^T \Phi_{m,d}(e^{j\omega})\, g \right|^2, \tag{10.6}$$

where

$$\Phi_{m,d}(e^{j\omega}) = \phi_a(e^{j\omega} W_M^m W_D^d)\, \phi_s^T(e^{j\omega} W_M^m). \tag{10.7}$$

Thus, the total aliasing power for all the frequencies $\omega \in [-\pi, \pi]$ is defined as

$$\begin{aligned}
\gamma(h, g, \delta) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} A(\omega)\, d\omega \\
&= \frac{1}{2\pi D} \int_{-\pi}^{\pi} \sum_{d=1}^{D-1} \sum_{m=0}^{M-1} \left| \xi_m(z^D)\, h^T \Phi_{m,d}(e^{j\omega})\, g \right|^2 d\omega \\
&= \frac{1}{2\pi D} \int_{-\pi}^{\pi} \sum_{m=0}^{M-1} \sum_{d=1}^{D-1} \left| (1 + \delta_m)\, h^T \Phi_{m,d}(e^{j\omega})\, g \right|^2 d\omega,
\end{aligned} \tag{10.8}$$

and the analytical form of $\gamma(h, g, \delta)$ is given in Appendix 1.


However, we would also like to constrain the distortion so that the solution obtained
has a tight bound on distortion. To this end, denote by $\tau_d$ the desired total delay
of the filter bank. The desired frequency response of the total system is
$T_d(e^{j\omega}) = e^{-j\omega\tau_d}$. It follows from (10.4) that the transfer
function of the filter bank is
\[
T(z) = \frac{1}{D} \sum_{m=0}^{M-1} \xi_m(z^D)\, H(z W_M^m)\, G(z W_M^m) = h^T \Psi(z, \delta)\, g, \tag{10.9}
\]
where
\[
\Psi(z, \delta) = \frac{1}{D} \sum_{m=0}^{M-1} (1 + \delta_m)\, \phi_a(z W_M^m)\, \phi_s^T(z W_M^m). \tag{10.10}
\]

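Since the analysis and synthesis prototype filters are to be designed subject to an
allowable small distortion from a desired response, the worst-case scenario can be
formulated using the following minimax constraint on the total response of the filter
bank:
\[
\left| h^T \Psi(e^{j\omega}, \delta)\, g - T_d(e^{j\omega}) \right| \le \epsilon, \quad \forall \omega \in [-\pi, \pi],\ \forall \delta \in \mathcal{U}, \tag{10.11}
\]
where $\epsilon$ is a specified small error. To proceed further, the modulus
constraints (10.11) are replaced by constraints on their real and imaginary parts:
\[
h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g - \cos(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi],\ \forall \delta \in \mathcal{U}, \tag{10.12}
\]
\[
\cos(\tau_d \omega) - h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi],\ \forall \delta \in \mathcal{U}, \tag{10.13}
\]
\[
h^T \mathrm{Im}\{\Psi(e^{j\omega}, \delta)\}\, g + \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi],\ \forall \delta \in \mathcal{U}, \tag{10.14}
\]
\[
-h^T \mathrm{Im}\{\Psi(e^{j\omega}, \delta)\}\, g - \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi],\ \forall \delta \in \mathcal{U}, \tag{10.15}
\]
where $\mathrm{Re}\{z\}$ and $\mathrm{Im}\{z\}$ denote the real part and the imaginary
part of $z$, respectively. The optimal design for the worst-case scenario of this
filter bank can now be posed as the following optimization problem: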
Problem 10.1.
\[
\min_{h, g}\ \max_{\delta \in \mathcal{U}}\ \beta(h, g, \delta) \tag{10.16}
\]
subject to the constraints (10.12)–(10.15).
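
The step from the modulus constraint (10.11) to the four constraints (10.12)–(10.15) is a tightening rather than an exact restatement. A short check makes this explicit; the shorthand $e(\omega, \delta)$ for the complex error is introduced here only for illustration and is not part of the original notation.

\[
\begin{aligned}
e(\omega, \delta) &= h^T \Psi(e^{j\omega}, \delta)\, g - e^{-j\omega\tau_d}, \\
\mathrm{Re}\{e\} &= h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g - \cos(\tau_d \omega), \qquad
\mathrm{Im}\{e\} = h^T \mathrm{Im}\{\Psi(e^{j\omega}, \delta)\}\, g + \sin(\tau_d \omega), \\
|\mathrm{Re}\{e\}| &\le \epsilon/\sqrt{2},\ \ |\mathrm{Im}\{e\}| \le \epsilon/\sqrt{2}
\ \Longrightarrow\
|e| = \sqrt{\mathrm{Re}\{e\}^2 + \mathrm{Im}\{e\}^2} \le \sqrt{\epsilon^2/2 + \epsilon^2/2} = \epsilon .
\end{aligned}
\]

The converse does not hold in general, so any design that satisfies (10.12)–(10.15) also satisfies (10.11), but not the other way round.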

10.4 Solution Strategy

Because of the max operator $\max_{\delta \in \mathcal{U}}$, the cost function (10.16)
is non-smooth. Thus, Problem 10.1 cannot be solved directly by gradient-based
optimization methods. By introducing a new variable $\gamma$, Problem 10.1 can be
reformulated as the following semi-infinite optimization problem:
Problem 10.2.
\[
\min_{h, g, \gamma}\ \gamma
\]
subject to the constraints (10.12)–(10.15) and
\[
\beta(h, g, \delta) \le \gamma, \quad \forall \delta \in \mathcal{U}. \tag{10.17}
\]

As we can see, Problem 10.2 is a semi-infinite optimization problem in which the
argument $\delta$ is an $M$-dimensional vector. Thus, it is extremely difficult, if not
impossible, to solve Problem 10.2 using any available method for semi-infinite
optimization problems. In the following, we show that Problem 10.2 is equivalent to
a semi-infinite optimization problem in which the continuous inequality constraints
are only with respect to $\omega$. We first give some definitions and then obtain the
maximizer of $\beta(h, g, \delta)$ on $\mathcal{U}$ for each fixed $h$ and $g$.
Convex set: A nonempty set $B$ in $\mathbb{R}^n$ is said to be convex if
$\lambda x + (1 - \lambda) y \in B$ for any $x, y \in B$ and $0 \le \lambda \le 1$.
Convex combination: A convex combination of a finite number of points
$x_1, \ldots, x_k$ in $\mathbb{R}^n$ is a point of the form $\sum_{i=1}^{k} \lambda_i x_i$,
where $\lambda_i \ge 0$ and $\sum_{i=1}^{k} \lambda_i = 1$.
Convex hull: Let $B$ be a nonempty subset of $\mathbb{R}^n$. The convex hull of $B$,
which is denoted as $\mathrm{co}(B)$, is defined by
\[
\mathrm{co}(B) = \left\{ \sum_{i=1}^{k} \lambda_i x_i : x_i \in B,\ \lambda_i \ge 0,\ \text{and}\ \sum_{i=1}^{k} \lambda_i = 1,\ k = 1, 2, \ldots \right\}.
\]
Extreme point: Let $B$ be a convex set in $\mathbb{R}^n$. If $x$ is an extreme point of
$B$, then there do not exist two distinct points $y, z \ne x$ in $B$ such that $x$ can
be expressed as a convex combination of $y$ and $z$.
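Define
\[
\mathcal{D} = \left\{ \delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T \in \mathcal{U} : |\delta_i| = \varepsilon_i \right\}. \tag{10.18}
\]
Clearly, $\mathcal{D}$ consists of $2^M$ distinct points. Let these $2^M$ distinct points
be denoted as $\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}$. For the set $\mathcal{U}$, we
have the following theorems:
Theorem 10.1. $\mathcal{U}$ is a convex set and $\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}$
are extreme points of $\mathcal{U}$. Furthermore,
$\mathcal{U} = \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$.
Proof. See Appendix 2.
Theorem 10.2. For each given $h$ and $g$, the maximum of $\beta(h, g, \delta)$ in
$\mathcal{U}$ is attained at one of the $2^M$ extreme points
$\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}$.
Proof. See Appendix 2.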
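
Theorem 10.2 means that the worst case over the whole box can be found by evaluating the cost at the $2^M$ vertices only. The following is a minimal sketch of that idea, not the authors' implementation: the helper names `box_vertices` and `worst_case_over_box` are hypothetical, and the quadratic `quad` with its matrix `Q` and vector `q` is a made-up convex stand-in for the aliasing power.

```python
import itertools

def box_vertices(eps):
    """All 2**M vertices of the box {delta : |delta_i| <= eps_i}."""
    return [tuple(s * e for s, e in zip(signs, eps))
            for signs in itertools.product((-1.0, 1.0), repeat=len(eps))]

def worst_case_over_box(f, eps):
    """Maximize a convex function f over the box by checking its vertices only."""
    return max(((f(v), v) for v in box_vertices(eps)), key=lambda pair: pair[0])

# Hypothetical convex quadratic delta^T Q delta + q^T delta + 1 (Q is positive definite).
Q = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
q = [0.5, -1.0, 0.25]

def quad(d):
    n = len(d)
    return (sum(d[i] * Q[i][j] * d[j] for i in range(n) for j in range(n))
            + sum(q[i] * d[i] for i in range(n)) + 1.0)

eps = [0.1, 0.2, 0.3]
f_star, v_star = worst_case_over_box(quad, eps)
```

The number of vertices grows as $2^M$, which is 16 and 256 for the cases $M = 4$ and $M = 8$ considered in Sect. 10.5, so plain enumeration stays cheap there.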
By Theorem 10.2, it is clear that the constraint
\[
\beta(h, g, \delta) \le \gamma, \quad \forall \delta \in \mathcal{U},
\]
is satisfied if and only if the following $2^M$ constraints are fulfilled:
\[
\beta(h, g, \bar{\delta}_i) \le \gamma, \quad i = 1, \ldots, 2^M. \tag{10.19}
\]

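For notational simplicity, denote
\[
\psi_{r,m}(\omega, h, g) = \mathrm{Re}\left\{ h^T \phi_a(e^{j\omega} W_M^m)\, \phi_s^T(e^{j\omega} W_M^m)\, g \right\},
\]
and
\[
\psi_{i,m}(\omega, h, g) = \mathrm{Im}\left\{ h^T \phi_a(e^{j\omega} W_M^m)\, \phi_s^T(e^{j\omega} W_M^m)\, g \right\}.
\]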

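Now, we have the following theorem.

Theorem 10.3.
\[
\max_{\delta \in \mathcal{U}} h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g
= \frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{r,m}(\omega, h, g) + \varepsilon_m |\psi_{r,m}(\omega, h, g)| \right), \tag{10.20}
\]
\[
\min_{\delta \in \mathcal{U}} h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g
= \frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{r,m}(\omega, h, g) - \varepsilon_m |\psi_{r,m}(\omega, h, g)| \right), \tag{10.21}
\]
\[
\max_{\delta \in \mathcal{U}} h^T \mathrm{Im}\{\Psi(e^{j\omega}, \delta)\}\, g
= \frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{i,m}(\omega, h, g) + \varepsilon_m |\psi_{i,m}(\omega, h, g)| \right), \tag{10.22}
\]
and
\[
\min_{\delta \in \mathcal{U}} h^T \mathrm{Im}\{\Psi(e^{j\omega}, \delta)\}\, g
= \frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{i,m}(\omega, h, g) - \varepsilon_m |\psi_{i,m}(\omega, h, g)| \right). \tag{10.23}
\]
Proof. See Appendix 2.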
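
The closed forms in Theorem 10.3 can be checked numerically on a small instance. The sketch below is illustrative only: `worst_case_linear` is a hypothetical helper, and the entries of `c` are made-up numbers standing in for the real parts of the subchannel terms at one fixed frequency. The bounds in `eps` and the value of `D` are those of the first example in Sect. 10.5.

```python
import itertools

def worst_case_linear(c, eps, D):
    """Closed-form max and min of (1/D) * sum_m (1 + delta_m) * c_m over the box."""
    nominal = sum(c) / D
    spread = sum(e * abs(x) for e, x in zip(eps, c)) / D
    return nominal + spread, nominal - spread

c = [0.7, -0.2, 0.05, -0.4]          # made-up stand-ins for the subchannel terms
eps = [0.001, 0.002, 0.004, 0.004]   # bounds from the example in Sect. 10.5
D = 2
upper, lower = worst_case_linear(c, eps, D)

# Brute-force comparison over the 2**M vertices of the box.
vertex_values = [sum((1.0 + s * e) * x for s, e, x in zip(signs, eps, c)) / D
                 for signs in itertools.product((-1.0, 1.0), repeat=len(c))]
```

Because the expression is linear in the perturbation, its extremes over the box are attained at vertices, so `max(vertex_values)` and `min(vertex_values)` should agree with `upper` and `lower` up to rounding.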


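The conclusion given in the following theorem follows readily from Theorems 10.2
and 10.3.

Theorem 10.4. Problem 10.2 is equivalent to Problem 10.3, which is defined as
follows.
Problem 10.3.
\[
\min_{h, g, \gamma}\ \gamma \tag{10.24}
\]
subject to
\[
\beta(h, g, \bar{\delta}_i) \le \gamma, \quad i = 1, \ldots, 2^M, \tag{10.25}
\]
\[
\frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{r,m}(\omega, h, g) + \varepsilon_m |\psi_{r,m}(\omega, h, g)| \right) - \cos(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.26}
\]
\[
-\frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{r,m}(\omega, h, g) - \varepsilon_m |\psi_{r,m}(\omega, h, g)| \right) + \cos(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.27}
\]
\[
\frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{i,m}(\omega, h, g) + \varepsilon_m |\psi_{i,m}(\omega, h, g)| \right) + \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.28}
\]
\[
-\frac{1}{D} \sum_{m=0}^{M-1} \left( \psi_{i,m}(\omega, h, g) - \varepsilon_m |\psi_{i,m}(\omega, h, g)| \right) - \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi]. \tag{10.29}
\]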

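Remark 10.1. Since (10.26) and (10.27) can be rewritten as
\[
\frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g) - \cos(\tau_d \omega)
\le \epsilon/\sqrt{2} - \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|,
\]
and
\[
-\left( \frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g) - \cos(\tau_d \omega) \right)
\le \epsilon/\sqrt{2} - \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|,
\]
respectively, we have
\[
\left| \frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g) - \cos(\tau_d \omega) \right|
\le \epsilon/\sqrt{2} - \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|.
\]
Thus, a necessary condition for the solvability of Problem 10.1 is that there exist $h$
and $g$ such that
\[
\frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)| \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi]. \tag{10.30}
\]
Taking $\tilde{\varepsilon} = \min\{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{M-1}\}$
and $\omega = 0$ in (10.12) and (10.13), we obtain
\[
\tilde{\varepsilon} \left( 1 - \epsilon/\sqrt{2} \right)
\le \frac{\tilde{\varepsilon}}{D} \left| \sum_{m=0}^{M-1} \psi_{r,m}(0, h, g) \right|
\le \frac{\tilde{\varepsilon}}{D} \sum_{m=0}^{M-1} |\psi_{r,m}(0, h, g)|
\le \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(0, h, g)|
\le \epsilon/\sqrt{2}. \tag{10.31}
\]
If $\tilde{\varepsilon} \le \sqrt{2} - 1$, then $\tilde{\varepsilon} \le \epsilon$ by (10.31).
This illustrates the relationship between the disturbance and the tolerance in (10.11).
In fact, from our computational experience, $\epsilon$ should be much larger than
$\max\{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{M-1}\}$.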
Let
\[
F_1(\omega) = \left| \frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g) - \cos(\tau_d \omega) \right|
+ \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|, \tag{10.32}
\]
\[
F_2(\omega) = \left| \frac{1}{D} \sum_{m=0}^{M-1} \psi_{i,m}(\omega, h, g) + \sin(\tau_d \omega) \right|
+ \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{i,m}(\omega, h, g)|. \tag{10.33}
\]
Then, according to Remark 10.1, the constraints (10.26)–(10.29) are equivalent to
$F_1(\omega) \le \epsilon/\sqrt{2}$ and $F_2(\omega) \le \epsilon/\sqrt{2}$.
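
A minimal sketch of how $F_1$ and $F_2$ could be evaluated on a frequency grid is given below. It rests on two assumptions that are not restated in this section: the modulation factor is taken as $W_M = e^{-j2\pi/M}$, and the product of $h^T$ with the analysis-side vector is taken to be the usual FIR response $\sum_n h(n) z^{-n}$ (likewise for $g$). The helper names `freq_resp` and `f1_f2` are hypothetical, and the inclusion of both endpoints in the 512-point grid is also an assumption.

```python
import cmath
import math

def freq_resp(coeffs, z):
    """Evaluate sum_n coeffs[n] * z**(-n)."""
    return sum(c * z ** (-n) for n, c in enumerate(coeffs))

def f1_f2(h, g, eps, D, tau_d, omega):
    """F1 and F2 of (10.32)-(10.33) at a single frequency omega."""
    M = len(eps)
    re_sum = im_sum = re_abs = im_abs = 0.0
    for m in range(M):
        # Assumed convention: W_M = exp(-j*2*pi/M).
        z = cmath.exp(1j * omega) * cmath.exp(-2j * math.pi * m / M)
        term = freq_resp(h, z) * freq_resp(g, z)   # m-th subchannel term
        re_sum += term.real
        im_sum += term.imag
        re_abs += eps[m] * abs(term.real)
        im_abs += eps[m] * abs(term.imag)
    F1 = abs(re_sum / D - math.cos(tau_d * omega)) + re_abs / D
    F2 = abs(im_sum / D + math.sin(tau_d * omega)) + im_abs / D
    return F1, F2

# 512 equally spaced points on [-pi, pi], as used in Sect. 10.5.
grid = [-math.pi + 2.0 * math.pi * k / 511 for k in range(512)]
# Toy check: a pure two-sample delay, H(z) G(z) = z**(-2), with M = D = 1.
values = [f1_f2([0.0, 1.0], [0.0, 1.0], [0.01], 1, 2, w) for w in grid]
```

For this toy delay the nominal error vanishes, so only the perturbation terms remain: $F_1(\omega) = 0.01\,|\cos 2\omega|$ and $F_2(\omega) = 0.01\,|\sin 2\omega|$.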
Note from Theorem 10.4 that the continuous inequality constraints in Problem
10.3 are only with respect to $\omega$. However, some of the constraint functions are
non-smooth since they contain absolute value functions. Thus, gradient-based
optimization methods cannot be applied directly. To remove this obstacle, we
introduce the following smoothing approximation (Teo and Goh 1988):
\[
\varphi_\rho(y) =
\begin{cases}
y, & \text{if } y \ge \rho, \\
(\rho^2 + y^2)/(2\rho), & \text{if } |y| < \rho, \\
-y, & \text{if } y \le -\rho,
\end{cases} \tag{10.34}
\]
where $\rho > 0$ and $y \in \mathbb{R}$. Now we replace $|\psi_{r,m}(\omega, h, g)|$ and
$|\psi_{i,m}(\omega, h, g)|$ in the continuous inequality constraints (10.26)–(10.29) by
$\varphi_\rho(\psi_{r,m}(\omega, h, g))$ and $\varphi_\rho(\psi_{i,m}(\omega, h, g))$,
respectively, and obtain the following optimization problem:

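\[
\min_{h, g, \gamma}\ \gamma \tag{10.35}
\]
subject to
\[
\beta(h, g, \bar{\delta}_i) \le \gamma, \quad i = 1, \ldots, 2^M, \tag{10.36}
\]
\[
\frac{1}{D} \sum_{m=0}^{M-1} \left[ \psi_{r,m}(\omega, h, g) + \varepsilon_m \varphi_\rho(\psi_{r,m}(\omega, h, g)) \right] - \cos(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.37}
\]
\[
-\frac{1}{D} \sum_{m=0}^{M-1} \left[ \psi_{r,m}(\omega, h, g) - \varepsilon_m \varphi_\rho(\psi_{r,m}(\omega, h, g)) \right] + \cos(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.38}
\]
\[
\frac{1}{D} \sum_{m=0}^{M-1} \left[ \psi_{i,m}(\omega, h, g) + \varepsilon_m \varphi_\rho(\psi_{i,m}(\omega, h, g)) \right] + \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi], \tag{10.39}
\]
\[
-\frac{1}{D} \sum_{m=0}^{M-1} \left[ \psi_{i,m}(\omega, h, g) - \varepsilon_m \varphi_\rho(\psi_{i,m}(\omega, h, g)) \right] - \sin(\tau_d \omega) \le \epsilon/\sqrt{2}, \quad \forall \omega \in [-\pi, \pi]. \tag{10.40}
\]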

Let the problem (10.35)–(10.40) be referred to as Problem $(P_\rho)$. For any
$\rho_2 \ge \rho_1 > 0$, since
\[
\varphi_{\rho_2}(y) \ge \varphi_{\rho_1}(y) \ge |y|,
\]
it follows that any feasible solution of Problem $(P_{\rho_2})$ is a feasible solution of
Problem $(P_{\rho_1})$. It is also a feasible solution of Problem 10.3. According to Teo
and Goh (1988), the solution of Problem $(P_\rho)$ converges to the solution of Problem
10.3 as $\rho \to 0$. Furthermore, the feasible region of Problem $(P_\rho)$ grows as
$\rho$ is decreased. Hence, Problem 10.3 can be solved by solving a sequence of
Problems $(P_\rho)$, each of which can be solved by many available methods, such as
the one given in Lopez and Still (2007). The algorithm is summarized as follows.

Algorithm 10.1
• Step 1: Initialize $\rho_1 > 0$ and $k = 1$. Set $\delta_i = 0$, $i = 0, 1, \ldots, M-1$, and use the
bi-iterative optimization method in Dam et al. (2005) to design the filter bank. Let the solution
obtained be denoted by $h_0^*, g_0^*$.
• Step 2: Solve Problem $(P_{\rho_k})$ with the initial condition $h_{k-1}^*, g_{k-1}^*$. Let the cost and
solution obtained be denoted by $\gamma_k^*$ and $h_k^*, g_k^*$, respectively.
• Step 3: Set $\rho_{k+1} = \rho_k / L$, where $L > 1$ is a pre-specified number.
• Step 4: If $\gamma_{k-1}^* - \gamma_k^* \le \kappa$, where $\kappa > 0$ is a prescribed small number, stop.
Otherwise, set $k = k + 1$ and go to Step 2.
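
The steps above can be sketched as a small outer loop around an arbitrary solver for the smoothed problem. This is only an illustration under stated assumptions: the function name `continuation`, the callback signature `solve(rho, x_init)`, the argument `cost0` (the cost associated with the initial design of Step 1) and the safeguard `max_iter` are all hypothetical and do not come from the original text, and the toy solver at the end merely returns a cost that approaches 1 as the smoothing parameter shrinks.

```python
def continuation(solve, x0, cost0, rho1=1e-3, L=10.0, kappa=1e-6, max_iter=50):
    """Outer loop of Algorithm 10.1 around a user-supplied solver.

    solve(rho, x_init) must return (cost, x) for the smoothed problem with
    smoothing parameter rho, started from x_init.
    """
    rho, x_prev, cost_prev = rho1, x0, cost0
    for k in range(1, max_iter + 1):
        cost, x = solve(rho, x_prev)      # Step 2: solve the smoothed problem
        rho = rho / L                     # Step 3: tighten the smoothing
        if cost_prev - cost <= kappa:     # Step 4: stop when the cost stalls
            return cost, x, k
        x_prev, cost_prev = x, cost
    return cost, x, max_iter              # safeguard, not part of Algorithm 10.1

# Hypothetical toy solver: the optimal cost of the smoothed problem is 1 + rho/2.
toy_solve = lambda rho, x_init: (1.0 + rho / 2.0, 0.0)
cost, x, iterations = continuation(toy_solve, x0=0.0, cost0=2.0)
```

With the default parameters the toy run stops at the fifth pass, once two successive costs differ by less than `kappa`.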

There are three parameters, $\rho_1$, $L$ and $\kappa$, in Algorithm 10.1. $\rho_1$ determines
how close the approximate problem $(P_{\rho_1})$ is to Problem 10.3. The smaller
$\rho_1$ is, the closer the approximate problem $(P_{\rho_1})$ is to Problem 10.3. However, the
constraints become less smooth. $L$ determines the number of iterations required. The
larger $L$ is, the fewer iterations are required. However, the approximate problem
becomes less smooth at smaller $k$. $\kappa$ determines the accuracy of the approximation.
The smaller $\kappa$ is, the more iterations are required. In our simulation, we take
$\rho_1 = 10^{-3}$, $L = 10$, $\kappa = 10^{-6}$. Judging from our computational experience,
these parameters give good performance for most optimization problems.
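
The role of the smoothing parameter can be made concrete with a few lines of code. The sketch below implements the smoothing function (10.34) under the hypothetical name `smooth_abs` and measures how far it lies above the absolute value on a sample grid; the grid itself is an arbitrary choice made for illustration.

```python
def smooth_abs(y, rho):
    """Smoothed absolute value of (10.34): quadratic for |y| < rho, |y| otherwise."""
    if abs(y) < rho:
        return (rho ** 2 + y ** 2) / (2.0 * rho)
    return abs(y)

ys = [k * 1e-5 for k in range(-500, 501)]   # sample points in [-0.005, 0.005]
gap_coarse = max(smooth_abs(y, 1e-3) - abs(y) for y in ys)
gap_fine = max(smooth_abs(y, 1e-4) - abs(y) for y in ys)
```

The largest gap occurs at $y = 0$ and equals half the smoothing parameter, so dividing the parameter by $L = 10$ in Step 3 shrinks the approximation error by the same factor.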

10.5 Numerical Examples

In this section, we use the developed algorithm to design the analysis and synthesis
prototype filters with subchannel variations. In the following discussion,
$\epsilon = 10^{-2}$. Furthermore, the continuous constraint interval $[-\pi, \pi]$ is
discretized into 512 equally spaced frequency points for the optimization Problem
$(P_\rho)$. Consider the case with $M = 4$, $D = 2$,
$\varepsilon = [\varepsilon_0, \varepsilon_1, \varepsilon_2, \varepsilon_3]^T = [0.001, 0.002, 0.004, 0.004]^T$.
Let $L_a = L_s = 4M + 1$, $\tau_d = 4M$.
First, we consider the case without subchannel variations, i.e., $\delta_i = 0$,
$i = 0, 1, \ldots, M-1$. For such a filter bank design, the bi-iterative algorithm
developed in Dam et al. (2005) is used to design the initial analysis and synthesis
prototype filters. The cost obtained is $-167.0414$ dB. Let the prototype filters $h$ and
$g$ obtained be collected together and denoted as $x^{(0)}$. We substitute $x^{(0)}$ into
(10.16) and find that the largest value of $\beta(h, g, \delta)$ is $-90.7572$ dB, which is
attained at the extreme point $[0.001, 0.002, 0.004, 0.004]^T$ of $\mathcal{U}$. The
maximum violation of the constraints (10.12)–(10.15) is 0.003973. From these results,
we can see that the aliasing $\beta(h, g, \delta)$ increases substantially when the
subchannels are subject to variations. Thus, it is necessary to consider the robust
optimal filter bank design.
Now, we consider the case in which the subchannels are subject to variations. We use
the prototype filters $x^{(0)}$ obtained by setting $\delta_i = 0$ as the initial
condition and use Algorithm 10.1 to solve Problem $(P_\rho)$ with $\rho_1 = 10^{-3}$,
$L = 10$, $\kappa = 10^{-6}$. After three iterations, the optimal solution is obtained.
The optimal cost is $-106.4557$ dB, which is attained at the extreme point
$[0.001, 0.002, 0.004, 0.004]^T$ of $\mathcal{U}$. Clearly, the developed robust
optimization method not only gives about 16 dB improvement over the result obtained
by the bi-iterative algorithm developed in Dam et al. (2005), but also maintains the
constraints (10.12)–(10.15) when the subchannels are subject to variations. However,
it should be noted that the problem considered in Dam et al. (2005) is an optimization
problem of a uniform FIR filter bank with group delay specification but without the
consideration of the variations in subchannels. The coefficients of the prototype
analysis and synthesis filters are presented in Table 10.1 and the corresponding
frequency responses are plotted in Figs. 10.2 and 10.3, respectively. The frequency
response of $T(e^{j\omega})$ with $\delta = [0, 0, 0, 0]^T$ in (10.9) is plotted in
Fig. 10.4, and $F_1(\omega)$ and $F_2(\omega)$ are plotted in Fig. 10.5. If we take all
the $\varepsilon_i$ to be the same, i.e., $\varepsilon_i = 0.005$, then the analysis and
synthesis prototype filters designed by our method are plotted in Fig. 10.6, and
$F_1(\omega)$ and $F_2(\omega)$ are depicted in Figs. 10.7 and 10.8. The cost obtained is
$-109.7622$ dB. Thus, there is about 19 dB improvement by using our design method.
Furthermore, the filter bank designed by our method satisfies the constraints
(10.12)–(10.15) for any $\delta$ in $\mathcal{U}$ since the constraints (10.26)–(10.29)
are satisfied. From Fig. 10.7, we can see that this is not the case if the bi-iterative
algorithm developed in Dam et al. (2005) is used.
Now let $M = 8$, $D = 4$,
$\varepsilon = [0.002, 0.001, 0.002, 0.001, 0.002, 0.001, 0.002, 0.001]^T$,
$L_a = L_s = 4M$, $\tau_d = 4M$. We use the bi-iterative algorithm in Dam
et al. (2005) to design the initial analysis and synthesis prototype filters with

Table 10.1 The coefficients of the prototype analysis and synthesis filters with
$\varepsilon = [0.001, 0.002, 0.004, 0.004]^T$

h                    g
0.002229646148184    0.004295059400797
0.008688331557603    0.005773637435993
0.002888438166081    0.035983297514096
0.023517760474806    0.058922903112834
0.044869395983114    0.024054015169344
0.003348418428374    0.09822147184615
0.126091278937254    0.279096672564763
0.28163363523314     0.44123638388896
0.351454776609997    0.506386191221335
0.279700679819365    0.442412199910485
0.123916445314385    0.28101672864404
0.004236799526732    0.099853376280495
0.044285867995809    0.023793500365253
0.022506213122295    0.060323431058972
0.003492532067841    0.038062212669515
0.008946437740113    0.007266718715655
0.0022092084606      0.003686028054542

Fig. 10.2 The frequency response of the analysis prototype filter with
$\varepsilon = [0.001, 0.002, 0.004, 0.004]^T$ (frequency response in dB versus the
normalized frequency)

Fig. 10.3 The frequency response of the synthesis prototype filter with
$\varepsilon = [0.001, 0.002, 0.004, 0.004]^T$ (frequency response in dB versus the
normalized frequency)

Fig. 10.4 The frequency response of $T(e^{j\omega})$ with $\delta = [0, 0, 0, 0]^T$ and
$\varepsilon = [0.001, 0.002, 0.004, 0.004]^T$ (plotted versus the normalized frequency)

Fig. 10.5 The plots of $F_1(\omega)$ and $F_2(\omega)$ with
$\varepsilon = [0.001, 0.002, 0.004, 0.004]^T$

Fig. 10.6 The frequency responses of the analysis and synthesis prototype filters with
$\varepsilon = [0.005, 0.005, 0.005, 0.005]^T$ (frequency response in dB versus the
normalized frequency)

Fig. 10.7 The plots of $F_1(\omega)$ and $F_2(\omega)$ by the method in Dam et al. (2005)

Fig. 10.8 The plots of $F_1(\omega)$ and $F_2(\omega)$ with
$\varepsilon = [0.005, 0.005, 0.005, 0.005]^T$

$\delta_i = 0$, $i = 0, 1, \ldots, 7$. The cost obtained is $-105.75$ dB. For these two
prototype filters, $\beta(h, g, \delta)$ achieves its largest value, $-70.6954$ dB, at
the extreme point
\[
[0.002, 0.001, 0.002, 0.001, 0.002, 0.001, 0.002, 0.001]^T
\]
of $\mathcal{U}$. Now we use Algorithm 10.1 to solve Problem 10.3 with the obtained
prototype filters as the initial guess. After two iterations, the optimal cost of
$-98.5855$ dB is obtained. The analysis and synthesis prototype filters are plotted in
Fig. 10.9. The

Fig. 10.9 The frequency responses of the analysis and synthesis prototype filters with
$M = 8$, $D = 4$, $\varepsilon = [0.002, 0.001, 0.002, 0.001, 0.002, 0.001, 0.002, 0.001]^T$
(frequency response in dB versus the normalized frequency)

Fig. 10.10 The plots of $F_1(\omega)$ and $F_2(\omega)$ with
$\varepsilon = [0.002, 0.001, 0.002, 0.001, 0.002, 0.001, 0.002, 0.001]^T$

corresponding $F_1(\omega)$ and $F_2(\omega)$ are plotted in Fig. 10.10. From the results
obtained, we can see that the aliasing is improved by about 28 dB by using our
robust design method.

10.6 Conclusions

We have proposed a new formulation of the DFT filter bank problem in which the
subchannels are subject to variations. Compared with earlier formulations, our
formulation is more realistic since the filter operation with distortion in each
subband has been taken into consideration. It is in the form of a minimax
optimization problem with continuous inequality constraints. Although this minimax
optimization problem can be reformulated as a semi-infinite optimization problem by
introducing an additional variable, it still cannot be solved directly by any
existing method for semi-infinite optimization problems. This is because the
continuous constraints are not only with respect to frequency, but also with respect
to variations in subchannels. However, by exploiting its properties, we proved that
such a semi-infinite optimization problem is equivalent to a semi-infinite
optimization problem in which the continuous constraints are only with respect to
frequency. An approximate computation scheme was then developed to solve the
transformed semi-infinite optimization problem. Simulation results showed that the
new method achieves very high aliasing suppression while keeping the distortion
under variations in the different filter bands at a small level.

Acknowledgements Changzhi Wu was partially supported by Australian Research Council
Linkage Program, Natural Science Foundation of China (61473326), Natural Science Foundation
of Chongqing (cstc2013jcyjA00029 and cstc2013jjB0149).

Appendix 1

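\[
\beta(h, g, \delta) = \frac{1}{D} \sum_{m=0}^{M-1} \sum_{n=0}^{M-1} \sum_{d=1}^{D-1} \sum_{l=1}^{D-1} (1 + \delta_m)(1 + \delta_n)\, h^T \Phi_{m,n,d,l}(g)\, h, \tag{10.41}
\]
where $\Phi_{m,n,d,l}(g)$ is an $L_a \times L_a$ matrix. The $(i, j)$-th element of
$\Phi_{m,n,d,l}(g)$ is given by
\[
[\Phi_{m,n,d,l}(g)]_{i,j} = \sum_{t=0}^{L_s-1} \sum_{s=0}^{L_s-1}
\cos\left( \frac{2\pi}{M} (m - n)(i + t - 2) + \frac{2\pi}{D} \big( d(i - 1) - l(j - 1) \big) \right)
\delta(i + t - j - s)\, g(t)\, g(s),
\]
where $\delta(\cdot)$ is the delta function, i.e.,
\[
\delta(t) =
\begin{cases}
1, & \text{if } t = 0, \\
0, & \text{if } t \ne 0.
\end{cases}
\]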

Appendix 2

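Proof of Theorem 10.1. Clearly, $\mathcal{U}$ is a convex set and
$\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}$ are extreme points of $\mathcal{U}$. It remains
to show that $\mathcal{U} = \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$. For any
$\delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T \in \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$,
there exist $\lambda_i \ge 0$, $i = 1, \ldots, 2^M$, such that
$\delta = \sum_{i=1}^{2^M} \lambda_i \bar{\delta}_i$ and $\sum_{i=1}^{2^M} \lambda_i = 1$.
Then, $\delta_k = \sum_{i=1}^{2^M} \lambda_i \bar{\delta}_{i,k}$, where $\bar{\delta}_{i,k}$
denotes the $k$th element of $\bar{\delta}_i$. From (10.18), we have
\[
|\delta_k| = \left| \sum_{i=1}^{2^M} \lambda_i \bar{\delta}_{i,k} \right|
\le \sum_{i=1}^{2^M} \lambda_i \left| \bar{\delta}_{i,k} \right|
= \sum_{i=1}^{2^M} \lambda_i \varepsilon_k = \varepsilon_k.
\]
Thus, $\delta \in \mathcal{U}$, and hence
$\mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\} \subset \mathcal{U}$. On the other
hand, let $\delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T$ with
$|\delta_i| \le \varepsilon_i$, $i = 0, 1, \ldots, M-1$. Since $|\delta_0| \le \varepsilon_0$,
there exists a $\lambda_0$, $0 \le \lambda_0 \le 1$, such that
$\delta_0 = \lambda_0 \varepsilon_0 - (1 - \lambda_0)\varepsilon_0$. Since
$\mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$ is convex and
\[
[\delta_0, \varepsilon_1, \ldots, \varepsilon_{M-1}]^T
= \lambda_0 [\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{M-1}]^T
+ (1 - \lambda_0) [-\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{M-1}]^T,
\]
we have
\[
[\delta_0, \varepsilon_1, \ldots, \varepsilon_{M-1}]^T \in \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}.
\]
Since $|\delta_1| \le \varepsilon_1$, there exists a $\lambda_1$ such that
$\delta_1 = \lambda_1 \varepsilon_1 - (1 - \lambda_1)\varepsilon_1$ and $0 \le \lambda_1 \le 1$.
Thus,
\[
[\delta_0, \delta_1, \varepsilon_2, \ldots, \varepsilon_{M-1}]^T
= \lambda_1 [\delta_0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{M-1}]^T
+ (1 - \lambda_1) [\delta_0, -\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{M-1}]^T
\in \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}.
\]
Continuing this process, we can show that
$\delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T \in \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$.
Hence, $\mathcal{U} \subset \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$. Therefore,
$\mathcal{U} = \mathrm{co}\{\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}\}$.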
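Proof of Theorem 10.2. From (10.8), we see that $\beta(h, g, \delta)$ is quadratic in
$\delta$. Furthermore, $\beta(h, g, \delta) \ge 0$ for any $\delta$. Thus, we can write
$\beta(h, g, \delta)$ in the form
\[
\beta(h, g, \delta) = \delta^T Q \delta + q^T \delta + q_0,
\]
where $Q = (Q^{1/2})^T Q^{1/2}$ is a positive semi-definite matrix, and $q$ and $q_0$ are
the corresponding vector and constant. For any $\lambda$, $0 \le \lambda \le 1$,
$\delta_1$ and $\delta_2$, we have
\[
\begin{aligned}
&\lambda \beta(h, g, \delta_1) + (1 - \lambda) \beta(h, g, \delta_2) - \beta(h, g, \lambda \delta_1 + (1 - \lambda) \delta_2) \\
&\quad = \lambda \delta_1^T Q \delta_1 + (1 - \lambda) \delta_2^T Q \delta_2 - \lambda^2 \delta_1^T Q \delta_1
- (1 - \lambda)^2 \delta_2^T Q \delta_2 - 2\lambda(1 - \lambda) \delta_1^T Q \delta_2 \\
&\quad = \lambda (1 - \lambda) \left( \delta_1^T Q \delta_1 + \delta_2^T Q \delta_2 - 2 \delta_1^T Q \delta_2 \right) \\
&\quad = \lambda (1 - \lambda) \left[ (Q^{1/2} \delta_1)^T Q^{1/2} \delta_1 + (Q^{1/2} \delta_2)^T Q^{1/2} \delta_2
- 2 (Q^{1/2} \delta_1)^T Q^{1/2} \delta_2 \right] \\
&\quad \ge 0.
\end{aligned}
\]
Thus, $\beta(h, g, \delta)$ is a convex function with respect to $\delta$. Suppose that
$\delta^* = \arg\max_{\delta \in \mathcal{U}} \beta(h, g, \delta)$ and
$\bar{\beta} = \max\{\beta(h, g, \bar{\delta}_1), \ldots, \beta(h, g, \bar{\delta}_{2^M})\}$.
From Theorem 10.1, we know that there exist $\lambda_i \ge 0$, $i = 1, \ldots, 2^M$, with
$\sum_{i=1}^{2^M} \lambda_i = 1$, such that $\delta^* = \sum_{i=1}^{2^M} \lambda_i \bar{\delta}_i$.
Since $\beta(h, g, \delta)$ is a convex function with respect to $\delta$, we have
\[
\beta(h, g, \delta^*) = \beta\Big(h, g, \sum_{i=1}^{2^M} \lambda_i \bar{\delta}_i\Big)
\le \sum_{i=1}^{2^M} \lambda_i \beta(h, g, \bar{\delta}_i)
\le \sum_{i=1}^{2^M} \lambda_i \bar{\beta} = \bar{\beta}.
\]
Thus, the maximum of $\beta(h, g, \delta)$ in $\mathcal{U}$ is attained at one of
$\bar{\delta}_1, \ldots, \bar{\delta}_{2^M}$.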


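Proof of Theorem 10.3. Since
\[
\begin{aligned}
h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g
&= \frac{1}{D} \sum_{m=0}^{M-1} (1 + \delta_m)\, \psi_{r,m}(\omega, h, g) \\
&= \frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g) + \frac{1}{D} \sum_{m=0}^{M-1} \delta_m \psi_{r,m}(\omega, h, g),
\end{aligned}
\]
we have
\[
\max_{\delta \in \mathcal{U}} h^T \mathrm{Re}\{\Psi(e^{j\omega}, \delta)\}\, g
= \frac{1}{D} \sum_{m=0}^{M-1} \psi_{r,m}(\omega, h, g)
+ \max_{\delta \in \mathcal{U}} \frac{1}{D} \sum_{m=0}^{M-1} \delta_m \psi_{r,m}(\omega, h, g).
\]
For any $\delta = [\delta_0, \delta_1, \ldots, \delta_{M-1}]^T \in \mathcal{U}$, we have
\[
\left| \frac{1}{D} \sum_{m=0}^{M-1} \delta_m \psi_{r,m}(\omega, h, g) \right|
\le \frac{1}{D} \sum_{m=0}^{M-1} |\delta_m|\, |\psi_{r,m}(\omega, h, g)|
\le \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|.
\]
Hence,
\[
\max_{\delta \in \mathcal{U}} \frac{1}{D} \sum_{m=0}^{M-1} \delta_m \psi_{r,m}(\omega, h, g)
\le \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|. \tag{10.42}
\]
On the other hand, taking
$\tilde{\delta} = [\tilde{\delta}_0, \tilde{\delta}_1, \ldots, \tilde{\delta}_{M-1}]^T$, where
$\tilde{\delta}_m = \varepsilon_m \psi_{r,m}(\omega, h, g) / |\psi_{r,m}(\omega, h, g)|$
(with $\tilde{\delta}_m = 0$ whenever $\psi_{r,m}(\omega, h, g) = 0$), yields
\[
\frac{1}{D} \sum_{m=0}^{M-1} \tilde{\delta}_m \psi_{r,m}(\omega, h, g)
= \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m \left( \psi_{r,m}(\omega, h, g) \right)^2 / |\psi_{r,m}(\omega, h, g)|
= \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|.
\]
Combining (10.42) with this equality, we obtain
\[
\max_{\delta \in \mathcal{U}} \frac{1}{D} \sum_{m=0}^{M-1} \delta_m \psi_{r,m}(\omega, h, g)
= \frac{1}{D} \sum_{m=0}^{M-1} \varepsilon_m |\psi_{r,m}(\omega, h, g)|. \tag{10.43}
\]
Thus, (10.20) is obtained. The validity of (10.21)–(10.23) can be established
similarly. Thus, the proof is complete.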

References

Dam HH, Nordholm S, Cantoni A (2005) Uniform FIR filterbank optimization with group delay
specification. IEEE Trans Signal Process 53(11):4249–4260
de Haan JM, Grbic N, Claesson I, Nordholm S (2001) Design of oversampled uniform DFT filter
banks with delay specification using quadratic optimization. In: Proceedings of ICASSP’2001,
Salt Lake City, pp 3633–3636
de Haan JM, Grbic N, Claesson I, Nordholm S (2003) Filter bank design for subband adaptive
microphone arrays. IEEE Trans Speech Audio Process 11(1):14–23
Harteneck M, Weiss S, Stewart RW (1999) Design of near perfect reconstruction oversampled filter
banks for subband adaptive filters. IEEE Trans Circuits Syst 46:1081–1085
Kellermann W (1988) Analysis and design of multirate systems for cancellation of acoustic echoes.
In: Proceedings of ICASSP’88, New York, pp 2570–2573
Kha HH, Tuan HD, Nguyen TQ (2009) Efficient design of cosine-modulated filter banks via convex
optimization. IEEE Trans Signal Process 57(3):966–976
Lopez M, Still G (2007) Semi-infinite programming. Eur J Oper Res 180:491–518
Mansour MF (2007) On the optimization of oversampled DFT filter banks. IEEE Signal Process
Lett 14(6):389–392
Nguyen TQ (1994) Near-perfect-reconstruction pseudo-QMF banks. IEEE Trans Signal Process
42(1):65–75
Sturm JF (1999) Using SeDuMi 1.02, a Matlab toolbox for optimization over symmetric cones.
Optim Methods Softw 11–12:625–653
Teo KL, Goh CJ (1988) On constrained optimization problems with nonsmooth cost functionals.
Appl Math Optim 18:181–190
Vaidyanathan PP (1993) Multirate systems and filter banks. Prentice-Hall, Englewood Cliffs
Wilbur MR, Davidson TN, Reilly JP (2004) Efficient design of oversampled NPR GDFT
filterbanks. IEEE Trans Signal Process 52(7):1947–1963
Wu CZ, Teo KL (2010) A dual parametrization approach to Nyquist filter design. Signal Process
90:3128–3133
Wu CZ, Teo KL (2011) Design of discrete Fourier transform modulated filter bank with sharp
transition band. IET Signal Process 5:433–440
Wu CZ, Teo KL, Rehbock V, Dam HH (2008) Global optimum design of uniform FIR filter bank
with magnitude constraints. IEEE Trans Signal Process 56(11):5478–5486
Wu CZ, Gao D, Teo KL (2013) A direct optimization method for low group delay FIR filter design.
Signal Process 93:1764–1772
Yiu KFC, Grbic N, Nordholm S, Teo KL (2004) Multicriteria design of oversampled uniform DFT
filter banks. IEEE Signal Process Lett 11(6):541–544
Zhang ZJ, Shui PL, Su T (2008) Efficient design of high-complexity cosine modulated filter banks
using 2Mth band conditions. IEEE Trans Signal Process 56(11):5414–5426
