
Communicated by Long Cheng

Accepted Manuscript

Modified discrete iterations for computing the inverse and pseudoinverse of the time-varying matrix

Marko D. Petković, Predrag S. Stanimirović, Vasilios N. Katsikis

PII: S0925-2312(18)30134-6
DOI: 10.1016/j.neucom.2018.02.005
Reference: NEUCOM 19295

To appear in: Neurocomputing

Received date: 22 July 2017
Revised date: 9 December 2017
Accepted date: 3 February 2018

Please cite this article as: Marko D. Petković, Predrag S. Stanimirović, Vasilios N. Katsikis, Modified
discrete iterations for computing the inverse and pseudoinverse of the time-varying matrix, Neurocom-
puting (2018), doi: 10.1016/j.neucom.2018.02.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.
Modified discrete iterations for computing the inverse and pseudoinverse of the time-varying matrix

Marko D. Petković 1, Predrag S. Stanimirović 2, Vasilios N. Katsikis 3

1,2 ∗ University of Niš, Faculty of Sciences and Mathematics, Višegradska 33, 18000 Niš, Serbia
3 Department of Economics, Division of Mathematics and Informatics,
National and Kapodistrian University of Athens, Sofokleous 1 Street, 10559 Athens, Greece

E-mail: 1 dexterofnis@gmail.com, 2 pecko@pmf.ni.ac.rs, 3 vaskatsikis@econ.uoa.gr
Abstract

The general discretization scheme for transforming continuous-time ZNN models for matrix inversion and pseudoinversion into corresponding discrete-time iterative methods is developed and investigated. The proposed discrete-time ZNN models incorporate scaled hyperpower iterative methods, as well as the Newton iteration in certain cases. The general linear Multi-step method is applied in order to obtain the proposed discretization rule, which comprises all previously proposed discretization schemes. Both the Euler difference rule and the Taylor-type difference rules are included in the general scheme. In particular, an iterative scheme based on the 4th-order Adams-Bashforth method is proposed and numerically compared with other known iterative schemes. In addition, the ZNN model for computing the time-varying matrix inverse is extended to the singular or rectangular case for the pseudoinverse computation. Convergence properties of the continuous-time ZNN model in the case of the Moore-Penrose inverse, and of its discretization, are also considered.

Keywords: Zhang neural network; Inverse matrix; Moore-Penrose inverse; Multi-step methods.
AMS subject classifications: 68T05, 15A09, 65F20

1 Introduction
The hyperpower iterative family for computing generalized inverses has been investigated extensively. The most important references are [1, 2, 3, 4, 5]. These iterations possess an arbitrary order p ≥ 2 and are defined by the standard form

X_{k+1} = X_k (I + R_k + · · · + R_k^{p−1}) = X_k Σ_{i=0}^{p−1} R_k^i,   R_k = I − A X_k,   (1.1)

where A is the input matrix and X_k is the kth approximation of the generalized inverse.
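For a concrete feel of iteration (1.1), the following minimal sketch (our illustration, not the paper's code) runs the hyperpower method of order p on a small constant matrix; the factor I + R + · · · + R^{p−1} is accumulated by a Horner-type scheme so that p − 1 multiplications suffice for it.

```python
import numpy as np

def hyperpower_step(X, A, p=2):
    """One hyperpower iteration (1.1) of order p; p = 2 is the Newton/Schultz method."""
    I = np.eye(A.shape[0])
    R = I - A @ X                 # R_k = I - A X_k
    S = I                         # Horner accumulation of I + R + ... + R^(p-1)
    for _ in range(p - 1):
        S = I + R @ S
    return X @ S

# Usage: the standard initial guess X0 = A^T / (||A||_1 ||A||_inf) ensures
# convergence for an invertible A
A = np.array([[4.0, 1.0], [2.0, 3.0]])
X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
for _ in range(30):
    X = hyperpower_step(X, A, p=2)
```

With p = 2 each step doubles the number of correct digits once the residual norm drops below one.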
The gradient neural network (GNN) approach is based on the Frobenius norm of an appropriate error matrix as the performance criterion, and it exploits a neural network dynamics evolving along the negative gradient-descent direction to force the convergence of the error norm to zero. In the time-varying case, the Frobenius norm of the error matrix cannot decrease to zero even after infinite time, due to the inability of GNN models to trace changes of the input matrix in time. For this reason, the Zhang neural network (ZNN) has been investigated for solving time-varying problems of matrix inversion and pseudoinversion, starting from [6]. The design of the ZNN model is based on an indefinite error-monitoring function, called the Zhang function (ZF). ZNN models can effectively eliminate the lagging errors generated by GNN models and converge to theoretical solutions of many mathematical and engineering problems with time-varying coefficients. For example, five complex-valued ZNN models aimed at the computation of time-varying complex matrix generalized inverses were proposed and investigated in [7]. ZNN models for computing the time-varying full-rank matrix pseudoinverse were introduced and analyzed in [8]. A general recurrent neural network model for online inversion of time-varying matrices was presented in [6]. Jin et al. in [9] defined ZNN models for solving zero-finding problems using a projection matrix which includes the existing activation functions as particular cases. Moreover, a new class of gradient-based RNNs able to solve zero-finding dynamic problems in a way that is independent of errors and matrix inversions was designed in [9].

∗ The first and second authors gratefully acknowledge support from the Research Project 174013 of the Serbian Ministry of Science.
† The third author gratefully acknowledges that this work was carried out at the Department of Economics, University of Athens, Greece, and supported by Special Account Research Grant 13294.
It is known that the time delay is unavoidable in digital circuits [10]. This fact makes it necessary to investigate feasible and effective discretizations of continuous-time ZNN models [10]. Recently, a number of studies have aimed at solving the online discrete time-varying matrix inverse or pseudoinverse problem rather than its continuous counterpart. This approach also differs from the previous research on solving the continuous time-varying matrix pseudoinverse problem.

The scaled Newton method for the usual matrix inversion appears as a result of the Euler-type discretization of the Zhang neural network (ZNN) designed for the inversion of a constant matrix. That discretization was introduced in [11], based on the Euler difference rule. The main intention of [12] was a generalization of the result "From Zhang neural network to Newton iteration for matrix inversion", derived in [11], into the more general goal "From Zhang neural network to scaled hyperpower iterations for matrix inversion" and vice versa. In [11], it was shown that the Euler-type discretization of the continuous-time ZNN model leads to the scaled Newton iteration. On the other hand, in [12] it was shown that scaled hyperpower iterations of arbitrary order, in general, lead to various continuous-time ZNN models. In particular, a ZNN model for the usual matrix inversion arising from the scaled Chebyshev iterations was defined and investigated in [12].
Recently, the numerical computation of generalized inverses of a time-varying matrix has become a challenging problem. Many conventional algorithms are not efficient in solving the matrix pseudoinversion problem in the time-varying case, since they do not use information about previous values of the input matrix. In general, the conventional algorithms assume the short-time invariance of the input matrix and compute its pseudoinverse at each single time instant [13]. For this purpose, the GNN and ZNN approaches become a powerful alternative to the classical iterative algorithms (1.1). On the other hand, numerical algorithms for finding the matrix pseudoinverse and neural-dynamics methods for solving the time-varying matrix pseudoinverse problem are closely related.
Recently, many researchers have proposed discrete-time ZNN models for solving various online time-varying problems. A number of discrete-time ZNN models, expressed by systems of difference equations, have been proposed and developed for the purpose of possible digital-circuit or digital-computer implementation. Also, various numerical differentiation rules have been used to discretize continuous-time ZNN models. Mainly, ZNN discretization has been defined for constant matrix inversion and pseudoinversion. According to the observation stated in [10], various forward numerical differentiation formulas with one step ahead can be considered for the discretization of continuous-time ZNN models. The simplest approach to discretizing the continuous-time ZNN model for approximating the inverse of a constant matrix is based on the Euler forward difference rule for numerical differentiation [11, 14, 15, 16]. The Euler-type discretization of the ZNN model for time-varying matrix inversion was proposed in [17]. In addition, Taylor-type numerical differentiation formulas of various orders have been applied, with the aim of achieving better accuracy than the Euler-type discretization. A Taylor-type numerical differentiation iterative formula for time-varying matrix inversion was proposed in [17]. The link between the Getz-Marsden dynamic system and discrete-time ZNN models was found in [18]. A higher-order five-step discrete-time ZNN model for time-varying matrix inversion was proposed in [19]. Also, discrete-time ZNN models for approximating the time-varying matrix pseudoinverse were defined in [13].
The main contributions of the present paper can be summarized as follows.


(1) Our intention in the present paper is to unify and extend all known Euler-type and Taylor-type discretizations, derived so far, into a general discrete ZNN model for the inverse computation. The most general discretization rule, which covers all previously proposed discretization schemes, is derived by applying the general linear Multi-step method. In particular, an iterative scheme arising from the 4th-order Adams-Bashforth method, as a particular Multi-step method, is proposed and considered in detail.
(2) A comparison of numerical results derived by various discrete-time iterative schemes for computing the matrix inverse is presented.
(3) The ZNN model for computing the time-varying matrix inverse is extended to the singular or rectangular case for the pseudoinverse computation (without the full-rank assumption) and its convergence properties are considered. In this way, the results related to the ZNN-5 model for solving the time-varying matrix pseudoinversion problem from [20] are improved and extended. Properties of the discretization in the singular case are also considered.

CR
The paper is organized as follows. An overview of known disretizations of various ZNN models is
presented in Section 2. Also, this section describes the main motivation of the paper. In Section 3 we
describe a general disretization scheme of the continuous-time ZNN model for approximating the inverse
of A(t), using some explicit linear Multi-step methods and some higher order approximations of the time

US
derivative of A(t). Possible iterative corrections based on the hyperpower iterative scheme are considered
in the same section. Comparison of various discretized schemes and the hyperpower iterations of orders
2 and 3 on several test matrices is given in Section 4. A ZNN model for generating the Moore-Penrose
inverse is defined and investigated in Section 5. The exponential convergence of the model is verified under
AN
the assumption that the range and null space of the input matrix A(t) are constant for each t > 0. Various
discretizations of the ZNN model for computing the Moore-Penrose inverse and the hyperpower iterations
are compared in Section 6.

2 Preliminaries and detailed motivation


Consider the inversion of a time-dependent matrix A(t). This problem has been previously studied in many papers, including Guo and Zhang [18], Jin and Zhang [13], Liao and Zhang [20], etc. The matrix-valued error function is defined as E(t) = A(t)X(t) − I, where A(t) ∈ R^{n×n} is an invertible time-varying matrix, I is the n × n identity matrix and X(t) ∈ R^{n×n} is the unknown matrix. The general dynamic equation is generated by the pattern Ė(t) = −γH(E(t)), where γ > 0 is the gain parameter and H is the elementwise matrix function H(M) = [h(m_{ij})] defined on a matrix M = [m_{ij}], with h(·) a differentiable and odd real function, also referred to as the activation function. This general pattern produces the implicit dynamical model

A(t)Ẋ(t) + Ȧ(t)X(t) = −γH(A(t)X(t) − I).   (2.1)
Multiplying the left-hand side of (2.1) by A^{−1}(t) and replacing A^{−1}(t) by X(t) on the right-hand side of (2.1) produces the following explicit model for solving the time-varying matrix inverse problem:

Ẋ(t) = −X(t)Ȧ(t)X(t) − γX(t)H(A(t)X(t) − I).   (2.2)

This model originated and was considered in [18]. Taking the linear activation function h(x) = x, one obtains the following linear model:

Ẋ(t) = −X(t)Ȧ(t)X(t) − γ(X(t)A(t)X(t) − X(t)).   (2.3)

The exponential convergence of this model to the Moore-Penrose inverse A†(t) of a right-invertible full-rank matrix A(t) was proved in [20]. Another nonlinear variant of the model (2.3), considered in [20], is given by

Ẋ(t) = −X(t)Ȧ(t)X(t) − γH(X(t)A(t)X(t) − X(t))   (2.4)

and denoted by ZNN-5 in [20]. It is the only explicit model among the five models in that paper. The difference between (2.2) and (2.4) is only in the position of the matrix activation function H(·).

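As an illustration of the continuous-time dynamics, the linear model (2.3) can be integrated directly with a small explicit Euler step; the sketch below uses an invertible A(t) of our own choosing (the systematic discretizations are the subject of Section 3).

```python
import numpy as np

def A(t):        # illustrative invertible time-varying matrix (our choice)
    return np.array([[2.0 + np.sin(t), 0.1], [0.1, 2.0 + np.cos(t)]])

def Adot(t):     # its analytic time derivative
    return np.array([[np.cos(t), 0.0], [0.0, -np.sin(t)]])

tau, gamma, t = 1e-3, 100.0, 0.0
X = np.linalg.inv(A(0.0))       # start on the theoretical solution X(0) = A(0)^{-1}
for _ in range(5000):
    # explicit Euler step of (2.3): Xdot = -X Adot X - gamma (X A X - X)
    X = X + tau * (-X @ Adot(t) @ X - gamma * (X @ A(t) @ X - X))
    t += tau
```

With the gain γ large relative to the rate of change of A(t), the trajectory X(t) tracks A^{−1}(t) closely over the whole interval.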

The previously defined models (2.2), (2.3) and (2.4) need to be discretized in order to find the numerical solution for X(t) by an appropriate iterative process. Zhang et al. in [15] defined the first discrete ZNN (DZNN) model for constant matrix inversion on the basis of the Euler difference rule. Guo and Zhang in [18] and Zhang et al. in [11] used the Euler difference rules to discretize the time derivatives included in (2.2), (2.3) or (2.4). Denote the time step (i.e. the sampling gap, as noted in [18]) by τ. According to the Euler difference rule, the derivatives Ẋ(t) and Ȧ(t) are approximated by the expressions

Ẋ(t) ≈ (X((k + 1)τ) − X(kτ))/τ,   Ȧ(t) ≈ (A(kτ) − A((k − 1)τ))/τ,   (2.5)

respectively. In order to simplify notation, denote A_k = A(kτ) and X_k = X(kτ). The analogous notation Ȧ_k = Ȧ(kτ) and Ẋ_k = Ẋ(kτ) is used for the corresponding values of the time derivatives. If it is assumed that A(t) and Ȧ(t) are known, (2.2) becomes the ZNN-K model

X_{k+1} = X_k − τ X_k Ȧ_k X_k − β X_k H(A_k X_k − I),   (2.6)

where β = τγ. The linear case of (2.6) is given by

X_{k+1} = X_k − τ X_k Ȧ_k X_k − β X_k (A_k X_k − I).   (2.7)

Larger values of the parameter β may lead to oscillations, while smaller values of β can significantly slow down the convergence of discrete-time ZNN models [10].
Proposition 2.1. [17] The Euler-type DTZNN model (2.7) is consistent and converges with the order of its truncation error being O(τ²) for all t_k ∈ [t_0, t_max], where O(τ²) denotes a matrix with every element equal to O(τ²).
If Ȧ(t) is unknown, an application of the Euler difference rule (2.5) for approximating Ȧ(t) transforms the model (2.6) into the following ZNN-U model from [18]:

X_{k+1} = X_k − X_k (A_k − A_{k−1}) X_k − β X_k H(A_k X_k − I),   β = τγ.   (2.8)

The scheme (2.8) is applied successively for k = 0, 1, . . ., where A_{−1} is taken as A_{−1} := A_0, i.e. Ȧ(0) = 0 (see [18]).
In the case when A is constant, the linear case of (2.6) becomes exactly the scaled Newton iteration for computing outer inverses with prescribed range and null space, introduced in [28, 29]:

X_{k+1} = (1 + β) X_k − β X_k A X_k,   k = 0, 1, . . . ,   X_0 = αG,   β ∈ (0, 1],   (2.9)

where α, β are real constants and G ∈ C_s^{n×m} is a chosen matrix of rank s satisfying 0 < s ≤ r. In the case β = 1, the iterative process (2.9) produces the well-known generalization of the Schultz iterative method, intended for computing outer inverses [21, 22]. Therefore, the discretization of the continuous-time ZNN model by means of the Euler difference rules reveals a close correlation between the ZNN model and the Newton iterative rule for computing the matrix inverse.


The disadvantage of the scheme (2.8) is the fact that the Euler difference rule (i.e. first-order Euler
explicit method) becomes inaccurate for large k, especially when the function A(t) oscillates rapidly. That
would produce significant numerical errors in (2.8).
High accuracy in the discretization of ZNN models was first achieved in [19] by means of a Taylor-type numerical differentiation formula for the first-order derivative approximation. Zhang et al. in [13, 19] developed a Taylor-type DTZNN model for time-varying matrix inversion by the iterative rule

X_{k+1} = −τ X_k Ȧ_k X_k − β X_k (A_k X_k − I) + (3/2) X_k − X_{k−1} + (1/2) X_{k−2},   β = τγ.   (2.10)

Also, the following result was proved for this Taylor-type DTZNN model.


Proposition 2.2. [13, 17] The Taylor-type DTZNN model (2.10) is consistent and converges with the order of its truncation error being O(τ³) for all t_k ∈ [t_0, t_max].
Further, two Taylor-type discrete-time ZNN models for online time-varying pseudoinversion were proposed in [13] by adopting a new Taylor-type numerical differentiation formula, with Ȧ(t) known or unknown. These models were termed the T-ZNN-K model and the T-ZNN-U model, and they were compared with the Euler-type discrete-time ZNN models, termed the E-ZNN-K model (2.6) and the E-ZNN-U model (2.8).

A numerical difference rule for the first-order derivative approximation Ẋ_k with truncation error O(τ³) was established in [19] as

Ẋ_k = (24X_{k+1} − 5X_k − 12X_{k−1} − 6X_{k−2} − 4X_{k−3} + 3X_{k−4}) / (48τ).

Consequently, Guo et al. in [19] derived a five-step discrete-time ZNN (DTZNN) model for time-varying matrix inversion with known Ȧ(t), as follows:

X_{k+1} = (5/24) X_k + (1/2) X_{k−1} + (1/4) X_{k−2} + (1/6) X_{k−3} − (1/8) X_{k−4} − 2τ X_k Ȧ_k X_k − β X_k (A_k X_k − I),   (2.11)

where β = 2γτ > 0.

Proposition 2.3. [19] The five-step DTZNN model (2.11) is consistent and converges with the order of its truncation error being O(τ⁴) for all t_k ∈ [t_0, t_max], where O(τ⁴) denotes a matrix with every element being O(τ⁴).
All previously mentioned schemes are constructed by applying the Euler or Taylor-type approximation of Ẋ(t), and further of Ȧ(t). We observe that all these schemes are special cases of the general Multi-step method for solving ordinary differential equations. Owing to this observation, in the next section we define the discretization rule in its general form.

3 General discretization scheme, iterative corrections and higher-order approximations

In this section, we use the general linear Multi-step method in order to obtain the most general discretization rule. That rule covers all previously proposed schemes, while new ones arise by choosing particular Multi-step methods. In particular, we construct the iterative scheme based on the 4th-order Adams-Bashforth method. We also introduce a new approximation for Ȧ(t) based on the central difference rule, which performs better than the previously considered ones based on the backward difference rule.
3.1 The new general discrete scheme


Consider the matrix function F(A, D, X) defined by

F(A, D, X) = −XDX − γX H(AX − I),

where A, D ∈ R^{m×n}, X ∈ R^{n×m} and H(·) is an appropriate elementwise activation function acting on m × m matrices. Accordingly, the dynamic system (2.2) now becomes Ẋ(t) = F(A(t), Ȧ(t), X(t)). By applying the general form of the explicit Multi-step method and the general difference rule D(t, A) approximating Ȧ(t), we get the iterative method in its most general form:

X_{k+1} = −α_0 X_k − α_1 X_{k−1} − · · · − α_m X_{k−m}
        + τ (β_0 F(A_k, D_k, X_k) + β_1 F(A_{k−1}, D_{k−1}, X_{k−1}) + · · · + β_m F(A_{k−m}, D_{k−m}, X_{k−m})).   (3.1)

Here D_k = D(kτ, A) is the corresponding approximation of Ȧ_k at the time instant t = kτ, while the real parameters α_0, α_1, . . . , α_m, β_0, β_1, . . . , β_m are the coefficients of the method. If we denote F_i = F(A_i, D_i, X_i) and store F_k, F_{k−1}, . . . , F_{k−m}, the new scheme (3.1) requires only one evaluation of F(A, D, X) per iteration. It now becomes:

X_{k+1} = −α_0 X_k − α_1 X_{k−1} − · · · − α_m X_{k−m} + τ (β_0 F_k + β_1 F_{k−1} + · · · + β_m F_{k−m}),
F_{k+1} = F(A_{k+1}, D_{k+1}, X_{k+1}).   (3.2)

Therefore, (3.2) and (2.6) require the same number of matrix multiplications per iteration (MMI), i.e. the same time complexity in each iteration, independently of m. Moreover, by writing

F(A, D, X) = −X (DX + γ H(AX − I)),   (3.3)

we see that a single computation of F(A, D, X) requires 3 MMI in total. However, assuming the linear case h(x) = x, one multiplication can be further saved by writing

F(A, D, X) = γX − X (D + γA) X.   (3.4)
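The agreement between (3.3) and (3.4) in the linear case h(x) = x is easy to check numerically; the sketch below (illustrative function names, random test data) also makes the MMI count visible: three matrix products in (3.3), two in (3.4).

```python
import numpy as np

def F_nonlinear(A, D, X, gamma, h=np.tanh):
    """Formula (3.3): 3 matrix multiplications; H acts elementwise through h."""
    return -X @ (D @ X + gamma * h(A @ X - np.eye(A.shape[0])))

def F_linear(A, D, X, gamma):
    """Formula (3.4), valid for h(x) = x: only 2 matrix multiplications."""
    return gamma * X - X @ ((D + gamma * A) @ X)

rng = np.random.default_rng(0)
A, D, X = (rng.standard_normal((3, 3)) for _ in range(3))
gamma = 2.5
```

Passing the identity as h into (3.3) must reproduce (3.4) exactly, since −X(DX + γ(AX − I)) = γX − X(D + γA)X.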

The following are four particular schemes arising from (3.1):

1. Taking m = 0, α_0 = −1, β_0 = 1, we obtain the generalization of the scheme (2.6) based on the Euler forward difference rule, which originally appeared in [18]. Denote it by ZNN-E.

2. The choice m = 2, α_0 = −3/2, α_1 = 1, α_2 = −1/2, β_0 = 1, β_1 = β_2 = 0 gives the 3rd-order scheme from [17]. That scheme is restated here as (2.10). Denote it by ZNN-T.

3. The values m = 4, α_0 = −5/24, α_1 = −1/2, α_2 = −1/4, α_3 = −1/6, α_4 = 1/8, β_0 = 2, β_1 = β_2 = β_3 = β_4 = 0 correspond to the 4th-order scheme from [19], restated as (2.11). Denote it by ZNN-T1.

4. Taking m = 3, α_0 = −1, α_1 = α_2 = α_3 = 0, β_0 = 55/24, β_1 = −59/24, β_2 = 37/24, β_3 = −9/24, we obtain the scheme based on the 4th-order Adams-Bashforth method. Denote it by ZNN-AB.
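Since the four schemes differ only in their coefficient vectors, the general step (3.2) can be sketched with a single coefficient table (a sketch under our naming; the usage example takes a constant A, for which D = 0, with the linear-case F from (3.4)):

```python
import numpy as np

# Coefficient table (alpha_0..alpha_m ; beta_0..beta_m) of the four schemes in (3.1)
SCHEMES = {
    "ZNN-E":  ([-1.0],                         [1.0]),
    "ZNN-T":  ([-3/2, 1.0, -1/2],              [1.0, 0.0, 0.0]),
    "ZNN-T1": ([-5/24, -1/2, -1/4, -1/6, 1/8], [2.0, 0.0, 0.0, 0.0, 0.0]),
    "ZNN-AB": ([-1.0, 0.0, 0.0, 0.0],          [55/24, -59/24, 37/24, -9/24]),
}

# Consistency of a multi-step method requires sum_i(-alpha_i) = 1
for name, (alphas, _) in SCHEMES.items():
    assert abs(sum(-a for a in alphas) - 1.0) < 1e-12, name

def step(scheme, Xs, Fs, tau):
    """One step of (3.2); Xs and Fs hold X_k..X_{k-m} and F_k..F_{k-m}, newest first."""
    alphas, betas = SCHEMES[scheme]
    return (sum(-a * X for a, X in zip(alphas, Xs))
            + tau * sum(b * F for b, F in zip(betas, Fs)))

# Usage sketch: constant invertible A (so D = 0), linear activation, F from (3.4)
A = np.array([[3.0, 1.0], [0.0, 2.0]])
gamma, tau = 10.0, 0.01
X0 = np.linalg.inv(A) + 0.01              # slightly perturbed start, copied to history
Xs = [X0.copy() for _ in range(4)]
Fs = [gamma * X - gamma * X @ A @ X for X in Xs]
for _ in range(2000):
    Xn = step("ZNN-AB", Xs, Fs, tau)
    Xs = [Xn] + Xs[:-1]
    Fs = [gamma * Xn - gamma * Xn @ A @ Xn] + Fs[:-1]
```

For a constant matrix the fixed point of every scheme is X = A^{−1}, so the ZNN-AB history converges there from the perturbed start.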

The (local) convergence of particular methods obtained from (3.2) can be easily verified by the Dahlquist equivalence theorem [23], as was done in [17] and [13, 19] for ZNN-T and ZNN-T1, respectively. Convergence of ZNN-AB follows from the well-known fact that all Adams-Bashforth methods are convergent (see, for example, [23]). Therefore, the following theorem is valid.

Theorem 3.1. The ZNN-AB model is consistent and converges with the order of its truncation error equal to O(τ⁴) for all t_k = kτ ∈ [t_0, t_max], where O(τ⁴) denotes a matrix with every element being O(τ⁴).

Further, the order of any particular discretization method belonging to the class (3.1) depends on the choice of the parameters.
Remark 3.1. Let us summarize the advantages of the general discrete scheme (3.2) from the computational perspective.

1. The time complexity per iteration does not depend on the scheme order m. In this way, the newly introduced scheme ZNN-AB has the same time complexity as all previously introduced schemes. According to (3.3) and (3.4), it requires 2 or 3 matrix multiplications per iteration in the linear and non-linear case, respectively.

2. The usage of linear Multi-step methods is crucial in order to ensure that each time iteration requires exactly one evaluation of the function F(A, D, X). For example, the well-known 4th-order Runge-Kutta method would require 4 different evaluations of the function F(A, D, X) per iteration, which would increase the time complexity of a single iteration.

3. All previously considered methods, including ZNN-E, ZNN-T and ZNN-T1, are based on the application of certain numerical differentiation formulas to the time derivative Ẋ(t). In the general expression (3.1), this corresponds to the part X_{k+1} + α_0 X_k + α_1 X_{k−1} + · · · + α_m X_{k−m}, with β_i = 0 for i ≥ 1. Our approach is general and enables the construction of more efficient and robust methods. This can be seen by comparing the ZNN-AB and ZNN-T1 methods. Both methods have order O(τ⁴), but ZNN-AB is an (m + 1) = 4-step method, while ZNN-T1 is an (m + 1) = 5-step method. The Adams-Bashforth based methods achieve the maximum order for a fixed step number m + 1 [23].
3.2 Iterative corrections and the general algorithm
In contrast to the discretization of the continuous-time dynamic system (2.2), one can compute the approximate inverse X_{k+1} of the matrix A_{k+1} using any matrix-inversion iterative method, initialized by the previously computed inverse X_k. Using the general hyperpower method of order r, the corresponding iterative scheme becomes:

R_{k+1} = I − A_{k+1} X_k,
X_{k+1} = X_k (I + R_{k+1} + R_{k+1}² + · · · + R_{k+1}^{r−1}).   (3.5)

This scheme requires r MMI in total. Therefore, in order to get the same number of MMI as in the case of the schemes introduced in the previous section, one has to consider r = 2 and r = 3. The first case is the Newton/Schultz method, while the second case is the third-order hyperpower method. These schemes will be further denoted by HP2t and HP3t, respectively.
The iterative method (3.5) can also be used to improve the accuracy of any discretization scheme (3.2), i.e. of the particular schemes ZNN-AB, ZNN-T, ZNN-T1 and ZNN-E. In other words, an iterative method can be further applied to X_{k+1} computed by any discretization scheme (3.2). The corrective iterations have the general pattern

R^{(j−1)}_{k+1} = I − A_{k+1} X^{(j−1)}_{k+1},
X^{(j)}_{k+1} = X^{(j−1)}_{k+1} (I + R^{(j−1)}_{k+1} + [R^{(j−1)}_{k+1}]² + · · · + [R^{(j−1)}_{k+1}]^{r−1}),   j = 1, 2, . . . ,   (3.6)

where X^{(0)}_{k+1} is given by (3.2). These iterations should be performed until X_{k+1} ≈ A^{−1}_{k+1} is computed with the desired accuracy, i.e. until the norm of R^{(j)}_{k+1} falls below a certain threshold.

Algorithm 3.1 unifies all of the above considerations. Each inner iteration (over j) requires r matrix multiplications, while the outer iteration (over k) requires 2 or 3 additional matrix multiplications, depending on whether the linear or non-linear activation function is used. The method defined in Algorithm 3.1 should be initialized by setting X_i for i = −m, −m + 1, . . . , 0. We discuss some possible initializations in the next section.
3.3 Approximations of the derivative Ȧk and the initial values


We also need to consider the difference rules D_k used for approximating the time derivative Ȧ_k. In [18], the authors used the simplest backward difference rule in the discrete-time algorithm of the ZNN model with Ȧ(t) unknown:

Ȧ_k ≈ D_k^{b,1} = (A_k − A_{k−1}) / τ.   (3.7)

The rule (3.7) has the approximation order O(τ²). Furthermore, the following backward difference rule with the approximation order O(τ³) was used in [13]:

Ȧ_k ≈ D_k^{b,3} = (1/(6τ)) (11A_k − 18A_{k−1} + 9A_{k−2} − 2A_{k−3}).   (3.8)


Algorithm 3.1 General iterative scheme for computing the time-varying matrix inverse
Require:
  Time-varying matrix samples A_k = A(kτ) and derivative estimates D_k, k = −m, −m + 1, . . . , n.
  Initial inverse estimates X_0, X_{−1}, . . . , X_{−m}.
  Accuracy ε.
  The order r of the corrective iterative method.
  Maximal number of corrective iterations maxin.
 1: Compute F_k = F(A_k, D_k, X_k) = −X_k (D_k X_k + γH(A_k X_k − I)) for k = −m, −m + 1, . . . , 0.
 2: for k := 0 to n − 1 do
 3:   X^{(0)}_{k+1} := −α_0 X_k − α_1 X_{k−1} − · · · − α_m X_{k−m} + τ (β_0 F_k + β_1 F_{k−1} + · · · + β_m F_{k−m})
 4:   R^{(0)}_{k+1} := I − A_{k+1} X^{(0)}_{k+1}
 5:   j := 0
 6:   while ‖R^{(j)}_{k+1}‖_F > ε and j < maxin do
 7:     j := j + 1
 8:     X^{(j)}_{k+1} := X^{(j−1)}_{k+1} (I + R^{(j−1)}_{k+1} + [R^{(j−1)}_{k+1}]² + · · · + [R^{(j−1)}_{k+1}]^{r−1})
 9:     R^{(j)}_{k+1} := I − A_{k+1} X^{(j)}_{k+1}
10:   end while
11:   X_{k+1} := X^{(j)}_{k+1}, R_{k+1} := R^{(j)}_{k+1}
12:   F_{k+1} := F(A_{k+1}, D_{k+1}, X_{k+1}) = −X_{k+1} (D_{k+1} X_{k+1} + γH(−R_{k+1}))
13: end for
14: return X_0, X_1, . . . , X_n
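A compact sketch of Algorithm 3.1 follows, restricted to the simplest predictor ZNN-E (m = 0) with hyperpower corrective iterations of order r; all names are ours, the linear activation is assumed, and F is evaluated via formula (3.4).

```python
import numpy as np

def znn_e_corrected(A_fun, Adot_fun, tau, gamma, n, r=2, eps=1e-13, maxin=3):
    """Sketch of Algorithm 3.1 with the one-step predictor ZNN-E (m = 0) and
    hyperpower corrective iterations of order r (linear activation, F as in (3.4))."""
    I = np.eye(A_fun(0.0).shape[0])
    F = lambda A, D, X: gamma * X - X @ ((D + gamma * A) @ X)
    X = np.linalg.inv(A_fun(0.0))                      # InitInv initialization
    Fk = F(A_fun(0.0), Adot_fun(0.0), X)
    for k in range(n):
        t1 = (k + 1) * tau
        X = X + tau * Fk                               # step 3: multi-step predictor
        R = I - A_fun(t1) @ X                          # step 4: residual
        j = 0
        while np.linalg.norm(R) > eps and j < maxin:   # steps 6-10: hyperpower corrector
            S = I
            for _ in range(r - 1):                     # S = I + R + ... + R^(r-1)
                S = I + R @ S
            X = X @ S
            R = I - A_fun(t1) @ X
            j += 1
        Fk = F(A_fun(t1), Adot_fun(t1), X)             # step 12
    return X

# Illustrative time-varying test matrix (our choice) and a run over t in [0, 1]
A_fun = lambda t: np.array([[2.0 + np.sin(t), 0.1], [0.1, 2.0 + np.cos(t)]])
Adot_fun = lambda t: np.array([[np.cos(t), 0.0], [0.0, -np.sin(t)]])
X = znn_e_corrected(A_fun, Adot_fun, tau=1e-3, gamma=10.0, n=1000)
```

Because the corrector squares the residual at every inner pass, a handful of corrective iterations per time step drives the residual to roundoff level even with the low-order predictor.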

The central difference rules are more accurate, although they have the same approximation order as the corresponding backward rules. In our testing, we use the following central difference rule

Ȧ_k ≈ D_k^{c,4} = (1/(12τ)) (A_{k−2} − 8A_{k−1} + 8A_{k+1} − A_{k+2}),   (3.9)

whose approximation order equals O(τ⁴).
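The relative accuracy of the rules (b, 1), (b, 3) and (c, 4) can be checked on a smooth scalar sample, since the formulas act entrywise on matrix samples (a sketch; the standard 1/(6τ) scaling of (3.8) is assumed here).

```python
import numpy as np

# Accuracy of the rules (b,1), (b,3), (c,4) on a smooth scalar sample a(t) = sin(t);
# the same formulas are applied entrywise to matrix samples A_k.
tau, t = 0.01, 1.0
ak = lambda i: np.sin(t + i * tau)            # sample a_{k+i}

d_b1 = (ak(0) - ak(-1)) / tau                                   # rule (3.7)
d_b3 = (11*ak(0) - 18*ak(-1) + 9*ak(-2) - 2*ak(-3)) / (6*tau)   # rule (3.8)
d_c4 = (ak(-2) - 8*ak(-1) + 8*ak(1) - ak(2)) / (12*tau)         # rule (3.9)

errors = [abs(d - np.cos(t)) for d in (d_b1, d_b3, d_c4)]       # exact a'(t) = cos(t)
```

On this sample the error drops by several orders of magnitude from (b, 1) to (b, 3) to (c, 4), in line with the stated approximation orders.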
Remark 3.2. Note that the central difference rule (c, 4) uses the values A_{k+1} and A_{k+2}, which, in general, are not available at the time point t = kτ. But sometimes A(t) is completely known at the beginning of the computation. For example, the trajectory of a robotic arm can be predefined, or at least some future part of it can be. In those cases, it is better to use a central rule (for example the rule (c, 4)), since it is more accurate than a backward rule. Otherwise, if A(t) is not completely known in advance, one must use a backward rule (for example the rule (b, 3)). That is the reason why we consider and will test both versions of the new scheme ZNN-AB (i.e. ZNN-AB^{c,4} and ZNN-AB^{b,3}).

In all tests performed in the next section, we combine one of the four considered discretization schemes (ZNN-E, ZNN-T, ZNN-T1, ZNN-AB) defined above with a corresponding difference rule D_k. The method meth corresponding to one of the four considered discretization schemes, combined with a certain discretization rule D_k^{rule}, will be further denoted by meth^{rule} (e.g. ZNN-E^{b,1}, ZNN-T^{b,3}, ZNN-T1^{b,3}, ZNN-AB^{c,4}, . . .).

Finally, there are several possibilities for choosing the initial values X_i, i = −m, −m + 1, . . . , 0 for the general discretization scheme (3.2), as well as for choosing the initial guess X_0 for the iterative scheme (3.5):

1. Take the exact inverses, i.e. X_i = A_i^{−1} for i = −m, −m + 1, . . . , 0. These inverses can be computed by applying the hyperpower method (3.6) (by taking, for example, r = 2 or r = 3). Although this choice requires some precomputation, it will be shown that such methods reach their steady state in a much shorter time period. We denote this choice by InitInv.


2. The second choice is to use randomly generated matrices X_i, i = −m, −m + 1, . . . , 0. In [18], the authors used a combination of this and the first approach, given by X_i = A_i^{−1} + [rand(−e, e)]_{n×n}, where rand(−e, e) is a uniformly distributed random number on the segment [−e, e]. For comparison purposes, we also use this choice and denote it by InitInvRand.

4 Numerical examples

4.1 No corrective iterations test

In the first test, we compare the discretized schemes ZNN-E, ZNN-T, ZNN-T1 and ZNN-AB, together with the iterative schemes HP2t and HP3t, on several test matrices. The assumption is that no additional corrective iterations are performed on ZNN-E, ZNN-T, ZNN-T1 and ZNN-AB, i.e. that maxin = 0 holds in Algorithm 3.1. Additionally, we assume maxin = 1 for both HP2t and HP3t, i.e. that at each time point t = kτ we take one iteration of the hyperpower method to compute X_{k+1}, initialized by the previously computed value X_k. All these schemes can be summarized as follows:

ZNN-E :  X_{k+1} = X_k + τ F_k
ZNN-T :  X_{k+1} = (3/2) X_k − X_{k−1} + (1/2) X_{k−2} + τ F_k
ZNN-T1 : X_{k+1} = (5/24) X_k + (1/2) X_{k−1} + (1/4) X_{k−2} + (1/6) X_{k−3} − (1/8) X_{k−4} + 2τ F_k
ZNN-AB : X_{k+1} = X_k + τ ((55/24) F_k − (59/24) F_{k−1} + (37/24) F_{k−2} − (9/24) F_{k−3})
HP2t :   R_{k+1} = I − A_{k+1} X_k,   X_{k+1} = X_k (I + R_{k+1})
HP3t :   R_{k+1} = I − A_{k+1} X_k,   X_{k+1} = X_k (I + R_{k+1} + R_{k+1}²)

The first four schemes are followed by F_{k+1} = F(A_{k+1}, D_{k+1}, X_{k+1}).
In the first set of examples, the linear activation function h(x) = x is used. This means that F(A, D, X) is computed by (3.4) using only 2 MMI, which further means that ZNN-E, ZNN-T, ZNN-T1 and ZNN-AB require 2 MMI. Therefore, we compare these methods with HP2t, which also requires 2 MMI.
Example 4.1. Consider the following time-varying matrix

A_1(t) = [ cos(t)  −sin(t) ;  sin(t)  cos(t) ],

which is invertible, with inverse given by

A_1^{−1}(t) = [ cos(t)  sin(t) ;  −sin(t)  cos(t) ].

Figure 1 shows the values of the residual norm ‖I − A_k X_k‖_F along the iterations for τ = 0.01 and γ = 1. The left image is obtained by initializing the considered methods with the exact inverse (InitInv), while the right image is obtained by taking a random perturbation of the exact inverse (InitInvRand, e = 0.5), as used in [18]. It can be seen that the HP2t method almost instantly reaches the steady state, while the other methods reach it gradually. The time needed to reach a steady state is significantly larger in the second case, due to the random perturbation of the initial values.

Figure 1: Residual norm ‖I − A_k X_k‖_F along the iterations, for ZNN-AB^{c,4}, ZNN-AB^{b,3}, ZNN-T^{b,3}, ZNN-T1^{b,3}, ZNN-E^{b,1} and HP2t, for the matrix A_1(t). Left and right figures are obtained using InitInv and InitInvRand (e = 0.5) for the initial values, respectively.

Surprisingly, the lowest-order ZNN-Eb,1 method reaches the best steady-state residual norm among all methods. It is also surprising that ZNN-E combined with the higher-order D_k^{c,4} differentiation formula gives results many orders of magnitude worse than when it is combined with the order-one Euler D_k^{b,1} differentiation formula. The following lemma explains this issue.
Lemma 4.1. Assume that A_k = A(kτ), where A(t) is given as in Example 4.1. Moreover, let X_0 = A_0^{-1}. Then the sequence X_k defined by the ZNN-Eb,1 method satisfies X_k = A_k^{-1} for every k ≥ 0.

Proof. The proof goes by mathematical induction. For k = 0 it is already assumed that X_0 = A_0^{-1}, so the initial step holds. Assume that the statement is true for X_k, i.e. that

X_k = A_k^{-1} = [  cos(kτ)  sin(kτ) ]
                 [ −sin(kτ)  cos(kτ) ],

or equivalently A_kX_k − I = 0. Therefore

X_{k+1} = X_k − X_k(A_k − A_{k−1})X_k − τγH(A_kX_k − I)
        = X_k − X_k(A_k − A_{k−1})X_k
        = X_k − X_kA_kX_k + X_kA_{k−1}X_k
        = X_kA_{k−1}X_k.

Using simple trigonometric manipulations, we obtain

X_kA_{k−1} = [  cos(kτ)  sin(kτ) ] [ cos((k−1)τ)  −sin((k−1)τ) ] = [  cos(τ)  sin(τ) ]
             [ −sin(kτ)  cos(kτ) ] [ sin((k−1)τ)   cos((k−1)τ) ]   [ −sin(τ)  cos(τ) ]

and also

X_{k+1} = (X_kA_{k−1})X_k = [  cos(τ)  sin(τ) ] [  cos(kτ)  sin(kτ) ] = [  cos((k+1)τ)  sin((k+1)τ) ]
                            [ −sin(τ)  cos(τ) ] [ −sin(kτ)  cos(kτ) ]   [ −sin((k+1)τ)  cos((k+1)τ) ].

This confirms the inductive step and completes the proof by mathematical induction.
In other words, once the convergence is established, the ZNN-Eb,1 method provides the exact update for the inverse matrix A_{k+1}^{-1} in this particular case.
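The exact-update property established by Lemma 4.1 can be checked numerically. The following minimal NumPy experiment (with linear activation H(E) = E and arbitrary parameter choices) runs a thousand ZNN-Eb,1 steps and confirms that the residual stays at roundoff level:

```python
import numpy as np

def A(t):
    # the rotation matrix of Example 4.1
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

tau, gamma, n_steps = 0.01, 1.0, 1000
X = np.linalg.inv(A(0.0))                      # X_0 = A_0^{-1}
for k in range(n_steps):
    Ak, Akm1 = A(k * tau), A((k - 1) * tau)
    E = Ak @ X - np.eye(2)                     # zero in exact arithmetic by Lemma 4.1
    X = X - X @ (Ak - Akm1) @ X - tau * gamma * E   # ZNN-E(b,1) step
err = np.linalg.norm(X - np.linalg.inv(A(n_steps * tau)))
```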
Example 4.2. Consider now the slightly modified matrix A_1(t), denoted by A_{1m}(t) and given by

A_{1m}(t) = [  cos(t)  −sin(t) ]
            [ 2sin(t)   cos(t) ]

whose inverse is given by

A_{1m}^{-1}(t) = 1/(2sin²(t) + cos²(t)) · [  cos(t)   sin(t) ]
                                          [ −2sin(t)  cos(t) ].
Figure 2 shows the obtained residual norms for the matrix A_{1m}(t), with the same parameters as in the previous example. It is clear now that the highest-order ZNN-ABc,4 method outperforms all others, while HP2t again reaches the steady state almost instantly. The second smallest residual norm is achieved by ZNN-ABb,3.
Figure 2: Residual norm ‖I − A_kX_k‖_F along the iterations, for ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 and HP2t and the matrix A_{1m}(t). Left and right figures are obtained using InitInv and InitInvRand (e = 0.5) for the initial values, respectively.
Example 4.3. Consider the following matrix from [18]:

A_2(t) = [ sin(2t) + 3    (1/2)cos(2t)  cos(2t)      ]
         [ (1/2)cos(2t)   sin(2t) + 3   (1/2)cos(2t) ]
         [ cos(2t)        (1/2)cos(2t)  sin(2t) + 3  ].

Although it is not given in [18], the inverse matrix A_2^{-1}(t) can also be computed in closed form, and it is given by

A_2^{-1}(t) = (219sin(2t) − 5sin(6t) + 3cos(2t) − 54cos(4t) + cos(6t) + 234)^{-1} ×

[ −5cos(4t) + 48sin(2t) + 75           4cos(2t)(cos(2t) − sin(2t) − 3)                   −24cos(2t) + cos(4t) − 4sin(4t) + 1 ]
[ 4cos(2t)(cos(2t) − sin(2t) − 3)      −8(cos(2t) − sin(2t) − 3)(cos(2t) + sin(2t) + 3)  4cos(2t)(cos(2t) − sin(2t) − 3)     ]
[ −24cos(2t) + cos(4t) − 4sin(4t) + 1  4cos(2t)(cos(2t) − sin(2t) − 3)                   −5cos(4t) + 48sin(2t) + 75          ].
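The closed-form expression above can be verified numerically at a sample point (a small NumPy check; t = 0.7 is an arbitrary choice):

```python
import numpy as np

t = 0.7   # arbitrary sample point
s2, c2 = np.sin(2*t), np.cos(2*t)
s4, c4 = np.sin(4*t), np.cos(4*t)
A2 = np.array([[s2 + 3, c2/2, c2],
               [c2/2, s2 + 3, c2/2],
               [c2, c2/2, s2 + 3]])
den = 219*s2 - 5*np.sin(6*t) + 3*c2 - 54*c4 + np.cos(6*t) + 234
diag = -5*c4 + 48*s2 + 75
off = 4*c2*(c2 - s2 - 3)
corner = -24*c2 + c4 - 4*s4 + 1
mid = -8*(c2 - s2 - 3)*(c2 + s2 + 3)
A2inv = np.array([[diag, off, corner],
                  [off, mid, off],
                  [corner, off, diag]]) / den
err = np.linalg.norm(A2 @ A2inv - np.eye(3))
```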
Figure 3 shows the obtained residual norms for the matrix A_2(t). Note that, due to stability issues, the random perturbation range e is decreased to e = 0.1. Again, the same conclusions can be drawn as in the previous case. The only difference is that ZNN-T1b,3 is now a bit closer to ZNN-ABb,3 (at some moments even better) than in Example 4.2.
Figure 3: Residual norm ‖I − A_kX_k‖_F along the iterations, for ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 and HP2t and the matrix A_2(t). Left and right figures are obtained using InitInv and InitInvRand (e = 0.1) for the initial values, respectively.
Example 4.4. Let us observe the test matrix A_{Toep,n}(t) = [a_{|i−j|}]_{1≤i,j≤n}, where a_0 = n + sin t and
a_i = (cos t)/i for 1 ≤ i ≤ n − 1. A test matrix of this type was considered in [18]. Assume that n = 4 and that the remaining parameters are the same as in the previous example.
Figure 4 shows the obtained residual norms and confirms that the conclusions from the previous examples remain valid.

Figure 4: Residual norm ‖I − A_kX_k‖_F along the iterations, for ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 and HP2t and the matrix A_{Toep,4}(t). Left and right figures are obtained using InitInv and InitInvRand (e = 0.1) for the initial values, respectively.
Example 4.5. Consider the test matrix of the form A_{rand,n}(t) = [a_{ij} cos(t + φ_{ij})]_{1≤i,j≤n} + 2I, where I is the n × n identity matrix, while a_{ij} and φ_{ij} are uniformly distributed pseudorandom entries from [0, 1] and [0, 2π], respectively. The additional term 2I ensures the regularity of the matrix A(t) for every t ∈ [0, 2π].

Figure 5 shows the residual norms for one generated matrix A_{rand,10}(t). All methods behave similarly as in the previous examples. However, ZNN-ABb,3 is closer to ZNN-ABc,4 than to ZNN-T1b,3. Moreover, ZNN-Tb,3 and ZNN-Eb,1 are below HP2t.
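The random test matrix can be generated as follows (a sketch; the seed and the sampling of t are arbitrary choices):

```python
import numpy as np

def make_A_rand(n, rng):
    # entries a_ij from [0,1], phases phi_ij from [0, 2*pi]
    a = rng.uniform(0.0, 1.0, (n, n))
    phi = rng.uniform(0.0, 2*np.pi, (n, n))
    def A(t):
        return a * np.cos(t + phi) + 2*np.eye(n)
    return A

A = make_A_rand(10, np.random.default_rng(2017))
# sample a few time points and record the smallest |det| observed
min_det = min(abs(np.linalg.det(A(t))) for t in np.linspace(0.0, 2*np.pi, 25))
```

The determinant check at sampled time points gives an informal confirmation that the shift 2I keeps the generated instance invertible on [0, 2π].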
Figure 5: Residual norm ‖I − A_kX_k‖_F along the iterations, for ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 and HP2t and the randomly generated matrix A_{rand,10}(t) from Example 4.5. Left and right figures are obtained using InitInv and InitInvRand (e = 0.1) for the initial values, respectively.
Finally, we summarize all the above examples by showing the steady-state maximal errors for different methods, matrices and values of τ. The parameter γ remains the same (γ = 1) as in the previous examples. The minimal values in Table 1 are marked in bold.

Table 1. Steady-state maximal errors for different methods, matrices and values of τ for γ = 1.

Matrix        τ      ZNN-ABc,4    ZNN-ABb,3    ZNN-Tb,3    ZNN-T1b,3    ZNN-Eb,1     HP2t
A1m(t)        0.1    3.81·10^-5   2.43·10^-4   6.7·10^-3   8.79·10^-5   4.56·10^-15  1.41·10^-2
              0.01   3.82·10^-9   2.5·10^-7    6.67·10^-5  6.28·10^-8   2.26·10^-14  1.41·10^-4
              0.001  1.33·10^-12  2.5·10^-10   6.67·10^-7  6.38·10^-11  6.2·10^-13   1.41·10^-6
A2(t)         0.1    4.8·10^-3    5.61·10^-3   3.74·10^-2  7.18·10^-3   2.06·10^-1   2.1·10^-2
              0.01   4.68·10^-7   1.87·10^-6   3.4·10^-4   8.25·10^-6   2.19·10^-2   2.12·10^-4
              0.001  4.75·10^-11  1.77·10^-9   3.42·10^-6  8.4·10^-9    2.2·10^-3    2.13·10^-6
AToep,4(t)    0.1    8.79·10^-3   9.09·10^-3   9.47·10^-2  1.42·10^-2   3.78·10^-1   4.35·10^-2
              0.01   1.02·10^-6   1.39·10^-6   8.23·10^-4  1.59·10^-5   3.67·10^-2   3.89·10^-4
              0.001  1.03·10^-10  7.62·10^-10  8.22·10^-6  1.6·10^-8    3.65·10^-3   3.86·10^-6
Arand,10(t)   0.1    3.99·10^-4   5.96·10^-4   9.48·10^-3  1.05·10^-3   6.98·10^-2   6.21·10^-3
              0.01   4.29·10^-8   2.52·10^-7   9.28·10^-5  1.14·10^-6   6.77·10^-3   6.15·10^-5
              0.001  4.87·10^-12  2.34·10^-10  9.32·10^-7  1.15·10^-9   6.75·10^-4   6.19·10^-7

According to Lemma 4.1, the discretization ZNN-Eb,1 produces the best results for A_{1m}(t). It can also be seen that ZNN-ABc,4 possesses the smallest maximal steady-state residual for all test matrices except A_{1m}(t). A further general conclusion based on the data in Table 1 is that smaller values of the sampling time τ produce better results.
4.2 The influence of the parameter γ and the nonlinear activation

Consider again the matrix A_{Toep,4}(t) from Example 4.4. Figure 6 shows the influence of the gain parameter γ on the residual norm ‖I − A_kX_k‖, as well as the effect of using a nonlinear activation function h(x). Two nonlinear activation functions are considered in this example.

The first one is the power-sigmoid function, defined by

h_PS(x) = x^p,  if |x| ≥ 1,
h_PS(x) = ((1 + exp(−q))/(1 − exp(−q))) · ((1 − exp(−qx))/(1 + exp(−qx))),  otherwise,

with p ≥ 3, q ≥ 2. The second one is the smooth power-sigmoid function, defined by

h_SPS(x) = (1/2)x^p + ((1 + exp(−q))/(1 − exp(−q))) · ((1 − exp(−qx))/(1 + exp(−qx))),  p ≥ 3, q ≥ 2.
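Both activations act entry-wise on the error matrix. A direct NumPy transcription of the two formulas above (with p = 3, q = 2, the values used in the experiments) could read:

```python
import numpy as np

def sigmoid_part(x, q=2):
    # the bipolar-sigmoid factor shared by both activations
    return (1 + np.exp(-q)) / (1 - np.exp(-q)) * (1 - np.exp(-q*x)) / (1 + np.exp(-q*x))

def h_ps(x, p=3, q=2):
    # power-sigmoid activation
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) >= 1, x**p, sigmoid_part(x, q))

def h_sps(x, p=3, q=2):
    # smooth power-sigmoid activation, as transcribed from the formula above
    x = np.asarray(x, dtype=float)
    return 0.5 * x**p + sigmoid_part(x, q)
```

Both functions are odd and vanish at the origin, so applying them entry-wise to the error matrix preserves its zeros.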
The linear activation function and various values of the gain parameter γ were used in the left graph, while γ = 1 and different activations were used in generating the right graph. The parameters p = 3 and q = 2 are used in both cases.
Figure 6: Residual norm ‖I − A_kX_k‖_F along the iterations for the ZNN-ABc,4 method, the matrix A_{Toep,4}(t) and initial values InitInvRand (e = 0.1). Influence of the parameter γ (left) and of the nonlinear activation (right).
As one can see, increasing the gain parameter γ significantly accelerates the convergence at the beginning, while the steady-state residual norm is only slightly improved. The same conclusion is valid for the nonlinear activation functions.

Finally, we make one more comparison of all methods in the case when the nonlinear activation function h_SPS(x) is used. The rest of the parameters are the same as in Example 4.4. The numerical results are shown in Figure 7.
Figure 7: Residual norm ‖I − A_kX_k‖_F along the iterations, for the nonlinear activation function h_SPS(x), the methods ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 and HP3t and the matrix A_{Toep,4}(t). Left and right figures are obtained using InitInv and InitInvRand (e = 0.1) for the initial values, respectively.
Recall that, when a nonlinear activation function is used, each iteration of the methods ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3 and ZNN-Eb,1 requires a total of 3 MMI. Therefore, in order to have a fair comparison, we used the HP3t scheme instead of HP2t. It can be seen that the obtained results are very similar to those in the linear case, while HP3t performs slightly better than HP2t; it is now comparable with ZNN-T1b,3 and ZNN-ABb,3. On the other hand, ZNN-ABc,4 still has the smallest residual norm.

4.3 Fixed accuracy test

In the second test, we assume that maxin is large enough (we used maxin = 100) and the corrective iterations in Algorithm 3.1 are performed until the residual norm ‖I − A_kX_k‖_F becomes less than ε = 10^-5. Table 2 shows the total number of matrix multiplications required for computing all X_k with t = kτ ∈ [0, 2π]. Testing was done on the matrix A_{rand,10}(t), with the InitInv initial values choice. The parameter τ varied from 0.2 to 0.001, while γ = 1 is taken again. The corrective iterations are performed using the second order hyperpower method (r = 2).
Table 2. Total number of matrix multiplications required to compute all X_k for t = kτ ∈ [0, 2π], and for different values of τ.

τ       ZNN-ABc,4   ZNN-ABb,3   ZNN-Tb,3   ZNN-T1b,3   ZNN-Eb,1   HP2t
0.2     378         378         396        382         434        394
0.1     504         504         716        560         756        756
0.05    976         980         1008       1008        1426       1208
0.01    2514        2514        3902       2538        5028       5028
0.005   5028        5028        5606       5028        10056      10056
0.001   25134       25134       25134      25134       32532      25134
Clearly, the ZNN-ABc,4 discretization rule requires the minimal total number of matrix multiplications
for all time sampling values τ .
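The corrective step used above can be sketched as repeated second-order hyperpower (Newton-Schulz) refinements, applied until the residual drops below ε or maxin inner iterations are spent. Since Algorithm 3.1 itself is not reproduced in this excerpt, the following loop is only an assumed reading of it:

```python
import numpy as np

def corrective_iterations(A, X, eps=1e-5, maxin=100):
    # r = 2 hyperpower refinement: X <- X(I + R) with R = I - A X
    I = np.eye(A.shape[0])
    for _ in range(maxin):
        R = I - A @ X
        if np.linalg.norm(R) < eps:
            break
        X = X @ (I + R)
    return X

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # hypothetical test matrix
X0 = np.linalg.inv(A) + 0.05             # perturbed starting value
Xc = corrective_iterations(A, X0)
res = np.linalg.norm(np.eye(2) - A @ Xc)
```

Since ‖I − AX_0‖ < 1 for this starting value, the refinement converges quadratically and only a handful of inner iterations are needed.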


5 ZNN model for the Moore-Penrose inverse computation

In this section we consider the linear continuous-time ZNN model (2.3) in the singular case and show its applicability in computing the Moore-Penrose inverse. In other words, we consider the model of the same form (2.3), wherein A(t) can be rectangular and/or singular:

Ẋ(t) = −X(t)Ȧ(t)X(t) − γ(X(t)A(t)X(t) − X(t)),  A(t) ∈ R^{m×n}, X(t) ∈ R^{n×m}.  (5.1)
Some preliminary results are restated and introduced before the main results. The differentiability of the Moore-Penrose inverse A†(t) was considered by Golub and Pereyra in [24]. The following lemma is a direct corollary of [24, Theorem 4.3]:

Lemma 5.1. Assume that A(t) ∈ R^{m×n} has differentiable entries and constant rank for each t ∈ Ω, where Ω ⊂ R is an open set. If M(t) = A†(t), then

Ṁ = −MȦM + MM^T Ȧ^T (I − AM) + (I − MA) Ȧ^T M^T M  (5.2)

is valid for an arbitrary t ∈ Ω.
The time variable t is omitted in (5.2) in order to simplify the notation. We do the same in the rest of this section, wherever possible. Hjorungnes and Gesbert [25] extended Lemma 5.1 to the case of a complex matrix A and a complex variable t.

Assume that the matrix A(t) ∈ R^{m×n} has differentiable entries and that its range and null space R(A(t)) and N(A(t)) are constant for every t > 0 and equal to T0 and S0, respectively. In that case, both the transpose matrix A^T(t) and the Moore-Penrose inverse A†(t) also have constant range and null space. We denote these two spaces by T and S, respectively. Denote by P_V the orthogonal projector on the subspace V. Then AA† and A†A are the orthogonal projectors P_{T0} and P_T respectively, and therefore are constant matrices. All these assumptions and this notation are valid throughout the section.
Proposition 5.1. The range and the null space of the time derivative Ȧ satisfy R(Ȧ) ⊆ R(A) = T0 and N(Ȧ) ⊇ N(A) = S0. Similarly, R(Ȧ^T) ⊆ R(A^T) = T and N(Ȧ^T) ⊇ N(A^T) = S.

Proof. The first statement follows directly from

Ȧ(t) = lim_{Δt→0} (A(t + Δt) − A(t))/Δt,  R(A(t + Δt)) = R(A(t)) = T0,  N(A(t + Δt)) = N(A(t)) = S0.

The second statement follows analogously.
Lemma 5.2 follows from Lemma 5.1, under the assumption that the range and null space of A(t) are constant.

Lemma 5.2. Let A(t) ∈ R^{m×n} have differentiable entries, and let its range and null space R(A(t)) = T0 and N(A(t)) = S0 be constant for every t > 0. Denote by M(t) = A†(t) the Moore-Penrose inverse of the time-varying matrix A(t). Then M is also differentiable and Ṁ = −MȦM.

Proof. According to the assumptions, R(A^T(t)) = T and N(A†(t)) = S are also constant, and T and S0, as well as T0 and S, are orthogonal complements. Since I − AM is the orthogonal projector on S = N(A^T) = N(M) and N(Ȧ^T) ⊇ S (Proposition 5.1), it follows that Ȧ^T(I − AM) = 0. Hence, the second term in (5.2) vanishes.

In the same way, since I − MA is the orthogonal projector on S0 = N(A), R(Ȧ^T) ⊆ T, and T and S0 are orthogonal complements, it follows that (I − MA)Ȧ^T = 0. Hence, the third term in (5.2) also vanishes. The statement of the lemma follows directly from (5.2).
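For a rank-deficient A(t) with constant range and null space, the simplified derivative formula Ṁ = −MȦM of Lemma 5.2 can be confirmed against a central finite difference. The sketch below uses the hypothetical test input A(t) = (2 + sin t)B with a fixed rank-deficient B, which trivially keeps both subspaces constant:

```python
import numpy as np

B = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])          # fixed 2x3 matrix of rank 1

def A(t):
    # a positive scalar scaling keeps R(A(t)) and N(A(t)) constant in t
    return (2.0 + np.sin(t)) * B

t, h = 0.8, 1e-6
M = np.linalg.pinv(A(t))
Adot = np.cos(t) * B                     # exact dA/dt
Mdot_fd = (np.linalg.pinv(A(t + h)) - np.linalg.pinv(A(t - h))) / (2*h)
err = np.linalg.norm(Mdot_fd - (-M @ Adot @ M))
```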


Proposition 5.2. Consider the dynamical system Ẋ = G(X), where G is an appropriate function satisfying R(G(X)) ⊆ T, N(G(X)) ⊇ S, R(X(0)) = T and N(X(0)) = S. Then R(X) ⊆ T and N(X) ⊇ S.

Proof. According to the assumptions, we have ẊP_S = G(X)P_S = 0 and P_{S0}Ẋ = P_{S0}G(X) = 0. Therefore XP_S = X(0)P_S = 0 and P_{S0}X = P_{S0}X(0) = 0, implying N(X) ⊇ S and R(X) ⊆ T (T and S0 are orthogonal complements). This completes the proof.

Theorem 5.1. Consider the dynamical system (5.1), wherein A(t) ∈ R^{m×n} has differentiable entries and X(t) ∈ R^{n×m}. The continuous model (5.1) converges exponentially to the Moore-Penrose inverse A†(t), under the assumption that R(X(t)) = T ⊆ R^n and N(X(t)) = S ⊆ R^m are constant for every t ≥ 0.

Proof. Consider the Zhang function E(t) = A(t) − X†(t). In the rest of the proof we omit the time variable t. Using Lemma 5.2 and (5.1), one gets

Ė = Ȧ − d(X†)/dt = Ȧ + X†ẊX†
  = Ȧ + X†(−XȦX − γ(XAX − X))X†  (5.3)
  = Ȧ − X†XȦXX† − γ(X†XAXX† − X†).

According to the assumption, XX† and X†X are the orthogonal projectors on R(X) = T and R(X†) = T0 (T0 is the orthogonal complement of S). Since R(Ȧ) ⊆ R(A) = T0 (according to Proposition 5.1), it follows that

Ȧ = X†XȦ = ȦXX†,  A = X†XA = AXX†.

Now, (5.3) implies

Ė = Ȧ − Ȧ − γ(A − X†) = −γE.  (5.4)

This completes the proof of the theorem.

Assume that σ1(t) ≥ σ2(t) ≥ · · · ≥ σr(t) > 0 are the non-zero singular values of A(t). The following theorem proves the local convergence property of the model (5.1), under an additional assumption on the smallest non-zero singular value σr(t) of A(t).

Theorem 5.2. Assume that σr(t) ≥ α > 0 for every t ≥ 0. If R(X(0)) = T, N(X(0)) = S and ‖E(0)‖ = ‖A(0) − X†(0)‖ < α, then the model (5.1) converges exponentially.
Proof. According to Theorem 5.1, we only need to prove that R(X) = T and N(X) = S for every t ≥ 0. Proposition 5.2 implies that R(X) ⊆ T and N(X) ⊇ S. It remains to prove that rank(X(t)) = r for every t > 0.

Assume that the previous statement is not valid and let

t0 = inf{t > 0 | rank(X(t)) < r}.

Due to the continuity of X(t) (i.e. of its singular values), we also have rank(X(t0)) < r. Moreover, rank(X(t)) = r for every t ∈ [0, t0). Now, according to (5.4) (the proof of Theorem 5.1), we have

‖A(t0) − X†(t0)‖ = ‖E(t0)‖ = ‖E(0)‖e^{−γt0} ≤ ‖E(0)‖ < α.

On the other hand, since rank(A(t0)) = r and rank(X(t0)) < r, the well-known Eckart-Young-Mirsky theorem [26] implies

‖A(t0) − X†(t0)‖ ≥ σr(t0) ≥ α,

which is a contradiction. This completes the proof of the theorem.


Remark 5.1. The model of the form (5.1) was originally used in [20] for computing the time-varying matrix pseudoinverse A†(t). Three main advantages of our results concerning that model, with respect to the previous research on the continuous time-varying matrix pseudoinverse, can be emphasized as follows.

Firstly, all previous results on pseudoinverses were related to full-rank matrices. The results of Theorems 5.1 and 5.2 are the first analysis which overcomes this assumption.

Secondly, the Zhang function E(t) = A(t) − X†(t) and the dynamical system of the form (5.1) (called ZNN-5) were used in [20] for the online solution of the time-varying matrix pseudoinverse. However, the ZNN-5 model was defined using the approximation Ȧ†(t) ≈ −A†(t)Ȧ(t)A†(t) of Ȧ†, following the result from [27]. In Lemma 5.2 we define exact conditions which ensure this relation.

In addition, in Theorems 5.1 and 5.2 we prove the exponential convergence of (5.1).
The discretization of the model (5.1) goes in the same way as for the ordinary inverse. This means that all schemes introduced in Section 3, as well as Algorithm 3.1, are also applicable in this case, for a suitable choice of the initial values. For that purpose, we consider the following two choices:

1. InitInv, which initializes X_i = A_i†, i = −m, −m + 1, . . . , 0. This is the direct generalization of the corresponding initialization rule for the ordinary inverse.

2. InitTrans, which initializes X_i = 2A_i^T/‖A_i‖_F², i = −m, −m + 1, . . . , 0.

Both rules InitInv and InitTrans obviously satisfy the conditions N(X_i) = N(A_i^T) and R(X_i) = R(A_i^T) for every i = −m, −m + 1, . . . , 0. Moreover, despite the limitations given by Theorem 5.2, usually all discretizations converge when initialized by the InitTrans rule.
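Both initialization rules can be written down directly (a minimal sketch; the rank-deficient sample matrix is a hypothetical test input). Note that for InitTrans the product A_iX_i = 2A_iA_i^T/‖A_i‖_F² is symmetric, in line with the required range and null-space conditions:

```python
import numpy as np

def init_inv(A_samples):
    # InitInv: X_i = pseudoinverse of A_i
    return [np.linalg.pinv(Ai) for Ai in A_samples]

def init_trans(A_samples):
    # InitTrans: X_i = 2 A_i^T / ||A_i||_F^2
    return [2.0 * Ai.T / np.linalg.norm(Ai, 'fro')**2 for Ai in A_samples]

B = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])          # rank-deficient sample input
X_inv = init_inv([B])[0]
X_tr = init_trans([B])[0]
```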

6 Numerical example in the singular case

Consider the matrix

A_{1,MP}(t) = M0 + M1 cos(2t) + M2 sin(2t),

where

M0 = [  15   −24   −42   12   18 ]
     [ −45    33    27  −33   27 ]
     [ −45   −27    −9  −93   87 ]
     [  75  −141  −120  −18   54 ]
     [  30   −90   −93  −27   69 ],

M1 = (1/2) · [ −25   58   43   19  −27 ]
             [ −10   39   22   26  −24 ]
             [  10   35   26   46  −52 ]
             [  15   −4   15    7  −27 ]
             [ −15   67   57   37  −55 ],

M2 = [   5    −8  −14    4    6 ]
     [ −15    11    9  −11    9 ]
     [ −15    −9   −3  −31   29 ]
     [  25   −47  −40   −6   18 ]
     [  10   −30  −31   −9   23 ].

The matrix A_{1,MP}(t) satisfies

rank(A_{1,MP}(t)) = 3,
N(A_{1,MP}(t)) = span{ (64/45, 2/3, 5/9, 0, 1), (−62/45, −4/3, 5/9, 1, 0) },
N(A_{1,MP}^T(t)) = span{ (−19/14, 59/56, −47/56, 0, 1), (−13/14, 145/56, −69/56, 1, 0) }.
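The stated rank and null-space properties can be verified numerically at an arbitrary time point (a NumPy check; t = 0.3 is an arbitrary choice):

```python
import numpy as np

M0 = np.array([[15, -24, -42, 12, 18],
               [-45, 33, 27, -33, 27],
               [-45, -27, -9, -93, 87],
               [75, -141, -120, -18, 54],
               [30, -90, -93, -27, 69]], dtype=float)
M1 = np.array([[-25, 58, 43, 19, -27],
               [-10, 39, 22, 26, -24],
               [10, 35, 26, 46, -52],
               [15, -4, 15, 7, -27],
               [-15, 67, 57, 37, -55]], dtype=float) / 2
M2 = np.array([[5, -8, -14, 4, 6],
               [-15, 11, 9, -11, 9],
               [-15, -9, -3, -31, 29],
               [25, -47, -40, -6, 18],
               [10, -30, -31, -9, 23]], dtype=float)

def A1mp(t):
    return M0 + M1 * np.cos(2*t) + M2 * np.sin(2*t)

v = np.array([64/45, 2/3, 5/9, 0.0, 1.0])        # claimed null vector
w = np.array([-19/14, 59/56, -47/56, 0.0, 1.0])  # claimed left null vector
t = 0.3
nv = np.linalg.norm(A1mp(t) @ v)
nw = np.linalg.norm(w @ A1mp(t))
rank = np.linalg.matrix_rank(A1mp(t))
```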

Figure 8 shows the residual norms ‖A_kX_kA_k − A_k‖_F generated by applying the methods ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1 to the matrix A_{1,MP}(t). Left and right figures are obtained using the rules InitInv and InitTrans, respectively, for generating the initial values.

Figure 8: Residual norm ‖A_kX_kA_k − A_k‖_F along the iterations for the matrix A_{1,MP}(t) and the methods ZNN-ABc,4, ZNN-ABb,3, ZNN-Tb,3, ZNN-T1b,3, ZNN-Eb,1, initialized by InitInv (left) and InitTrans (right).
Figure 9: Residual norm ‖A_kX_kA_k − A_k‖_F along the iterations for the matrix A_{1,MP}(t) and the method HP2t, initialized by InitTrans.
Figure 9 shows the residual norms for the matrix A_{1,MP}(t) generated by applying the method HP2t with the InitTrans initialization. It is evident that the HP2t scheme exhibits highly frequent peaks in the residual norm. At these time points, the obtained residual norm ‖A_kX_kA_k − A_k‖_F was higher than the one produced by the InitTrans initialization rule at the same point. The scheme was then reinitialized, i.e. we set X_k := 2A_k^T/‖A_k‖_F² and continued. Figure 9 thus shows the need for frequent reinitialization. The reason for such behaviour comes from the same type of instability of hyperpower methods (for constant matrices) as discussed in [28, 29].
7 Conclusion

A unified approach to all known Euler-type and Taylor-type discretizations of the ZNN models for time-varying inverse computation is proposed and investigated. In addition, a general discretization scheme which comprises all previously proposed DZNN models is derived by applying the general linear multi-step method. An iterative scheme arising from the 4th order Adams-Bashforth method is proposed and considered in detail. A comparison of numerical results derived by various discretization schemes for computing the matrix inverse is presented. The continuous-time ZNN model considered in the discretization is extended to the pseudoinverse computation of an arbitrary rectangular and/or singular real matrix. Exponential convergence of the model is verified under the assumption that the range and null space of the input time-varying matrix are time-invariant. In this way, the results derived in [20] for the ZNN-5 model are improved and extended. Numerical properties of the DZNN model in the case of pseudoinverse computation are also considered.


References

[1] J.-J. Climent, N. Thome, Y. Wei, A geometrical approach on generalized inverses by Neumann-type series, Linear Algebra Appl. 332-334 (2001), 533–540.
[2] W. Li, Z. Li, A family of iterative methods for computing the approximate inverse of a square matrix and inner inverse of a non-square matrix, Appl. Math. Comput. 215 (2010), 3433–3442.
[3] X. Liu, H. Jin, Y. Yu, Higher-order convergent iterative method for computing the generalized inverse and its application to Toeplitz matrices, Linear Algebra Appl. 439 (2013), 1635–1650.
[4] P.S. Stanimirović, F. Soleymani, A class of numerical algorithms for computing outer inverses, J. Comput. Appl. Math. 263 (2014), 236–245.
[5] L. Weiguo, L. Juan, Q. Tiantian, A family of iterative methods for computing Moore-Penrose inverse of a matrix, Linear Algebra Appl. 438 (2013), 47–56.
[6] Y. Zhang, S.S. Ge, Design and analysis of a general recurrent neural network model for time-varying matrix inversion, IEEE Transactions on Neural Networks 16(6) (2005), 1477–1490.
[7] B. Liao, Y. Zhang, Different complex ZFs leading to different complex ZNN models for time-varying complex generalized inverse matrices, IEEE Trans. Neural Netw. Learn. Syst. 25 (2014), 1621–1631.
[8] Y. Zhang, Y. Yang, N. Tan, B. Cai, Zhang neural network solving for time-varying full-rank matrix Moore-Penrose inverse, Computing 92 (2011), 97–121.
[9] L. Jin, S. Li, B. Hu, RNN models for dynamic matrix inversion: a control-theoretical perspective, IEEE Transactions on Industrial Informatics, DOI: 10.1109/TII.2017.2717079.
[10] L. Jin, S. Li, B. Liao, Z. Zhang, Zeroing neural networks: A survey, Neurocomputing 267 (2017), 597–604.
[11] Y. Zhang, W. Ma, B. Cai, From Zhang neural network to Newton iteration for matrix inversion, IEEE Transactions on Circuits and Systems-I: Regular Papers 56(7) (2009), 1405–1415.
[12] I. Stojanović, P.S. Stanimirović, I.S. Živković, D. Gerontitis, X.-Z. Wang, ZNN models for computing matrix inverse based on hyperpower iterative methods, Filomat 31 (2017), 2999–3014.
[13] L. Jin, Y. Zhang, Discrete-time Zhang neural network of O(τ³) pattern for time-varying matrix pseudoinversion with application to manipulator motion generation, Neurocomputing 142 (2014), 165–173.
[14] M. Mao, J. Li, L. Jin, Y. Zhang, Enhanced discrete-time Zhang neural network for time-variant matrix inversion in the presence of bias noises, Neurocomputing 207 (2016), 220–230.
[15] Y. Zhang, L. Jin, D. Guo, S. Fu, L. Xiao, On the variable step-size of discrete-time Zhang neural network and Newton iteration for constant matrix inversion, in: Intelligent Information Technology Application, 2008. IITA'08. Second International Symposium on, volume 1, IEEE, 2008, pp. 34–38.
[16] Y. Zhang, B. Cai, M. Liang, W. Ma, Three nonlinearly-activated discrete-time ZNN models for time-varying matrix inversion, 2012 8th International Conference on Natural Computation (ICNC 2012), 163–167.
[17] Y. Zhang, D. Guo, Y. Yin, Y. Chou, Taylor-type 1-step-ahead numerical differentiation rule for first-order derivative approximation and ZNN discretization, J. Comput. Appl. Math. 273 (2015), 29–40.
[18] D. Guo, Y. Zhang, Zhang neural network, Getz-Marsden dynamic system, and discrete-time algorithms for time-varying matrix inversion with application to robots' kinematic control, Neurocomputing 97 (2012), 22–32.
[19] D. Guo, Z. Nie, L. Yan, Novel discrete-time Zhang neural network for time-varying matrix inversion, IEEE Transactions on Systems, Man, and Cybernetics: Systems (2017), DOI: 10.1109/TSMC.2017.2656941.
[20] B. Liao, Y. Zhang, From different ZFs to different ZNN models accelerated via Li activation functions to finite-time convergence for time-varying matrix pseudoinversion, Neurocomputing 133 (2014), 512–522.
[21] D.S. Djordjević, P.S. Stanimirović, Y. Wei, The representation and approximation of outer generalized inverses, Acta Math. Hungar. 104 (2004), 1–26.
[22] Y. Wei, H. Wu, The representation and approximation for the generalized inverse A_{T,S}^{(2)}, Appl. Math. Comput. 135 (2003), 263–276.
[23] J.C. Butcher, Numerical Methods for Ordinary Differential Equations, John Wiley, 2003.
[24] G.H. Golub, V. Pereyra, The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate, SIAM Journal on Numerical Analysis 10(2) (1973), 413–432.
[25] A. Hjorungnes, D. Gesbert, Complex-valued matrix differentiation: techniques and key results, IEEE Transactions on Signal Processing 55(6) (2007), 2740–2746.
[26] C. Eckart, G. Young, The approximation of one matrix by another of lower rank, Psychometrika 1 (1936), 211–218.
[27] Y. Zhang, Y. Wang, L. Jin, B. Mu, H. Zheng, Different ZFs leading to various ZNN models illustrated via online solution of time-varying underdetermined systems of linear equations with robotic application, Lecture Notes in Computer Science 7952 (2013), 481–488.
[28] M.D. Petković, P.S. Stanimirović, Iterative method for computing Moore-Penrose inverse based on Penrose equations, J. Comput. Appl. Math. 235 (2011), 1604–1613.
[29] M.D. Petković, P.S. Stanimirović, Two improvements of the iterative method for computing Moore-Penrose inverse based on Penrose equations, J. Comput. Appl. Math. 267 (2014), 61–71.
Marko D. Petković received his M.Sc. degrees in Mathematics and Computer Science, and in Telecommunications, in 2006 and 2007 respectively, from the University of Niš. He received a PhD degree in Computer Science in 2008 from the University of Niš. Currently he works as a Full Professor at the Faculty of Sciences and Mathematics, University of Niš, Serbia. His research interests include numerical linear algebra, special functions, source and channel coding, and optimization methods. He is the author of 90 papers (64 of them in peer-reviewed international journals) and a member of the editorial boards of 4 journals.
Predrag S. Stanimirović received his M.Sc. in 1990 and his Ph.D. in 1996 from the University of Niš. Since 1996 he has worked as an assistant professor, since 1999 as an associate professor, and since 2003 as a full professor. He has published more than 200 papers in scientific journals. He is a reviewer and editorial board member of many scientific journals. His research interests include the following areas: numerical linear algebra, multilinear algebra, symbolic computation, and linear and nonlinear programming.
Vasilios N. Katsikis received his Diploma in Mathematics from the National and Kapodistrian University of Athens, his M.Sc. in Applied Mathematics and his Ph.D. in Mathematics from the National Technical University of Athens. He belongs to the teaching and research staff of the Department of Economics, Division of Mathematics and Informatics, National and Kapodistrian University of Athens, as an assistant professor of Mathematics and Informatics. His research interests lie in the areas of Computational Mathematics, Matrix Analysis and Linear Algebra. He has published more than 40 journal papers and serves as a reviewer for many journals and congresses.