

THE METHOD OF STOCHASTIC APPROXIMATION FOR THE DETERMINATION OF THE LEAST EIGENVALUE OF A SYMMETRICAL MATRIX*

T. P. KRASULINA

Leningrad

(Received 8 January 1968)

*Zh. vychisl. Mat. mat. Fiz., 9, 6, 1383–1387, 1969.

LET us consider the problem of finding the least eigenvalue and an eigenvector of the matrix $A$ with elements $a_{ij}$, $1 \le i \le p$, $1 \le j \le p$, which is the mathematical expectation of the random matrices $A_n$: $A = \mathrm{M.E.}\,A_n$, $n = 1, 2, \ldots$. We shall assume that the $A_n$ are independent, identically distributed random matrices and that $s$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_s$ of the matrix $A$ exist with multiplicities $k_1, k_2, \ldots, k_s$:

$$\lambda_1 = \min_X \frac{(AX, X)}{(X, X)}, \qquad \lambda_s = \max_X \frac{(AX, X)}{(X, X)}.$$

We shall use $u_{ij}$, $1 \le j \le k_i$, to denote the eigenvectors corresponding to the eigenvalue $\lambda_i$ and forming a basis of the space of $p$-dimensional vectors; $P_i$ is the subspace spanned by $u_{ij}$, $1 \le j \le k_i$.

We shall use the method of stochastic approximation introduced by Kiefer and Wolfowitz [1]. This method is intended for finding the minimum point of a regression function $M(x)$ from observations of the random quantity $Y(x)$, $\mathrm{M.E.}\,Y(x) = M(x)$.

There is usually genuine difficulty in proving the convergence of the method to the minimum point of the regression function when the extremum is not unique. We shall prove below that the process of stochastic approximation for finding the least eigenvalue and eigenvector of the mathematical expectation of random matrices converges to the minimum point of the functional $(AX, X)/(X, X)$.

In fact the points at which the gradient of the functional $(AX, X)/(X, X)$ vanishes are minima, maxima or points of inflection of this functional. The eigenvectors $u_{1j}$, $1 \le j \le k_1$, and $u_{sj}$, $1 \le j \le k_s$, give a minimum or a maximum of the functional $(AX, X)/(X, X)$, and the eigenvectors $u_{ij}$ ($i \ne 1, s$; $1 \le j \le k_i$) give points of inflection. Using the symmetry of the matrix it can be shown that the process converges to the least eigenvalue and a corresponding eigenvector of the matrix.
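This classification can be verified by a standard computation of the gradient (written out here for the reader's convenience):

$$\nabla\, \frac{(AX, X)}{(X, X)} = \frac{2}{(X, X)} \left[ AX - \frac{(AX, X)}{(X, X)}\, X \right],$$

so the gradient vanishes exactly when $AX = \mu(X) X$ with $\mu(X) = (AX, X)/(X, X)$, i.e. precisely at the eigenvectors of $A$; the bracketed expression is also the direction along which the process (1) below moves.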

Let us construct a sequence of random vectors: $X_1$ is some initial random vector,

$$X_{n+1} = X_n - \gamma_n \left[ A_n X_n - \frac{(A_n X_n, X_n)}{(X_n, X_n)}\, X_n \right], \tag{1}$$

where $\gamma_n > 0$ is a sequence of constants. This algorithm is an extension of the well-known gradient method for finding the least eigenvalue and corresponding eigenvector of a symmetrical matrix [2]. The following theorem gives sufficient conditions for the convergence of the given process.
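For computation, scheme (1) translates into a few lines of code. The following Python fragment is an illustrative sketch, not part of the original paper: the name `krasulina`, the callable `sample_matrix` standing for a source of independent observations $A_n$ with $\mathrm{M.E.}\,A_n = A$, and the choice $\gamma_n = 1/n$ are all assumptions made here.

```python
import numpy as np

def krasulina(sample_matrix, x0, n_iter=10_000):
    """Iterate scheme (1): X_{n+1} = X_n - gamma_n (A_n X_n - mu_n X_n),
    where mu_n = (A_n X_n, X_n) / (X_n, X_n)."""
    x = np.asarray(x0, dtype=float)
    for n in range(1, n_iter + 1):
        gamma = 1.0 / n            # one admissible choice of step sizes
        A_n = sample_matrix()      # fresh independent observation, E A_n = A
        Ax = A_n @ x
        mu = (Ax @ x) / (x @ x)    # observed Rayleigh quotient
        x = x - gamma * (Ax - mu * x)
    return x
```

Note that the correction $A_n X_n - \mu_n X_n$ is orthogonal to $X_n$, which prevents the iterates from collapsing to zero; this orthogonality is exactly the identity (2) used in Lemma 1 below.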

Theorem

Suppose that the following conditions are satisfied:

(1) $\mathrm{M.E.}\,\|A_n\|^2 < \infty$, where $\|A_n\|$ is the spherical norm of the matrix $A_n$;

(2) $\mathrm{M.E.}\,\|X_{1,1}\| > 0$, where $X_{n,1}$ is the projection of the vector $X_n$ on the subspace $P_1$;

(3) $\sum_{n=1}^{\infty} \gamma_n = \infty$, $\sum_{n=1}^{\infty} \gamma_n^2 < \infty$.

Then $X_n \underset{n\to\infty}{\longrightarrow} U_1(\omega)$ with probability 1, where $U_1(\omega)$ is a random eigenvector corresponding to the eigenvalue $\lambda_1$, and

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \frac{(A_k X_k, X_k)}{(X_k, X_k)} = \lambda_1$$

with probability 1.
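Condition (3) is the usual step-size requirement of stochastic approximation. As a concrete example (ours, not the paper's), $\gamma_n = c/n$ with any constant $c > 0$ satisfies it:

$$\sum_{n=1}^{\infty} \frac{c}{n} = \infty, \qquad \sum_{n=1}^{\infty} \frac{c^2}{n^2} = \frac{c^2 \pi^2}{6} < \infty.$$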

The proof of this theorem will be preceded by a series of lemmas.

Lemma 1

Under the conditions of the theorem the sequence $\|X_n\|$ converges with probability 1 as $n \to \infty$, and $\lim_{n\to\infty} \mathrm{M.E.}\,\|X_n\|^2 < \infty$.
Proof. On the basis of (1) we have

$$\|X_{n+1}\|^2 = \|X_n\|^2 + \gamma_n^2 (\xi_n, \xi_n), \qquad \xi_n = A_n X_n - \frac{(A_n X_n, X_n)}{(X_n, X_n)}\, X_n, \tag{2}$$

since $(\xi_n, X_n) = 0$. Thus

$$\prod_{i=n}^{\infty} \bigl(1 + \gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2\bigr)\, \|X_n\|^2$$

is a semimartingale and, on the basis of the theorem for the convergence of semimartingales [3], $\|X_n\|$ converges with probability 1 and $\lim_{n\to\infty} \mathrm{M.E.}\,\|X_n\|^2 < \infty$.
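The semimartingale property behind this step can be spelled out (our expansion of the argument): since $A_n$ is independent of $X_n$ and $(\xi_n, \xi_n) \le \|A_n\|^2 \|X_n\|^2$,

$$\mathrm{M.E.}\bigl[\|X_{n+1}\|^2 \mid X_n\bigr] \le \bigl(1 + \gamma_n^2\, \mathrm{M.E.}\,\|A_n\|^2\bigr)\, \|X_n\|^2,$$

so multiplying $\|X_n\|^2$ by the convergent tail product $\prod_{i=n}^{\infty}(1 + \gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2)$ yields a nonnegative sequence with non-increasing conditional expectations, to which the convergence theorem of [3] applies.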

Lemma 2

The sequence $\mu(X_n) = (AX_n, X_n)/(X_n, X_n)$ converges with probability 1 to some limiting value $\mu(\omega)$ as $n \to \infty$.

Proof. From (1) we have

$$\mu(X_{n+1}) = \frac{1}{1 + \gamma_n^2 (\xi_n, \xi_n)/(X_n, X_n)} \left[ \mu(X_n) - 2\gamma_n\, \frac{(\xi_n, A X_n)}{(X_n, X_n)} + \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \right].$$

We then obtain

$$\mu(X_{n+1}) = \frac{1}{1 + \gamma_n^2 (\xi_n, \xi_n)/(X_n, X_n)} \left[ \mu(X_n) - 2\gamma_n f(X_n) - 2\gamma_n Z_n + \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \right],$$

where

$$f(X_n) = \frac{(AX_n, AX_n)}{(X_n, X_n)} - \frac{(AX_n, X_n)^2}{(X_n, X_n)^2},$$

$$Z_n = \frac{((A_n - A)X_n, AX_n)}{(X_n, X_n)} - \frac{((A_n - A)X_n, X_n)}{(X_n, X_n)} \cdot \frac{(AX_n, X_n)}{(X_n, X_n)}.$$
From the Cauchy inequality $f(X_n) \ge 0$. By virtue of conditions 1) and 3) of the theorem the series

$$\sum_{n=1}^{\infty} \gamma_n Z_n, \qquad \sum_{n=1}^{\infty} \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \tag{3}$$

converge with probability 1.

Let us assume that a set of sampling sequences $\{X_n\}$ of positive probability exists on which the series (3) converge and

$$\liminf_{n} \mu(X_n) < \limsup_{n} \mu(X_n).$$

Then we can choose numbers $a$ and $b$ such that

$$\liminf_{n} \mu(X_n) < a < b < \limsup_{n} \mu(X_n).$$

We find $m_1$ and $n_1$ such that $m_1 > n_1 > N$, $\mu(X_{n_1}) < a$, $\mu(X_{m_1}) > b$, and $a \le \mu(X_j) \le b$ for $n_1 < j < m_1$.

Consequently

$$\mu(X_{m_1}) - \mu(X_{n_1}) \prod_{i=n_1}^{m_1-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} > b - a.$$

On the other hand we have

$$\mu(X_m) - \mu(X_n) \prod_{i=n}^{m-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} = \sum_{j=n}^{m-1} \left[ \gamma_j^2\, \frac{(A \xi_j, \xi_j)}{(X_j, X_j)} - 2\gamma_j f(X_j) - 2\gamma_j Z_j \right] \prod_{i=j}^{m-1} \bigl[ 1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i) \bigr]^{-1}.$$

It follows from (3) that

$$\mu(X_{m_1}) - \mu(X_{n_1}) \prod_{i=n_1}^{m_1-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} < \frac{b - a}{2}.$$

The contradiction so obtained proves the truth of the lemma.

Lemma 3

Under the conditions of the theorem $a_{1i}^{(n)} = (X_n, u_{1i})$, where $1 \le i \le k_1$, converge with probability 1 to $a_{1i}(\omega)$.

The proof is similar to that of Lemma 2.

Lemma 4

$\mu(X_n) \underset{n\to\infty}{\longrightarrow} \lambda_1$ with probability 1.

Proof. By virtue of condition 2 of the theorem there exists an $i_0$ such that $\mathrm{M.E.}\,\bigl|a_{1i_0}^{(1)}\bigr| > 0$; consequently we have

$$a_{1i_0}^{(n+1)} = a_{1i_0}^{(n)} + \gamma_n \bigl( \mu(X_n) - \lambda_1 \bigr) a_{1i_0}^{(n)} + \gamma_n Z_n',$$

$$Z_n' = \bigl( (A - A_n) X_n, u_{1i_0} \bigr) + \frac{((A_n - A) X_n, X_n)}{(X_n, X_n)}\, (X_n, u_{1i_0}).$$

Hence it follows that, since $\mathrm{M.E.}\,[Z_n' \mid X_n] = 0$,

$$\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n+1)}\bigr| \ge \mathrm{M.E.}\,\bigl[\bigl(1 + \gamma_n(\mu(X_n) - \lambda_1)\bigr)\bigl|a_{1i_0}^{(n)}\bigr|\bigr].$$

By virtue of (2) we obtain

$$\|X_{n+1}\|^2 \le \prod_{i=1}^{n} \bigl(1 + 2\gamma_i^2 \|A_i\|^2\bigr)\, \|X_1\|^2 = Y^2(\omega), \tag{4}$$

$$\mathrm{M.E.}\,Y^2(\omega) = \prod_{i=1}^{\infty} \bigl(1 + 2\gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2\bigr)\, \mathrm{M.E.}\,\|X_1\|^2 < \infty.$$

Then

$$\frac{\mathrm{M.E.}\,\mu(X_n)\,\bigl|a_{1i_0}^{(n)}\bigr|}{\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n)}\bigr|} \underset{n\to\infty}{\longrightarrow} \frac{\mathrm{M.E.}\,\mu(\omega)\,|a_{1i_0}(\omega)|}{\mathrm{M.E.}\,|a_{1i_0}(\omega)|}.$$

If we assume that $\mathrm{M.E.}\,\mu(\omega)|a_{1i_0}(\omega)| \,/\, \mathrm{M.E.}\,|a_{1i_0}(\omega)| > \lambda_1$, then $\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n)}\bigr| \underset{n\to\infty}{\longrightarrow} \infty$, which contradicts (4). Since $\mu(\omega) \ge \lambda_1$, it follows that $P(\mu(\omega) = \lambda_1) = 1$.

Proof of the theorem. We write $X_n$ in the basis $u_{ij}$, $1 \le i \le s$, $1 \le j \le k_i$:

$$X_n = \sum_{i=1}^{s} \sum_{j=1}^{k_i} a_{ij}^{(n)} u_{ij}.$$

From Lemma 4, in the same way as for Lemma 3, we can prove that $a_{ij}^{(n)} \underset{n\to\infty}{\longrightarrow} a_{ij}(\omega)$ with probability 1. It follows from Lemma 2 that the series

$$\sum_{n=1}^{\infty} \gamma_n f(X_n)$$

converges with probability 1 and by virtue of condition 3 of the theorem

$$\limsup_{n} f(X_n) = 0.$$

We also find that $f(X_n)$ converges to 0 with probability 1 as $n \to \infty$. Since

$$f(X_n) = \frac{\sum_{i=1}^{s} \sum_{j=1}^{k_i} \lambda_i^2 \bigl(a_{ij}^{(n)}\bigr)^2}{\sum_{i=1}^{s} \sum_{j=1}^{k_i} \bigl(a_{ij}^{(n)}\bigr)^2} - \mu^2(X_n),$$

on the basis of Lemmas 3 and 4 we find that $a_{ij}(\omega) = 0$ if $i \ne 1$. Consequently


$X_n(\omega) \underset{n\to\infty}{\longrightarrow} U_1(\omega)$, where $U_1(\omega)$ is a random eigenvector corresponding to the eigenvalue $\lambda_1$. Since from condition 1) of the theorem

$$\frac{1}{n} \sum_{k=1}^{n} \frac{((A_k - A) X_k, X_k)}{(X_k, X_k)} \underset{n\to\infty}{\longrightarrow} 0$$

with probability 1 (see [4], Section 29.1.D), by virtue of Lemma 4 we obtain the second statement of the theorem.
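Both statements of the theorem are easy to check numerically. The experiment below is an illustrative sketch, not from the paper: we take $A = \operatorname{diag}(1, 2, 5)$ and form $A_n$ by adding symmetrized zero-mean Gaussian noise, so that $\mathrm{M.E.}\,A_n = A$ and $\lambda_1 = 1$; the sample size and noise scale are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 5.0])            # M.E. A_n = A; least eigenvalue 1
x = rng.normal(size=3)                  # X_1: random initial vector
cesaro = 0.0
N = 50_000
for n in range(1, N + 1):
    noise = rng.normal(scale=0.1, size=(3, 3))
    A_n = A + (noise + noise.T) / 2     # symmetric observation with mean A
    Ax = A_n @ x
    mu_n = (Ax @ x) / (x @ x)           # (A_n X_n, X_n)/(X_n, X_n)
    cesaro += mu_n                      # accumulate for the Cesaro average
    x = x - (1.0 / n) * (Ax - mu_n * x) # scheme (1) with gamma_n = 1/n
print((x @ A @ x) / (x @ x))            # close to 1: first statement
print(cesaro / N)                       # close to 1: second statement
```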

The author wishes to express his thanks to S. M. Ermakov for his interest
in this paper.

Translated by H. F. Cleaves

REFERENCES

1. KIEFER, J. and WOLFOWITZ, J. Stochastic estimation of the maximum of a regression function. Ann. Math. Statistics, 23, 3, 462–466, 1952.

2. FADDEEV, D. K. and FADDEEVA, V. N. Computational Methods of Linear Algebra (Vychislitel'nye metody lineinoi algebry). Moscow–Leningrad, Fizmatgiz, 1963. English translation, Freeman, San Francisco, 1964.

3. DOOB, J. L. Stochastic Processes. New York, 1953. Russian translation (Veroyatnostnye protsessy), Foreign Literature Publishing House, Moscow, 1956.

4. LOÈVE, M. Probability Theory. Van Nostrand, 1960. Russian translation (Teoriya veroyatnostei), Foreign Literature Publishing House, Moscow, 1962.
