

THE METHOD OF STOCHASTIC APPROXIMATION FOR THE DETERMINATION OF THE LEAST EIGENVALUE OF A SYMMETRICAL MATRIX*

T. P. KRASULINA

Leningrad

(Received 8 January 1968)

*Zh. vychisl. Mat. mat. Fiz., 9, 6, 1383–1387, 1969.

LET us consider the problem of finding the least eigenvalue and an eigenvector of the matrix $A$ with elements $a_{ij}$, $1 \le i \le p$, $1 \le j \le p$, which is the mathematical expectation of the random matrices $A_n$: $A = \mathrm{M.E.}\,A_n$, $n = 1, 2, \ldots$. We shall assume that the $A_n$ are independent, identically distributed random matrices and that $s$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_s$ of the matrix $A$ exist with multiplicities $k_1, k_2, \ldots, k_s$:

$$\lambda_1 = \min_X \frac{(AX, X)}{(X, X)}, \qquad \lambda_s = \max_X \frac{(AX, X)}{(X, X)}.$$

We shall use $u_{ij}$, $1 \le j \le k_i$, to denote the eigenvectors corresponding to the eigenvalue $\lambda_i$ and forming a basis of the space of $p$-dimensional vectors; $P_i$ is the subspace spanned by $u_{ij}$, $1 \le j \le k_i$.

We shall use the method of stochastic approximation introduced by Kiefer and Wolfowitz [1]. This method is intended for finding the minimum point of a regression function $M(x)$ from observations of the random quantity $Y(x)$, $\mathrm{M.E.}\,Y(x) = M(x)$.

There is usually genuine difficulty in proving the convergence of the method to the minimum point of the regression function when the extremum is not unique. We shall prove below that the process of stochastic approximation for finding the least eigenvalue and eigenvector of the mathematical expectation of random matrices converges to the minimum point of the functional $(AX, X)/(X, X)$.

In fact the points at which the gradient of the functional $(AX, X)/(X, X)$ vanishes are minima, maxima or points of inflection of this functional. The eigenvectors $u_{1j}$, $1 \le j \le k_1$, and $u_{sj}$, $1 \le j \le k_s$, give a minimum or a maximum of the functional $(AX, X)/(X, X)$, and the eigenvectors $u_{ij}$ ($i \ne 1, s$; $1 \le j \le k_i$) give points of inflection. Using the symmetry of the matrix it can be shown that the process converges to the least eigenvalue and a corresponding eigenvector of the matrix.
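This classification can be verified by a standard computation of the gradient (written out here for the reader's convenience):

$$\nabla\, \frac{(AX, X)}{(X, X)} = \frac{2}{(X, X)} \left[ AX - \frac{(AX, X)}{(X, X)}\, X \right],$$

so the gradient vanishes exactly when $AX = \mu(X) X$ with $\mu(X) = (AX, X)/(X, X)$, i.e. precisely at the eigenvectors of $A$; the bracketed expression is also the direction along which the process (1) below moves.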

Let us construct a sequence of random vectors: $X_1$ is some initial random vector,

$$X_{n+1} = X_n - \gamma_n \left[ A_n X_n - \frac{(A_n X_n, X_n)}{(X_n, X_n)}\, X_n \right], \tag{1}$$

where $\gamma_n > 0$ is a sequence of constants. This algorithm is an extension of the well-known gradient method for finding the least eigenvalue and corresponding eigenvector of a symmetrical matrix [2]. The following theorem gives sufficient conditions for the convergence of the given process.
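For computation, scheme (1) translates into a few lines of code. The following Python fragment is an illustrative sketch, not part of the original paper: the name `krasulina`, the callable `sample_matrix` standing for a source of independent observations $A_n$ with $\mathrm{M.E.}\,A_n = A$, and the choice $\gamma_n = 1/n$ are all assumptions made here.

```python
import numpy as np

def krasulina(sample_matrix, x0, n_iter=10_000):
    """Iterate scheme (1): X_{n+1} = X_n - gamma_n (A_n X_n - mu_n X_n),
    where mu_n = (A_n X_n, X_n) / (X_n, X_n)."""
    x = np.asarray(x0, dtype=float)
    for n in range(1, n_iter + 1):
        gamma = 1.0 / n            # one admissible choice of step sizes
        A_n = sample_matrix()      # fresh independent observation, E A_n = A
        Ax = A_n @ x
        mu = (Ax @ x) / (x @ x)    # observed Rayleigh quotient
        x = x - gamma * (Ax - mu * x)
    return x
```

Note that the correction $A_n X_n - \mu_n X_n$ is orthogonal to $X_n$, which prevents the iterates from collapsing to zero; this orthogonality is exactly the identity (2) used in Lemma 1 below.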

Theorem

Suppose that the following conditions are satisfied:

(1) $\mathrm{M.E.}\,\|A_n\|^2 < \infty$, where $\|A_n\|$ is the spherical norm of the matrix $A_n$;

(2) $\mathrm{M.E.}\,\|X_{1,1}\| > 0$, where $X_{n,1}$ is the projection of the vector $X_n$ on the subspace $P_1$;

(3) $\sum_{n=1}^{\infty} \gamma_n = \infty$, $\sum_{n=1}^{\infty} \gamma_n^2 < \infty$.

Then $X_n \underset{n\to\infty}{\longrightarrow} U_1(\omega)$ with probability 1, where $U_1(\omega)$ is a random eigenvector corresponding to the eigenvalue $\lambda_1$, and

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \frac{(A_k X_k, X_k)}{(X_k, X_k)} = \lambda_1$$

with probability 1.
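Condition (3) is the usual step-size requirement of stochastic approximation. As a concrete example (ours, not the paper's), $\gamma_n = c/n$ with any constant $c > 0$ satisfies it:

$$\sum_{n=1}^{\infty} \frac{c}{n} = \infty, \qquad \sum_{n=1}^{\infty} \frac{c^2}{n^2} = \frac{c^2 \pi^2}{6} < \infty.$$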

The proof of this theorem will be preceded by a series of lemmas.

Lemma 1

Under the conditions of the theorem the sequence $\|X_n\|$ converges with probability 1 as $n \to \infty$, and $\lim_{n\to\infty} \mathrm{M.E.}\,\|X_n\|^2 < \infty$.
Proof. On the basis of (1) we have

$$\|X_{n+1}\|^2 = \|X_n\|^2 + \gamma_n^2 (\xi_n, \xi_n), \qquad \xi_n = A_n X_n - \frac{(A_n X_n, X_n)}{(X_n, X_n)}\, X_n, \tag{2}$$

since $(\xi_n, X_n) = 0$. Thus

$$\prod_{i=n}^{\infty} \bigl(1 + \gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2\bigr)\, \|X_n\|^2$$

is a semimartingale and, on the basis of the theorem for the convergence of semimartingales [3], $\|X_n\|$ converges with probability 1 and $\lim_{n\to\infty} \mathrm{M.E.}\,\|X_n\|^2 < \infty$.
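The semimartingale property behind this step can be spelled out (our expansion of the argument): since $A_n$ is independent of $X_n$ and $(\xi_n, \xi_n) \le \|A_n\|^2 \|X_n\|^2$,

$$\mathrm{M.E.}\bigl[\|X_{n+1}\|^2 \mid X_n\bigr] \le \bigl(1 + \gamma_n^2\, \mathrm{M.E.}\,\|A_n\|^2\bigr)\, \|X_n\|^2,$$

so multiplying $\|X_n\|^2$ by the convergent tail product $\prod_{i=n}^{\infty}(1 + \gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2)$ yields a nonnegative sequence with non-increasing conditional expectations, to which the convergence theorem of [3] applies.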

Lemma 2

The sequence $\mu(X_n) = (AX_n, X_n)/(X_n, X_n)$ converges with probability 1 to some limiting value $\mu(\omega)$ as $n \to \infty$.

Proof. From (1) we have

$$\mu(X_{n+1}) = \frac{1}{1 + \gamma_n^2 (\xi_n, \xi_n)/(X_n, X_n)} \left[ \mu(X_n) - 2\gamma_n\, \frac{(\xi_n, A X_n)}{(X_n, X_n)} + \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \right].$$

We then obtain

$$\mu(X_{n+1}) = \frac{1}{1 + \gamma_n^2 (\xi_n, \xi_n)/(X_n, X_n)} \left[ \mu(X_n) - 2\gamma_n f(X_n) - 2\gamma_n Z_n + \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \right],$$

where

$$f(X_n) = \frac{(AX_n, AX_n)}{(X_n, X_n)} - \frac{(AX_n, X_n)^2}{(X_n, X_n)^2},$$

$$Z_n = \frac{((A_n - A)X_n, AX_n)}{(X_n, X_n)} - \frac{((A_n - A)X_n, X_n)}{(X_n, X_n)} \cdot \frac{(AX_n, X_n)}{(X_n, X_n)}.$$
From the Cauchy inequality $f(X_n) \ge 0$. By virtue of conditions 1) and 3) of the theorem the series

$$\sum_{n=1}^{\infty} \gamma_n Z_n, \qquad \sum_{n=1}^{\infty} \gamma_n^2\, \frac{(A \xi_n, \xi_n)}{(X_n, X_n)} \tag{3}$$

converge with probability 1.

Let us assume that a set of sampling sequences $\{X_n\}$ of positive probability exists on which the series (3) converge and

$$\liminf_{n} \mu(X_n) < \limsup_{n} \mu(X_n).$$

Then we can choose numbers $a$ and $b$ such that

$$\liminf_{n} \mu(X_n) < a < b < \limsup_{n} \mu(X_n).$$

We find $m_1$ and $n_1$ such that $m_1 > n_1 > N$, $\mu(X_{n_1}) < a$, $\mu(X_{m_1}) > b$, and $a \le \mu(X_j) \le b$ for $n_1 < j < m_1$.

Consequently

$$\mu(X_{m_1}) - \mu(X_{n_1}) \prod_{i=n_1}^{m_1-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} > b - a.$$

On the other hand we have

$$\mu(X_m) - \mu(X_n) \prod_{i=n}^{m-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} = \sum_{j=n}^{m-1} \left[ \gamma_j^2\, \frac{(A \xi_j, \xi_j)}{(X_j, X_j)} - 2\gamma_j f(X_j) - 2\gamma_j Z_j \right] \prod_{i=j}^{m-1} \bigl[ 1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i) \bigr]^{-1}.$$

It follows from (3) that

$$\mu(X_{m_1}) - \mu(X_{n_1}) \prod_{i=n_1}^{m_1-1} \frac{1}{1 + \gamma_i^2 (\xi_i, \xi_i)/(X_i, X_i)} < \frac{b - a}{2}.$$

The contradiction so obtained proves the truth of the lemma.

Lemma 3

Under the conditions of the theorem $a_{1i}^{(n)} = (X_n, u_{1i})$, where $1 \le i \le k_1$, converge with probability 1 to $a_{1i}(\omega)$.

The proof is similar to that of Lemma 2.

Lemma 4

$\mu(X_n) \underset{n\to\infty}{\longrightarrow} \lambda_1$ with probability 1.

Proof. By virtue of condition 2 of the theorem there exists an $i_0$ such that $\mathrm{M.E.}\,\bigl|a_{1i_0}^{(1)}\bigr| > 0$; consequently we have

$$a_{1i_0}^{(n+1)} = a_{1i_0}^{(n)} + \gamma_n \bigl( \mu(X_n) - \lambda_1 \bigr) a_{1i_0}^{(n)} + \gamma_n Z_n',$$

$$Z_n' = \bigl( (A - A_n) X_n, u_{1i_0} \bigr) + \frac{((A_n - A) X_n, X_n)}{(X_n, X_n)}\, (X_n, u_{1i_0}).$$

Hence it follows that, since $\mathrm{M.E.}\,[Z_n' \mid X_n] = 0$,

$$\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n+1)}\bigr| \ge \mathrm{M.E.}\,\bigl[\bigl(1 + \gamma_n(\mu(X_n) - \lambda_1)\bigr)\bigl|a_{1i_0}^{(n)}\bigr|\bigr].$$

By virtue of (2) we obtain

$$\|X_{n+1}\|^2 \le \prod_{i=1}^{n} \bigl(1 + 2\gamma_i^2 \|A_i\|^2\bigr)\, \|X_1\|^2 = Y^2(\omega), \tag{4}$$

$$\mathrm{M.E.}\,Y^2(\omega) = \prod_{i=1}^{\infty} \bigl(1 + 2\gamma_i^2\, \mathrm{M.E.}\,\|A_i\|^2\bigr)\, \mathrm{M.E.}\,\|X_1\|^2 < \infty.$$

Then

$$\frac{\mathrm{M.E.}\,\mu(X_n)\,\bigl|a_{1i_0}^{(n)}\bigr|}{\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n)}\bigr|} \underset{n\to\infty}{\longrightarrow} \frac{\mathrm{M.E.}\,\mu(\omega)\,|a_{1i_0}(\omega)|}{\mathrm{M.E.}\,|a_{1i_0}(\omega)|}.$$

If we assume that $\mathrm{M.E.}\,\mu(\omega)|a_{1i_0}(\omega)| \,/\, \mathrm{M.E.}\,|a_{1i_0}(\omega)| > \lambda_1$, then $\mathrm{M.E.}\,\bigl|a_{1i_0}^{(n)}\bigr| \underset{n\to\infty}{\longrightarrow} \infty$, which contradicts (4). Since $\mu(\omega) \ge \lambda_1$, it follows that $P(\mu(\omega) = \lambda_1) = 1$.

Proof of the theorem. We write $X_n$ in the basis $u_{ij}$, $1 \le i \le s$, $1 \le j \le k_i$:

$$X_n = \sum_{i=1}^{s} \sum_{j=1}^{k_i} a_{ij}^{(n)} u_{ij}.$$

From Lemma 4, in the same way as for Lemma 3, we can prove that $a_{ij}^{(n)} \underset{n\to\infty}{\longrightarrow} a_{ij}(\omega)$ with probability 1. It follows from Lemma 2 that the series

$$\sum_{n=1}^{\infty} \gamma_n f(X_n)$$

converges with probability 1 and by virtue of condition 3 of the theorem

$$\limsup_{n} f(X_n) = 0.$$

We also find that $f(X_n)$ converges to 0 with probability 1 as $n \to \infty$. Since

$$f(X_n) = \frac{\sum_{i=1}^{s} \sum_{j=1}^{k_i} \lambda_i^2 \bigl(a_{ij}^{(n)}\bigr)^2}{\sum_{i=1}^{s} \sum_{j=1}^{k_i} \bigl(a_{ij}^{(n)}\bigr)^2} - \mu^2(X_n),$$

on the basis of Lemmas 3 and 4 we find that $a_{ij}(\omega) = 0$ if $i \ne 1$. Consequently


$X_n(\omega) \underset{n\to\infty}{\longrightarrow} U_1(\omega)$, where $U_1(\omega)$ is a random eigenvector corresponding to the eigenvalue $\lambda_1$. Since from condition 1) of the theorem

$$\frac{1}{n} \sum_{k=1}^{n} \frac{((A_k - A) X_k, X_k)}{(X_k, X_k)} \underset{n\to\infty}{\longrightarrow} 0$$

with probability 1 (see [4], Section 29.1.D), by virtue of Lemma 4 we obtain the second statement of the theorem.
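Both statements of the theorem are easy to check numerically. The experiment below is an illustrative sketch, not from the paper: we take $A = \operatorname{diag}(1, 2, 5)$ and form $A_n$ by adding symmetrized zero-mean Gaussian noise, so that $\mathrm{M.E.}\,A_n = A$ and $\lambda_1 = 1$; the sample size and noise scale are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 5.0])            # M.E. A_n = A; least eigenvalue 1
x = rng.normal(size=3)                  # X_1: random initial vector
cesaro = 0.0
N = 50_000
for n in range(1, N + 1):
    noise = rng.normal(scale=0.1, size=(3, 3))
    A_n = A + (noise + noise.T) / 2     # symmetric observation with mean A
    Ax = A_n @ x
    mu_n = (Ax @ x) / (x @ x)           # (A_n X_n, X_n)/(X_n, X_n)
    cesaro += mu_n                      # accumulate for the Cesaro average
    x = x - (1.0 / n) * (Ax - mu_n * x) # scheme (1) with gamma_n = 1/n
print((x @ A @ x) / (x @ x))            # close to 1: first statement
print(cesaro / N)                       # close to 1: second statement
```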

The author wishes to express his thanks to S. M. Ermakov for his interest
in this paper.

Translated by H. F. Cleaves

REFERENCES

1. KIEFER, J. and WOLFOWITZ, J. Stochastic estimation of the maximum of a regression function. Ann. Math. Statistics, 23, 3, 462–466, 1952.

2. FADDEEV, D. K. and FADDEEVA, V. N. Computational Methods of Linear Algebra (Vychislitel'nye metody lineinoi algebry). Moscow–Leningrad, Fizmatgiz, 1963. English translation, Freeman, San Francisco, 1964.

3. DOOB, J. L. Stochastic Processes. New York, 1953. Russian translation (Veroyatnostnye protsessy), Foreign Literature Publishing House, Moscow, 1956.

4. LOÈVE, M. Probability Theory. Van Nostrand, 1960. Russian translation (Teoriya veroyatnostei), Foreign Literature Publishing House, Moscow, 1962.
