Traps In Stochastic Approximation

Eashvar Srinivasan


28 November 2023

Table Of Contents

1 Introduction
2 Traps
3 Traps discussed in this paper
4 The main deal
5 Assumptions and Notations
6 Cyclic Trap
7 Example of cyclic trap
8 Theorem on cyclic trap
9 Repulsion and singular equilibrium
10 Proof of Theorem 2
11 One-dimensional general singular equilibrium
12 References

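We consider the $\mathbb{R}^d$-valued stochastic algorithm defined on $(\Omega, \mathcal{A}, P)$:

(1) $z_{n+1} = z_n + \gamma_n h(z_n) + \eta_{n+1}$, where
$h$ is a map from an open set $G \subseteq \mathbb{R}^d$ to $\mathbb{R}^d$,
$\gamma_n$ is positive and decreasing with $\sum_{n=0}^{\infty} \gamma_n = \infty$,
$\eta_n$ is a small stochastic disturbance.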
When the algorithm is bounded, the limit sets of (1) are compact, connected, invariant and chain-recurrent for the ODE $\dot{z} = h(z)$.

Because of the stochastic disturbance, (1) avoids some of these candidate limit sets with probability 1; such sets are called traps.
The simplest examples of traps are the regular zeros of $h$.
$z^*$ is a regular zero if it is an isolated zero with a neighborhood on which $h$ is $C^1$ with a Lipschitz differential $Dh$, and $Dh(z^*)$ has at least one eigenvalue with positive real part.
Such zeros are unstable equilibrium points of the ODE.
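
As a minimal illustration of a regular zero acting as a trap, the sketch below runs algorithm (1) in dimension one. Everything specific in it is an assumption chosen for the example, not taken from the talk: the field `h(z) = z - z**3` (zeros at -1, 0, 1, with 0 unstable), the steps `gamma_n = 1/(n + 10)`, bounded uniform noise, and `c_n = gamma_n` with `r_n = 0`.

```python
import random


def h(z):
    # Hypothetical mean field: zeros at -1, 0, 1.
    # Dh(0) = 1 > 0, so 0 is a regular (unstable) zero.
    return z - z ** 3


def simulate(seed, n_steps=200_000, z0=0.0):
    """Run z_{n+1} = z_n + gamma_n * h(z_n) + eta_{n+1}, started at the unstable zero."""
    rng = random.Random(seed)
    z = z0
    for n in range(n_steps):
        gamma = 1.0 / (n + 10)        # positive, decreasing, sum diverges
        eps = rng.uniform(-1.0, 1.0)  # bounded zero-mean noise (assumed)
        eta = gamma * eps             # eta_{n+1} = c_n * eps_{n+1} with c_n = gamma_n
        z = z + gamma * h(z) + eta
    return z
```

Calling `simulate(seed)` for a few seeds should return values close to +1 or -1: although the iteration starts exactly at the unstable zero, the noise pushes it off and the drift carries it to one of the stable zeros.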

Traps discussed in this paper

This paper looks at other compact connected chain-recurrent sets $L$, such as
periodic cycles for the ODE,
singular equilibria and other repulsive regions,
connected sets of equilibria.

The main deal

Question: Does $\{\omega : d(z_n(\omega), L) \to 0\}$ have probability 0?

Assumptions and Notations

The stochastic disturbance is $\eta_{n+1} = c_n(\epsilon_{n+1} + r_{n+1})$

$c_n > 0$, $\gamma_n = O(c_n)$, $\sum_n \gamma_n^2 < \infty$, $c_n \neq 0$ i.o.
$\epsilon_n, r_n \in \mathbb{R}^d$ are sequences of random vectors which, on $\{\omega : d(z_n(\omega), L) \to 0\}$, satisfy $E[\epsilon_{n+1} \mid \mathcal{F}_n] = 0$ and $\sum_n \|r_n\|^2 < \infty$
Stability of iterates
Notation: $\lambda_{\min}(A)$ denotes the smallest eigenvalue of the symmetric matrix $A$
$S(L) = \{\omega : d(z_n(\omega), L) \to 0\}$ for any set $L$
$L_r = \{x \in \mathbb{R}^d : d(x, L) < r\}$
C is a generic positive constant whose value may change as needed

Cyclic Trap

Let $L \subseteq G$ be a closed periodic orbit of the ODE

$\dot{z} = h(z)$.
Assumption: $h$ is $C^2$ on a neighbourhood of $L$.
$L$ is a hyperbolic periodic cycle having at least one characteristic exponent with positive real part.
Such a trap is called cyclic.
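
To make the definition concrete, here is a small sketch for a planar field of our own choosing (an assumption, not the example from the talk): in polar coordinates $\dot{r} = r(r^2 - 1)$, $\dot{\theta} = 1$, so the unit circle is a periodic orbit of period $2\pi$. For planar systems the non-trivial characteristic exponent of a periodic orbit equals the time average of the divergence of $h$ along the orbit, which the code evaluates numerically.

```python
import math


def h(x, y):
    # Hypothetical planar field; the unit circle is a periodic orbit (period 2*pi).
    r2 = x * x + y * y
    return (-y + x * (r2 - 1.0), x + y * (r2 - 1.0))


def divergence(x, y, eps=1e-5):
    # Central finite differences for d(h_x)/dx + d(h_y)/dy.
    dhx_dx = (h(x + eps, y)[0] - h(x - eps, y)[0]) / (2 * eps)
    dhy_dy = (h(x, y + eps)[1] - h(x, y - eps)[1]) / (2 * eps)
    return dhx_dx + dhy_dy


def characteristic_exponent(n_points=1000):
    # (1/T) * integral over one period of div h along the orbit (cos t, sin t),
    # approximated by the mean over a uniform grid in t.
    period = 2 * math.pi
    total = 0.0
    for k in range(n_points):
        t = period * k / n_points
        total += divergence(math.cos(t), math.sin(t))
    return total / n_points
```

For this field the divergence is $4r^2 - 2$, so `characteristic_exponent()` should return a value close to 2. The exponent is positive, so under these assumptions the unit circle is a hyperbolic, unstable cycle of the kind called a cyclic trap above.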

Example of Cyclic Trap

Eashvar Srinivasan (IISc) Traps In Stochastic Approximation [2]

Cyclic Trap (contd.)

In this setup we claim the following theorem:

Theorem 1
Let $L$ be a cyclic trap of the stochastic algorithm (1) under Assumptions A1. Assume that for some $a > 2$, a.s. on $S(L)$,
(1) $\limsup_n E[\|\epsilon_{n+1}\|^a \mid \mathcal{F}_n] < \infty$
(2) $\liminf_n \lambda_{\min}(E[\epsilon_{n+1}\epsilon_{n+1}^T \mid \mathcal{F}_n]) > 0$
Then $P(S(L)) = 0$.
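
Condition (2) can be read as requiring the noise to excite every direction. The sketch below illustrates this reading for $d = 2$ by estimating the smallest eigenvalue of the second-moment matrix of $\epsilon$ from samples. The two noise laws (`isotropic`, `degenerate`) and all function names are hypothetical choices made for the example.

```python
import math
import random


def lambda_min_2x2(a, b, c):
    # Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]].
    tr = a + c
    det = a * c - b * b
    disc = max(tr * tr - 4.0 * det, 0.0)
    return 0.5 * (tr - math.sqrt(disc))


def second_moment_lambda_min(sample, n=20_000, seed=0):
    # Empirical estimate of lambda_min(E[eps eps^T]) from n i.i.d. draws.
    rng = random.Random(seed)
    a = b = c = 0.0
    for _ in range(n):
        e1, e2 = sample(rng)
        a += e1 * e1
        b += e1 * e2
        c += e2 * e2
    return lambda_min_2x2(a / n, b / n, c / n)


def isotropic(rng):
    # Independent uniform coordinates: every direction is excited.
    return rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)


def degenerate(rng):
    # Noise confined to the diagonal direction (1, 1).
    u = rng.uniform(-1.0, 1.0)
    return u, u
```

With these choices `second_moment_lambda_min(isotropic)` is close to 1/3, while `second_moment_lambda_min(degenerate)` is numerically zero. Note also that for a single draw the rank-one matrix $\epsilon\epsilon^T$ always has smallest eigenvalue 0 when $d \ge 2$, which is why the eigenvalue is taken after the conditional expectation.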

Repulsion and singular equilibria

A compact connected set $L$ which is invariant and chain-recurrent for the ODE will be called repulsive if there exists $r > 0$ such that any solution $(z(t))_{t \ge 0}$ to the ODE starting from $z \in L_r \setminus L$ leaves $L_r$ within finite time.
We look at the simplest case of singular repulsive equilibria. $z^*$ is a singular repulsive equilibrium of $h$ if it is an isolated zero of $h$, such that $h$ is $C^1$ on a neighbourhood of $z^*$ with a Lipschitz differential $Dh$ s.t. $Dh(z^*) = 0$.
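
A one-dimensional sketch of such an equilibrium, under assumptions of our own: take $h(z) = z^3$, so that $z^* = 0$ is an isolated zero with $Dh(0) = 0$. Solutions of $\dot{z} = z^3$ satisfy $z(t)^{-2} = z_0^{-2} - 2t$, so a solution started at $z_0 \neq 0$ leaves $L_r = (-r, r)$ at time $(z_0^{-2} - r^{-2})/2$. The code compares this with a forward-Euler integration.

```python
def exit_time(z0, r=1.0, dt=1e-3, t_max=1e4):
    """Forward-Euler estimate of the time at which dz/dt = z^3 leaves (-r, r)."""
    z, t = z0, 0.0
    while abs(z) < r and t < t_max:
        z += dt * z ** 3
        t += dt
    return t


def exit_time_exact(z0, r=1.0):
    # From z(t)^(-2) = z0^(-2) - 2 t, solved for |z(t)| = r.
    return 0.5 * (1.0 / z0 ** 2 - 1.0 / r ** 2)
```

For example, `exit_time(0.1)` and `exit_time_exact(0.1)` should both be close to 49.5. The exit time is finite for every nonzero start but grows like $z_0^{-2}$ as the start approaches the equilibrium, which is much slower than near a regular unstable zero.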
We assume $\gamma_n = g/n$ for a constant $g > 0$. Under this setting,

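Theorem 2 (d-dimensional repulsive singular equilibrium)

Assume that for some $a > 4$, a.s. on $S(z^*)$,
(1) $\limsup_n E[\|\epsilon_{n+1}\|^a \mid \mathcal{F}_n] < \infty$
(2) $\liminf_n \lambda_{\min}(E[\epsilon_{n+1}\epsilon_{n+1}^T \mid \mathcal{F}_n]) > 0$

Then $P(S(z^*)) = 0$.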

Proof of Theorem 2

Let $V_{z^*}$ be a neighbourhood of $z^*$ on which $Dh$ is Lipschitz and such that any solution of the ODE starting from $V_{z^*} \setminus \{z^*\}$ leaves $V_{z^*}$ within finite time. As $Dh(z^*) = 0$, for all $z \in V_{z^*}$, $\|h(z)\| \le C \|z - z^*\|^2$.
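
The quadratic bound can be obtained as follows. This is a short sketch assuming the neighbourhood is a ball (so that the segment from $z^*$ to $z$ stays inside it), with $K$ denoting the Lipschitz constant of $Dh$; both are notation introduced here.

```latex
% h(z^*) = 0 and Dh(z^*) = 0, so by the fundamental theorem of calculus
\[
h(z) = \int_0^1 Dh\bigl(z^* + t(z - z^*)\bigr)\,(z - z^*)\,dt ,
\qquad
\bigl\| Dh\bigl(z^* + t(z - z^*)\bigr) \bigr\|
  = \bigl\| Dh\bigl(z^* + t(z - z^*)\bigr) - Dh(z^*) \bigr\|
  \le K\, t\, \|z - z^*\| .
\]
% Integrating the bound over t in [0, 1]:
\[
\|h(z)\| \le K \|z - z^*\|^2 \int_0^1 t\,dt = \frac{K}{2}\,\|z - z^*\|^2 .
\]
```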
On S(L),

Here $s_{n+1} = \sum_{k \le n} \gamma_k \approx g \log n$
Note that all of the following cannot be simultaneously true:

Proof (contd.)
Thus if $\lim_{n \to \infty} \|z_n - z^*\| = 0$ then $Y = z^*$. Hence $P(\{Y \neq z^*\} \cap S(z^*)) = 0$.
On $\{Y = z^*\} \cap S(z^*)$,

Define $S_p = \{Y = z^*\} \cap S(z^*) \cap \{\omega : z_n(\omega) \in V_{z^*} \text{ for } n \ge p\}$.

Almost surely on $S_p$, $\|h(z_n)\|^2 = O(\|z_n - z^*\|^4)$.
Under the noise assumptions, the result of reference [3] gives $P(S_p) = 0$.
Since $\{Y = z^*\} \cap S(z^*) = \bigcup_p S_p$ is a countable union of null sets, we are done.
One-dimensional general singular equilibrium

Set $d = 1$. We consider an isolated zero $z^*$ of $h$ such that, on a neighborhood of $z^*$, $h(z) \approx \alpha (z - z^*)^p$ for some $\alpha > 0$ and integer $p \ge 2$.
We do not assume repulsiveness. Then,
Theorem 3
If $p$ is odd then the algorithm (1) avoids $z^*$, i.e. $P(S(z^*)) = 0$.
Otherwise, on $S(z^*)$, $|z_n - z^*|$ converges to 0 almost surely at the extremely slow rate $O((\log n)^{-1/(p-1)})$.
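
A heuristic for where a logarithmic rate comes from, ignoring the noise and assuming steps $\gamma_n = g/n$ (an assumption carried over from the repulsive setting, not stated for Theorem 3). Write $u = z - z^*$ and follow the mean ODE on the time scale $s \approx g \log n$. This is a sketch of the scaling only, not the proof in [1].

```latex
% p even, alpha > 0: u^p >= 0, so u increases and z^* can only be approached from u < 0.
% With v = -u > 0 (p even gives u^p = v^p):
\[
\frac{dv}{ds} = -\alpha v^{p}
\quad\Longrightarrow\quad
\frac{d}{ds}\, v^{-(p-1)} = (p-1)\,\alpha
\quad\Longrightarrow\quad
v(s) = \bigl( v_0^{-(p-1)} + (p-1)\,\alpha\, s \bigr)^{-1/(p-1)} .
\]
% Substituting s \approx g \log n:
\[
|z_n - z^*| \approx \bigl( (p-1)\,\alpha\, g \log n \bigr)^{-1/(p-1)}
  = O\bigl( (\log n)^{-1/(p-1)} \bigr) .
\]
% p odd: du/ds = \alpha u^p has the sign of u, so solutions move away from z^* on both sides.
```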

Thank you!

1 Some Pathological Traps for Stochastic Approximation, Odile Brandiere, 1998
2 Steering undulatory micro-swimmers in a fluid flow through
reinforcement learning, Khiyati et al, 2023
3 Les algorithmes stochastiques contournent-ils les pieges, O Brandiere,

