maurice.gauche
November 2023
Throughout this project I repeatedly made the same error: writing "Nylstrom" instead of "Nyström".
Remark: Here we take $R = U \Sigma V^\top$ as the (non-truncated) SVD, so that $\hat{U} = QU$ is orthogonal. This shows that the SVD of $A_{\mathrm{Nyst}}$ is $\hat{U} \Sigma^2 \hat{U}^\top$, which is why the rank-$k$ truncated approximation of $A_{\mathrm{Nyst}}$ coincides with $\hat{U}_k \Sigma_k^2 \hat{U}_k^\top$ as defined in Algorithm 1.
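The pipeline behind Algorithm 1 and the remark above can be sketched as follows; this is a minimal NumPy version assuming the standard Cholesky-based Nyström formulation (function and variable names are mine, and a production version would shift the core matrix for numerical stability):

```python
import numpy as np

def nystrom_rank_k(A, Omega, k):
    """Randomized Nystrom: A ~ (A Omega)(Omega^T A Omega)^+ (A Omega)^T,
    returned in factored form (Uhat_k, Sigma_k^2). Minimal sketch, no
    stabilising shift of B."""
    C = A @ Omega                      # n x l sketch of A
    B = Omega.T @ C                    # l x l core matrix, SPD for PSD A
    L = np.linalg.cholesky(B)          # B = L L^T
    Z = np.linalg.solve(L, C.T).T      # Z = C L^{-T}, so Z Z^T = C B^{-1} C^T
    Q, R = np.linalg.qr(Z)             # thin QR of Z
    U, s, _ = np.linalg.svd(R)         # R = U Sigma V^T (non-truncated SVD)
    Uhat = Q @ U                       # orthogonal; A_Nyst = Uhat Sigma^2 Uhat^T
    return Uhat[:, :k], s[:k] ** 2     # rank-k truncation Uhat_k Sigma_k^2 Uhat_k^T
```

In exact arithmetic, when $A$ has rank $r \le l$ the approximation with $k = l$ recovers $A$ exactly, which makes a convenient sanity check.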
$$\Omega^\top = \begin{bmatrix} \Omega_{(1)}^\top & \Omega_{(2)}^\top & \cdots & \Omega_{(P)}^\top \end{bmatrix} \qquad (1)$$
where $\Omega_{(i)}^\top = \sqrt{\tfrac{n}{Pl}}\, D_{L_i} R H D_{R_i}$, with:
• $D_{L_i} \in \mathbb{R}^{l \times l}$ and $D_{R_i} \in \mathbb{R}^{n/P \times n/P}$ diagonal with independent random signs,
• $H \in \mathbb{R}^{n/P \times n/P}$ a normalized Walsh–Hadamard matrix,
• $R \in \mathbb{R}^{l \times n/P}$ a uniform sampling matrix.
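A direct (dense, unoptimized) construction of this block sketching matrix might look as follows; the helper names are mine, and the $\sqrt{n/(Pl)}$ scaling is my reading of equation (1):

```python
import numpy as np

def hadamard(m):
    """Normalized Walsh-Hadamard matrix of size m (m a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < m:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(m)

def block_srht(n, l, P, rng):
    """Assemble Omega (n x l) block-wise as in (1): for each block i,
    Omega_(i)^T = sqrt(n/(P*l)) * D_Li R H D_Ri."""
    b = n // P                                       # block size n/P
    H = hadamard(b)
    blocks = []
    for _ in range(P):
        dL = rng.choice([-1.0, 1.0], size=l)         # diagonal of D_Li
        dR = rng.choice([-1.0, 1.0], size=b)         # diagonal of D_Ri
        rows = rng.choice(b, size=l, replace=False)  # R: uniform row sampling
        # D_Li (scales rows) * sampled Hadamard rows * D_Ri (scales columns)
        Oi_T = np.sqrt(n / (P * l)) * (dL[:, None] * H[rows, :] * dR[None, :])
        blocks.append(Oi_T.T)                        # Omega_(i) is (n/P) x l
    return np.vstack(blocks)                         # Omega is n x l
```

Building $H$ explicitly costs $O((n/P)^2)$ memory; a fast Walsh–Hadamard transform would avoid this, but the dense form suffices for $n = 1024$.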
The choice of the block Walsh–Hadamard matrix is based on the following result, seen in the lecture of 24 October [2]: $\Omega$ as given in (1) is an OSE$(m, \epsilon, \delta)$ when $l = O\!\left(\epsilon^{-2}\left(m + \ln\tfrac{n}{\delta}\right)\ln\tfrac{m}{\delta}\right)$.
3 Numerical stability
We run our tests with n = 1024. Two data matrices are used to test our algorithm, as given in [3]: the polynomial decay matrices and the exponential decay matrices, with parameters R, q and p.
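For reference, the two diagonal test matrices can be generated as below, assuming the definitions used in [3] (R leading ones, followed by polynomial decay $i^{-p}$ or exponential decay $10^{-iq}$); the exact conventions may differ slightly from the author's code:

```python
import numpy as np

def poly_decay(n, R, p):
    """Diagonal PSD test matrix: R ones, then 2^{-p}, 3^{-p}, ..."""
    d = np.ones(n)
    d[R:] = np.arange(2, n - R + 2, dtype=float) ** (-p)
    return np.diag(d)

def exp_decay(n, R, q):
    """Diagonal PSD test matrix: R ones, then 10^{-q}, 10^{-2q}, ..."""
    d = np.ones(n)
    d[R:] = 10.0 ** (-q * np.arange(1, n - R + 1))
    return np.diag(d)
```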
We first consider the Nyström approximation without rank-k truncation (we force k = l) and obtain the plots in Figure 1 and Figure 2. These show the relative error, with respect to the nuclear norm, of the Nyström low-rank approximation (solid line) and of the rank-k truncation (dashed line).
Figure 1 shows that for polynomial decay the relative error of the Nyström algorithm with SRHT sketching matrices follows, on a log scale, the general trend of the rank-k truncation.
Figure 2 shows that for a 'slow' exponential decay matrix the Nyström approximation behaves poorly relative to the rank-k truncated approximation when the exponential parameter q is small (compared to 1).
Figure 1: Relative error with respect to the nuclear norm for the Nyström approximation (solid lines) and rank-k truncation (dashed lines), for polynomial decay matrices
Figure 2: Relative error with respect to the nuclear norm for the Nyström approximation (solid lines) and rank-k truncation (dashed lines), for exponential decay matrices
Figures 1 and 2 show that there is room for improvement if we choose some k < l, and we expect this gap to close as l increases. For this we compute a different relative error, which we call the rank-k relative error.
We fix k = 10 and n = 1024; the plots are given in Figures 3 and 4. They show that increasing l makes our algorithm more accurate. In particular, already for small values of l (for example l = 50) we achieve a good approximation, with a relative error below $10^{-1}$.
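The rank-k relative error used in Figures 3 and 4 can be computed as below; this is a sketch, using the fact that for the nuclear norm the best rank-k error $\|A - [[A]]_k\|_*$ is the tail sum of singular values:

```python
import numpy as np

def k_rank_relative_error(A, A_lowrank, k):
    """||A - A_lowrank||_* / ||A - [[A]]_k||_* - 1 (nuclear norms)."""
    tail = np.linalg.svd(A, compute_uv=False)[k:].sum()  # ||A - [[A]]_k||_*
    return np.linalg.norm(A - A_lowrank, ord='nuc') / tail - 1.0
```

By construction the value is 0 exactly when the low-rank approximation matches the best rank-k truncation in nuclear norm, and positive otherwise.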
From the accuracy bound for the Nyström algorithm in lecture [1] we get that if $l = O\!\left(k \left(\log\tfrac{n}{\delta}\right)^2\right)$ then $\|A - [[A_{\mathrm{Nyst}}]]_k\|_* \le 4\, \|A - [[A]]_k\|_*$ holds with probability at least $1 - 2\delta$. We tried to investigate the constant behind $l = O\!\left(k \left(\log\tfrac{n}{\delta}\right)^2\right)$ theoretically by reading paper [4]. Yet it seems that computing $l$ with Theorem 2.1 of [4], such that $\Omega$ is an OSE$(\tfrac{1}{3}, \delta, k)$ and an OSE$(\tfrac{\epsilon}{d}, \tfrac{\delta}{N}, 1)$, is a problem for 'small' values of n (say of order less than $10^5$), since then the required $l \ge n$ is not usable.
Nevertheless we try $l = c\, k \left(\log\tfrac{n}{\delta}\right)^2$ for different constants c. From Figures 3 and 4 we can estimate c by setting $50 = l = c\, k \left(\log\tfrac{n}{\delta}\right)^2$; we obtain $c \approx 20$.
Figure 4: Plot of $\frac{\|A - [[A_{\mathrm{Nyst}}]]_k\|_*}{\|A - A_k\|_*} - 1$ for $k = 10$, $n = 1024$ (exponential decay matrices)
5 Presentation of the runtime of the Randomized Nyström low
rank approximation (without parallelization)
We plot the runtime of our algorithm for R = 10, p = 1 and n = 1024, 4096 (R and p play no role in the runtime) in Figure 5; we also plot an approximately constant line, the running time of the rank-k truncated approximation.
Clearly, if the runtime of our algorithm is larger than that of the simple rank-k truncation, there is no advantage in using our algorithm, since it is then both less precise and slower. Our result seems independent of k; this probably arises because my Python implementation of the rank-k truncation is not optimal, since I first call the numpy.linalg.svd() function and then reduce the rank to k.
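For context, this baseline computes a full SVD, costing $O(n^3)$ flops regardless of k, which would explain the observed k-independence of the measured times; a minimal version of the implementation described above:

```python
import numpy as np

def rank_k_truncation(A, k):
    """Best rank-k approximation via a full SVD, then truncation.
    Cost is dominated by the O(n^3) SVD, independent of k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k] @ Vt[:k, :]   # U_k Sigma_k V_k^T
```

A randomized or partial SVD would make the baseline's cost actually depend on k.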
The intersection point between the running-time curve of our algorithm and that of the rank-k truncation grows, in proportion, with n, since $k \left(\log\tfrac{n}{\delta}\right)^2$ decreases in proportion relative to n; hence we can take $l = c\, k \left(\log\tfrac{n}{\delta}\right)^2$.
References
[1] Laura Grigori, Randomized algorithms for low rank matrix approximation, lecture of 31 October 2023.
[2] Laura Grigori, Introduction to randomization and sketching techniques, lecture of 24 October 2023.