Professional Documents
Culture Documents
Thetalog
Thetalog
Alexander Barvinok
where B is an n × n positive definite matrix and h·, ·i is the standard scalar product
in Rn . We consider the problem of efficient computing (approximating) the sum
X X
(1.1.1) Θ(B) = e−f (x) = e−hBx,xi ,
x∈Zn x∈Zn
where Zn ⊂ Rn is the standard integer lattice. More generally, for a given point
y ∈ Rn , we want to efficiently compute (approximate) the sum
X X
(1.1.2) Θ(B, y) = e−f (x−y) = e−hB(x−y),x−yi .
x∈Zn x∈Zn
Typeset by AMS-TEX
1
Together with (1.1.1) and (1.1.2), we also compute the sum
X
(1.1.3) exp {−hBx, xi + ihb, xi} ,
x∈Zn
p
where kxk = hx, xi is the standard Euclidean norm in Rn .
Two algorithmic problems have been of considerable interest for quite some time.
One is finding the length of a shortest non-zero vector in Λ,
and the other is finding the distance from a given point y ∈ Rn to the lattice:
In the breakthrough paper [Ba93], Banaszczyk used theta series to sharpen struc-
tural results, “transference theorems”, relating, in particular, the length of a short-
est non-zero vector in Λ and the largest distance from a point y ∈ Rn to the dual
(reciprocal) lattice
−kx−yk2
P
− dist2 (y,Λ) x∈Λ e
(1.2.2) e ≤ P −kxk2
≤ 1,
x∈Λ e
so by computing (1.2.1), one can provide a lower bound for dist2 (y, Λ).
Another concept that turned out to be quite useful is that of the “discrete Gauss-
ian measure”, that is the probability measure on Λ defined by
2
e−kxk
P (x) = P −kuk2
for x ∈ Λ,
u∈Λ e
see [Ba93], [AR05], [MR07], [A+15], [RS17] for its applications and properties. Just
to compute P (x) for a single point x, we need to be able to compute (1.2.1). One
particularly useful inequality due to Banaszczyk [Ba93] states that
X 2 X 2
(1.2.3) e−kx−yk ≤ 5−n e−kxk for any y ∈ Rn .
x∈Λ:√ x∈Λ
kx−yk> πn
Choosing y = √ 0, we conclude from (1.2.3) that the bulk of the measure is concen-
trated within πn distance from 0.
Finally, we note that the shortest non-zero vector and nearest lattice point prob-
lems for general lattices in Rn reduce to those for lattices of the type Λ = L ∩ Zn+1 ,
where L ⊂ Rn+1 is a hyperplane spanned by integer points [S+11].
es −1
−s 2 −2s
2
I B π 2 s−1 I for some s ≥ 1.
π s+ 1−e 1−e
4
An easier to parse sufficient condition is
−1
es
2
π 2 s−1 I
(1.3.3) π s+ I B for s ≥ 3.
5
We note that any positive definite matrix B can be scaled B 7−→ αB so that (1.3.3)
is satisfied for some s. Hence via (1.2.2) one can get a lower bound (not necessarily
interesting) for dist(y, Λ) for arbitrary Λ ⊂ Rn and y ∈ Rn . Applying successive
conditioning on the coordinate affine subspaces, one can efficiently sample points
x ∈ Zn from the discrete Gaussian distribution associated with matrix B satisfying
(1.3.3), a question of independent interest, cf. [A+15]. It is not clear, however,
whether one can efficiently sample if B satisfies (1.3.2).
4
(1.4) Short integer vectors near a subspace. Given a proper subspace L ⊂
Rn , we are interested in finding out whether there are vectors x ∈ Zn \ {0} that
are reasonably short and also reasonably close to L. In analytic terms, when L is
defined by a system of homogeneous linear equations Ax = 0, we are interested in
non-trivial short integer “near solutions” x to the system or, equivalently, in small
integer “near linear dependencies” among the columns of matrix A, cf. Chapter 5
of [G+93] for related problems of Diophantine approximation.
Let us fix 0 < ω < 1 (all implied constants in the “O” notation in this section
depend on ω only). Given a proper subspace L ⊂ Rn , let us construct an n × n
positive definite matrix B = B(L, ω) as follows. The eigenvectors of B lie in L∪L⊥ ,
the eigenvectors in L all have eigenvalue ω ln n and the eigenvectors in L⊥ all have
eigenvalue ω ln n + 51 nω . Hence (1.3.2) is satisfied for all sufficiently large n with
s = ω ln n
Then
Θ(B) 1 ω 2
(1.4.2) = E exp − n dist (x, L) ,
Θ(sI) 5
where
dist(x, L) = min kx − yk
y∈L
(1.4.3)
2 1−ω
n 1−2ω
≥ 1 − 2 exp − +O n for all 0 < < 1.
2
5
see Lemma 4.1.
By computing the expectation (1.4.2) we can furnish a guarantee that there is a
reasonably short integer vector x 6= 0 that has a small angle
with L. Suppose, for
example, that the value of (1.4.2) is at least exp −αn1−ω for some 0 < α < 0.1,
which happens, for example, when
P (x ∈ L) ≥ exp −αn1−ω .
Then for every integer K > 0 the function G(t) = GA,b,s,K (t) defined by
n Y
K m
! !
−ktk2 /2
Y X
2k−1 4k−2
G(t) = e 1 + 2q cos βj + aij τi + q ,
j=1 k=1 i=1
where t = (τ1 , . . . , τm ) ,
2
is log-concave. In particular, the function FA,b,s (t)e−ktk /2
is log-concave.
7
We note that
∞ ∞
X q 2k−1 1 X
2k−1 q e−s
≤ q = = .
(1 − q 2k−1 )2 (1 − q)2 (1 − q)2 (1 − q 2 ) (1 − e−s )2 (1 − e−2s )
k=1 k=1
es
−s 2 −2s
sI B s+ 1−e 1−e I for some s ≥ 1,
4
es 2
1 − e−s 1 − e−2s .
kCkop ≤
4
Next, we write
1 T 1
C= A A so that B = sI + AT A
2 2
for an m × n matrix A. We can always choose m = n or m = rank A. Hence
1
≤ es 1 − e−s 1 − e−2s .
T
A A
op 2
8
Let q = e−s . For an integer K = K() > 0, to be specified in a moment, we define
F̃ : Rm −→ R by
n Y
K m
! !
Y X
F̃ (t) = 1 + 2q 2k−1 cos βj + aij τi + q 4k−2
j=1 k=1 i=1
for t = (τ1 , . . . , τm )
and use any of the efficient algorithms of integration log-concave functions to com-
pute
K Z
2k n 2
Y
−m/2
F̃ (t)e−ktk /2 dt
(2π) 1−q
k=1 Rm
∞ ∞ m
! !
2k n
Y Y X
1 + 2q 2k−1 cos βj + + q 4k−2
1−q and aij τi
k=1 k=1 i=1
Similarly,
∞ m
! !
X X
2k−1 4k−2
ln 1 + 2q cos βj + aij τi + q
k=K i=1
∞ ∞
X X
2k−1 4k−2 2k−1
≤ ln 1 − 2q +q = 2 ln 1 − q
k=K k=K
∞
X 4q 2K−1
≤4 q 2k−1 = ≤ 5q 2K−1 .
1 − q2
k=K
This is Jacobi’s triple product identity, see for example, Section 2.2 of [An98].
Suppose now that
wj ∈ C \ {0} for j = 1, . . . , n.
Then
n Y
Y
1 − q 2k 1 + wj q 2k−1 1 + wj−1 q 2k−1
j=1 k≥1
(3.1.1) n
kxk2 ξ
X Y
= q wj j .
x∈Zn : j=1
x=(ξ1 ,... ,ξn )
1 + wj (t)q 2k−1 1 + wj−1 (t)q 2k−1 = 1 + wj (t) + wj−1 (t) q 2k−1 + q 4k−2
m
!
X
= 1 + 2 cos βj + aij τi q 2k−1 + q 4k−2
i=1
and that
n Xn m n
ξ
Y X X
wj j = exp i βj ξ j + i τi aij ξj ,
j=1 j=1 i=1 j=1
we conclude that
∞
Y n
FA,b,s (t) 1 − q 2k
k=1
n m n
kxk2
X X X X
= q exp i βj ξ j + i τi aij ξj .
x∈Zn :
j=1 i=1 j=1
x=(ξ1 ,... ,ξn )
10
Since
2
Z +∞ n n
1 X
−τi2 /2
1 X
√ exp iτi aij ξj e dτi = exp − aij ξj ,
2π −∞
j=1
2
j=1
we get
∞ Z
2k n 2
Y
−m/2
FA,b,s (t)e−ktk /2
(2π) 1−q dt
k=1 Rm
2
m n n
X
kxk2
1 X X X
= q exp − aij ξj +i β j ξj
2
x∈Zn :
i=1 j=1 j=1
x=(ξ1 ,... ,ξn )
X
kxk2 1 2
X
= q exp − kAxk + ihb, xi = exp {−hBx, xi + ihb, xi} ,
2
x∈Zn n x∈Z
Proof. We have
d 2αq sin(ατ + β)
ln 1 + 2q cos(ατ + β) + q 2 = −
dτ 1 + 2q cos(ατ + β) + q 2
and
d2
ln 1 + 2q cos(ατ + β) + q 2
dτ 2
2
2α2 q cos(ατ + β) 1 + 2q cos(ατ + β) + q 2 + (2αq sin(ατ + β))
= − 2
(1 + 2q cos(ατ + β) + q 2 )
2α2 q cos(ατ + β)(1 + q 2 ) + 4α2 q 2
=− 2 .
(1 + 2q cos(ατ + β) + q 2 )
Now,
2 2
1 + 2q cos(ατ + β) + q 2 ≥ 1 − 2q + q 2 = (1 − q)4 .
Also,
2α2 q cos(ατ + β)(1 + q 2 ) + 4α2 q 2 ≥ −2α2 q(1 + q 2 ) + 4α2 q 2
= 2α2 q 2q − 1 − q 2 = −2α2 q(1 − q)2 .
is log-concave. Indeed, let g(τ ) be that restriction. From Lemma 3.3, we get
K n m
!2
d2 X q 2k−1 X X
ln g(τ ) ≤ − 1 + 2 2 aij γi
dτ 2 (1 − q 2k−1 ) j=1 i=1
k=1
K
X q 2k−1
≤ −1 + 2kAT k2op 2
k=1 (1 − q 2k−1 )
K
X q 2k−1
= −1 + 2kAT Akop 2 ≤ 0
2k−1 )
k=1 (1 − q
x∈Zn
x∈Zn k≥1
12
see Section 3.1.
We have
Y 2 X
1 − q 2k 1 + q 2k−1 = ln 1 − q 2k + 2 ln 1 + q 2k−1
ln
k≥1 k≥1
X X X
2q 2k−1 − q 2k
q 4k + q 4k−2
= +O
k≥1 k≥1 k≥1
q2 q4 q2
2q
= − +O +
1 − q2 1 − q2 1 − q4 1 − q4
= 2q + O(q 2 ),
Proof. We have
2 kxk2 /2 qn(1+)
P kxk ≥ 2qn(1 + ) =P e ≥ e
2
≤ e−qn(1+) E ekxk /2
Since
e/2 q ≤ e0.5 e−1 = e−0.5 < 0.9,
by Lemma 4.1 we have
X kxk2 n o
/2 /2 2
e q = exp 2e qn + O q n .
x∈Zn
and similarly X 2
q kxk = exp 2qn + O q 2 n .
x∈Zn
13
Hence n o
2
E ekxk /2
= exp 2qn e/2 − 1 + O q 2 n
and
n o
2 /2 2
P kxk ≥ 2qn(1 + ) ≤ exp qn 2e − 2 − (1 + ) + O q n .
Since
2
e/2 − 1 −
≤ for 0 ≤ ≤ 1,
2 4
the proof of the first inequality follows.
The second inequality is proved similarly. We have
2 −kxk2 /2 −qn(1−)
P kxk ≤ 2qn(1 − ) =P e ≥ e
2
≤ eqn(1−) E e−kxk /2
and hence
n o
P kxk2 ≤ 2qn(1 − ) ≤ exp qn 2e−/2 − 2 + (1 − ) + O q 2 n .
Since
2
e−/2 − 1 +
≤ for 0 ≤ ≤ 1,
2 4
the proof of the second inequality follows.
The following result is most certainly known, but we give its proof for complete-
ness.
(4.3) Theorem. Let A be an m × n integer matrix with rank A = m < n and let
L = ker A, L ⊂ Rn . Then
−1
dist(x, L) ≥ (kAkop ) for all x ∈ Zn \ L.
14
Since A is an integer matrix, x is an integer vector and Ax 6= 0, we have kAxk ≥ 1.
Let λ > 0 be the smallest eigenvalue of the matrix (AAT )−1 . Then
and hence
dist2 (x, L) ≥ λ.
On the other hand,
−1 −2
λ = kAAT kop = (kAkop ) ,
from which the proof follows.
References
[A+15] D. Aggarwal, D. Dadush, O. Regev and N. Stephens-Davidowitz, Solving the shortest
vector problem in 2n time via discrete Gaussian sampling (extended abstract), STOC’15–
Proceedings of the 2015 ACM Symposium on Theory of Computing, ACM, New York,
2015, pp. 733–742.
[AK91] D. Applegate and R. Kannan, Sampling and integration of near log-concave functions,
Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, ACM, New
York, 1991, pp. 156–163.
[An98] G.E. Andrews, The Theory of Partitions. Reprint of the 1976 original, Cambridge Math-
ematical Library, Cambridge University Press, Cambridge, 1998.
[AR05] D. Aharonov and O. Regev, Lattice problems in NP ∩ coNP, Journal of the ACM 52
(2005), no. 5, 749–765.
[Ba93] W. Banaszczyk, New bounds in some transference theorems in the geometry of numbers,
Mathematische Annalen 296 (1993), no. 4, 625–635.
[BL61] R. Bellman and L. R. Sherman, The reciprocity formula for multidimensional theta
functions, Proceedings of the American Mathematical Society 12 (1961), 954–961.
[FK99] A. Frieze and R. Kannan, Log-Sobolev inequalities and sampling from log-concave dis-
tributions, The Annals of Applied Probability 9 (1999), no. 1, 14–26.
[F+94] A. Frieze, R. Kannan, and N. Polson, Sampling from log-concave distributions, The
Annals of Applied Probabability 4 (1994), no. 3, 812–837.
[G+93] M. Grötschel, L. Lovász and A. Schrijver, Geometric Algorithms and Combinatorial
Optimization. Second edition, Algorithms and Combinatorics, 2, Springer-Verlag, Berlin,
1993.
[LV07] L. Lovász and S. Vempala, The geometry of logconcave functions and sampling algo-
rithms, Random Structures & Algorithms 30 (2007), no. 3, 307–358.
[MR07] D. Micciancio and O. Regev, Worst-case to average-case reductions based on Gaussian
measures, SIAM Journal on Computing 37 (2007), no. 1, 267–302.
[M07a] D. Mumford, Tata Lectures on Theta. I. With the collaboration of C. Musili, M. Nori,
E. Previato and M. Stillman, Reprint of the 1983 edition, Modern Birkhäuser Classics,
Birkhäuser Boston, Inc., Boston, MA, 2007.
[M07b] D. Mumford, Tata Lectures on Theta. II. Jacobian theta functions and differential equa-
tions. With the collaboration of C. Musili, M. Nori, E. Previato, M. Stillman and H.
Umemura, Reprint of the 1984 original, Modern Birkhäuser Classics, Birkhäuser Boston,
Inc., Boston, MA, 2007.
[M07c] D. Mumford, Tata Lectures on Theta. III. With collaboration of Madhav Nori and Peter
Norman, Reprint of the 1991 original, Modern Birkhäuser Classics, Birkhäuser Boston
Inc., Boston, MA, 2007.
15
[RS17] O. Regev and N. Stephens-Davidowitz, An inequality for Gaussians on lattices, SIAM
Journal on Discrete Mathematics 31 (2017), no. 2, 749–757.
[S+11] N.J.A. Sloane, V.A. Vaishampayan and S.I.R. Costa, A note on projecting the cubic
lattice, Discrete & Computational Geometry 46 (2011), no. 3, 472–478.
16